Director, AI Alignment and Interpretability (Remote)

CrowdStrike Holdings, Inc.
$195K - $290K
US-Anywhere
Remote in United States
Information Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • MS or PhD in machine learning, computer science, or related field focused on interpretability or AI alignment.
  • 8+ years of experience in ML research/engineering, specifically in interpretability or alignment of large language models.
  • Hands-on expertise with mechanistic interpretability methods applied to real models, not just theoretical knowledge.
  • Proven experience in designing and executing alignment evaluations with rigorous methodologies.
  • Demonstrated track record of leading and mentoring researchers while maintaining active technical contributions.

Responsibilities

  • Own and prioritize the alignment and interpretability research agenda for security-domain AI.
  • Develop methods for detecting offensive misuse signals within model internals.
  • Create an evaluation framework that ensures model behavior aligns with intended operational bounds.
  • Contribute original research in interpretability to scientific publications and community engagement.
  • Recruit, train, and retain a high-performing team of research scientists.

Benefits

  • Market leader in compensation and equity awards
  • Comprehensive physical and mental wellness programs
  • Competitive vacation and holidays for recharge
  • Paid parental and adoption leaves
  • Professional development opportunities for all employees
  • Employee Networks and volunteer opportunities to build connections
  • Vibrant office culture with world-class amenities
  • Great Place to Work Certified™ globally

Full Job Description
About the Role:

Security-domain AI creates alignment and interpretability challenges without good answers in the existing literature. A model trained on offensive techniques, vulnerability research, and proprietary threat telemetry develops internal representations that matter in ways those of general-purpose models do not. Understanding what that model knows, how it represents threat concepts, and where its behavior could diverge from intent is the research this role exists to do. Most of it hasn't been figured out yet.

In this role, you'll lead alignment and interpretability research for CrowdStrike's security-domain AI systems. You'll build methods for reading model internals: identifying features and representations tied to offensive security concepts, detecting misuse signal, and closing the gap between what a model is trained to do and what it actually does. You'll translate those findings into training interventions, behavioral constraints, and evaluation protocols that give the team real confidence in how these models behave. This is hands-on research leadership.

The team is lean and the problem space is novel. The right candidate has deep grounding in mechanistic interpretability or a closely related field, clear instincts about which questions matter in a security context, and the ability to advance the state of the art in an area where the field is still forming.

What You'll Do:
  • Own the alignment and interpretability research agenda for security-domain AI. Set priorities, personally lead the hardest open problems, and develop methods that explain model behavior mechanistically: not just what models do, but why, and what that implies at the edges of their training distribution.
  • Build and apply techniques for detecting offensive-misuse signal in model internals, including probing for latent representations of vulnerability knowledge, circuit analysis to understand how security-relevant capabilities are encoded, and activation analysis to surface risk that behavioral testing alone would miss. Work closely with the adversarial evaluation team to close the loop between what they find in testing and what you find in the weights.
  • Develop alignment methodology for security-domain AI and own the evaluation framework that makes it measurable. This includes behavioral constraints, training interventions grounded in interpretability findings, deployment guardrails, and the benchmarks and tests that give the team confidence that models operate within intended bounds as a demonstrated property, not an assertion.
  • Contribute original research through publications and external engagement. Interpretability for security-specialized models is understudied. Publishing this work is part of the job.
  • Recruit, develop, and retain a lean team of research scientists. Set a technical bar through your own contributions, not just your expectations.


What You'll Need:
  • MS or PhD in machine learning, computer science, or a related field, with research depth in interpretability, AI alignment, or a closely adjacent area.
  • 8+ years in ML research or engineering, with direct experience doing interpretability or alignment research on large language models.
  • Hands-on expertise with mechanistic interpretability methods (probing classifiers, circuit analysis, activation patching, causal tracing, feature visualization) applied to real models. You've done this work, not just reviewed it.
  • Experience designing and running alignment evaluations: behavioral testing, capability elicitation, red-teaming, or similar methodologies rigorous enough to support meaningful safety claims.
  • Track record of leading and growing researchers while remaining an active technical contributor yourself.


Ways to Stand Out:
  • Background in offensive security, vulnerability research, or adversarial ML, with enough depth to recognize what you find in model internals and reason about misuse potential.
  • Published research in mechanistic interpretability, AI alignment, or AI safety.
  • Experience applying interpretability methods to domain-specialized or fine-tuned models, not only general-purpose foundation models.
  • Familiarity with alignment challenges specific to models with dual-use capability: systems that understand and can reason about offensive techniques, and what that means for responsible deployment.
  • History of working closely with adversarial evaluation or red teams, using behavioral findings to motivate internal analysis and vice versa.


Benefits of Working at CrowdStrike:
  • Market leader in compensation and equity awards
  • Comprehensive physical and mental wellness programs
  • Competitive vacation and holidays for recharge
  • Paid parental and adoption leaves
  • Professional development opportunities for all employees regardless of level or role
  • Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
  • Vibrant office culture with world-class amenities
  • Great Place to Work Certified™ across the globe


The base salary range for this position for all U.S. candidates is $195,000 - $290,000 per year, with eligibility for bonuses, equity grants, and a comprehensive benefits package that includes health insurance, 401(k), and paid time off.

Expected Close Date of Job Posting is: 08-11-2026

About CrowdStrike Holdings, Inc.

CrowdStrike Holdings, Inc. Careers

Joining CrowdStrike Holdings, Inc. presents an unparalleled opportunity to advance a career in the tech industry with a company at the forefront of digital security. As a leader in cybersecurity solutions, CrowdStrike Holdings, Inc. offers a range of job opportunities that cater to a variety of skills and experiences, from entry-level positions to senior leadership roles.

Explore Job Opportunities

CrowdStrike Holdings, Inc. is continuously seeking talented individuals who are passionate about protecting organizations against cyber threats. With a commitment to innovation and excellence, the company is hiring professionals who are eager to contribute to a team that values hard work and creative solutions.

Innovation and Professional Growth

At CrowdStrike Holdings, Inc., employees are encouraged to push the boundaries of technology and leadership. The company supports professional growth through robust training programs, including leadership development and diversity training, ensuring that every team member has the resources to thrive in their career.

Culture and Benefits

The culture at CrowdStrike Holdings, Inc. is dynamic and inclusive, fostering a workplace where diversity is celebrated and every voice is heard. Employees enjoy comprehensive benefits that support both their professional and personal lives, enhancing job satisfaction and team morale.

Internship Programs

For those starting their career, CrowdStrike Holdings, Inc. offers internship programs that provide a rich learning environment. Interns gain hands-on experience, working alongside seasoned professionals and participating in projects that deliver real-world solutions.

Networking and Career Advancement

CrowdStrike Holdings, Inc. emphasizes the importance of networking within the industry, offering numerous opportunities for employees to connect with thought leaders and innovators. These connections can lead to career advancement and a deeper understanding of the cybersecurity landscape.

Applying for a Position

To apply for a position at CrowdStrike Holdings, Inc., candidates should prepare a resume that highlights relevant experience and skills. The interview process is designed to assess not only professional qualifications but also a candidate's fit within the company culture and team.

Stay Connected with CrowdStrike Careers

Interested candidates can stay informed about new openings and company news by subscribing to job alert emails. This personalized service ensures that potential applicants are the first to know about new opportunities that match their career interests and skills.

Join the Team

CrowdStrike Holdings, Inc. is looking for curious, creative, and solution-driven team players. Explore the employment opportunities on the CrowdStrike Holdings, Inc. careers page to find a position that matches your skills and passions.
