Director, Model Post-Training and Agentic Research (Remote)

CrowdStrike Holdings, Inc.
$195K - $290K *
US-Anywhere
Remote in United States
Information Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • MS or PhD in computer science, machine learning, or a related quantitative discipline.
  • 8+ years of experience in ML research or engineering with expertise in large language model post-training.
  • Hands-on experience with SFT data pipelines and RL algorithms related to language model training.
  • Proven track record of designing agentic system harnesses for language model agents.
  • Strong skills in creating robust evaluation protocols for interpreting model capability improvement.
  • Experience managing high-velocity research programs with fast iterations.
  • Ability to lead and develop research teams while contributing technically.

Responsibilities

  • Drive the complete post-training pipeline for security-focused AI including reward modeling and RLHF.
  • Design training environments that simulate realistic cyber workflows and develop agent harnesses for trained models.
  • Create and oversee evaluation methodologies that assess both agent behavior and tool-use reliability in real-world security tasks.
  • Collaborate with teams to integrate post-training work with model development and contribute research publications.
  • Recruit and mentor top talent in research and machine learning to maintain high technical standards.

Benefits

  • Market leader in compensation and equity awards
  • Comprehensive physical and mental wellness programs
  • Paid parental and adoption leaves
  • Professional development opportunities regardless of role
  • Employee networks and volunteer opportunities
  • Vibrant office culture with top-tier amenities
  • Great Place to Work Certified™ globally
Full Job Description
About the Role:

The security domain presents one of the richest and most consequential training signal environments in applied AI. It's adversarial by nature, grounded in real operational outcomes, and evolving faster than any static benchmark can capture. We're building the post-training and reinforcement learning capability to turn the latest models and harnesses into security-specialized systems that reason, plan, and act across complex cyber workflows. The person leading this work will be in the research, not just directing it.

In this role, you'll own the full post-training stack for security-domain AI (e.g., supervised fine-tuning, reward modeling, RLHF and RLAIF pipelines, and agent-RL environments) and the agentic research that sits on top of it. That means designing, building, and evaluating the harnesses that security agents actually run on: the scaffolding, tool-use interfaces, planning loops, memory and context management, and multi-step execution frameworks that determine whether a trained model can operate reliably on complex security tasks. Post-training and agent architecture are not separable problems in this work. The reward signal you design has to reflect what the harness can measure, and the harness has to be built to surface what training needs to optimize. You'll set the technical direction on both, and you'll be in the work on both.

You'll lead a team of research scientists and engineers, but the team will look to your own work as the standard. The successful candidate shapes research priorities, keeps the team moving at high velocity across multiple training cycles per year, and elevates the quality of work by staying close enough to it to know what good actually looks like.

What You'll Do:
  • Own and personally drive the full post-training pipeline for security-domain AI - SFT, RLHF/RLAIF, agent-RL, and reward modeling. Set research priorities and architectural direction, and lead experimental work on the hardest problems yourself rather than delegating them away. Design reward modeling methodology grounded in verified security outcomes rather than proxy signals, drawing on both human expert feedback and automated adversarial evaluation. Define data curation standards across sourcing, filtering, quality scoring, and domain weighting that drive measurable capability improvement.
  • Build and maintain agent-RL training environments that simulate realistic cyber workflows (multi-step offensive and defensive tasks, tool use, and long-horizon planning), contributing directly to environment design and reward shaping. Lead the design and build of the agent harnesses that run on top of those trained models: scaffolding architecture, tool-calling interfaces, planning and reasoning loops, and memory and context management. Treat harness design with the same rigor as the training pipeline; these systems determine whether strong post-training translates into reliable, trustworthy behavior in the field.
  • Develop and own evaluation methodology for the full agentic stack: not model capability in isolation, but harness behavior, tool-use reliability, planning coherence, and end-to-end task completion across realistic security workflows. Define the benchmarks, red-line tests, and measurement practices that give the team and the organization genuine confidence that an agent works.
  • Partner closely with other teams to ensure post-training and agentic work integrates cleanly with the broader model development loop. Contribute original research through publications, external presentations, and open-source artifacts where appropriate, building CrowdStrike's credibility as a research-first organization in this space.
  • Recruit, develop, and retain a high-density team of research scientists and ML engineers. Set a technical bar through your own contributions, not just your standards.


What You'll Need:
  • MS or PhD in computer science, machine learning, or a related quantitative discipline.
  • 8+ years of experience in ML research or engineering, with meaningful depth in large language model post-training.
  • Hands-on expertise across the modern post-training stack, including SFT data pipelines, RLHF/RLAIF, PPO or similar RL algorithms applied to language models, and reward model design and training. This means you've done the work, not managed people who have.
  • Demonstrated experience designing or building agentic system harnesses for LLM-based agents, including tool-use frameworks, planning scaffolds, multi-step execution environments, and context or memory management. You've built these systems, not just used them.
  • Strong evaluation instincts: experience designing evaluation protocols that are resistant to overfitting, capable of measuring genuine capability improvement, and interpretable to both technical and non-technical stakeholders.
  • Track record of running high-velocity research programs with disciplined tracking and fast iteration.
  • Proven ability to lead and grow research teams while remaining a credible, active technical contributor.


Ways to Stand Out:
  • Demonstrated experience building or operating RL training environments for language model agents, including environment design, rollout infrastructure, and reward shaping.
  • Experience applying post-training or RL techniques in security, adversarial ML, or other high-stakes operational domains where ground truth is expensive and noisy.
  • Deep hands-on experience with agent harness architecture applied to long-horizon, multi-step task environments where reliability and failure modes matter as much as peak capability.
  • Background designing synthetic data pipelines or simulation environments for agent training in complex, tool-using workflows.
  • Familiarity with the offensive or defensive security practitioner's workflow - penetration testing, detection engineering, incident response, or threat intelligence - sufficient to reason about what good model behavior looks like in practice.
  • Published research in post-training, RLHF, RL for language agents, or related areas at top-tier venues (NeurIPS, ICML, ICLR, ACL, or equivalent).
  • Experience working on and adapting open-weight base models (Llama-class, Qwen-class, or similar) for domain-specialized continued training and fine-tuning.


Benefits of Working at CrowdStrike:
  • Market leader in compensation and equity awards
  • Comprehensive physical and mental wellness programs
  • Competitive vacation and holidays for recharge
  • Paid parental and adoption leaves
  • Professional development opportunities for all employees regardless of level or role
  • Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
  • Vibrant office culture with world class amenities
  • Great Place to Work Certified™ across the globe


The base salary range for this position for all U.S. candidates is $195,000 - $290,000 per year, with eligibility for bonuses, equity grants and a comprehensive benefits package that includes health insurance, 401k and paid time off.

Expected Close Date of Job Posting is: 08-11-2026

About CrowdStrike Holdings, Inc.

CrowdStrike Holdings, Inc. Careers

Joining CrowdStrike Holdings, Inc. presents an unparalleled opportunity to advance a career in the tech industry with a company at the forefront of digital security. As a leader in cybersecurity solutions, CrowdStrike Holdings, Inc. offers a range of job opportunities that cater to a variety of skills and experiences, from entry-level positions to senior leadership roles.

Explore Job Opportunities

CrowdStrike Holdings, Inc. is continuously seeking talented individuals who are passionate about protecting organizations against cyber threats. With a commitment to innovation and excellence, the company is hiring professionals who are eager to contribute to a team that values hard work and creative solutions.

Innovation and Professional Growth

At CrowdStrike Holdings, Inc., employees are encouraged to push the boundaries of technology and leadership. The company supports professional growth through robust training programs, including leadership development and diversity training, ensuring that every team member has the resources to thrive in their career.

Culture and Benefits

The culture at CrowdStrike Holdings, Inc. is dynamic and inclusive, fostering a workplace where diversity is celebrated and every voice is heard. Employees enjoy comprehensive benefits that support both their professional and personal lives, enhancing job satisfaction and team morale.

Internship Programs

For those starting their career, CrowdStrike Holdings, Inc. offers internship programs that provide a rich learning environment. Interns gain hands-on experience, working alongside seasoned professionals and participating in projects that deliver real-world solutions.

Networking and Career Advancement

CrowdStrike Holdings, Inc. emphasizes the importance of networking within the industry, offering numerous opportunities for employees to connect with thought leaders and innovators. These connections can lead to career advancement and a deeper understanding of the cybersecurity landscape.

Applying for a Position

To apply for a position at CrowdStrike Holdings, Inc., candidates should prepare a resume that highlights relevant experience and skills. The interview process is designed to assess not only professional qualifications but also a candidate's fit within the company culture and team.

Stay Connected with CrowdStrike Careers

Interested candidates can stay informed about new openings and company news by subscribing to job alert emails. This personalized service ensures that potential applicants are the first to know about new opportunities that match their career interests and skills.

Join the Team

CrowdStrike Holdings, Inc. is looking for curious, creative, and solution-driven team players. Explore the employment opportunities on the CrowdStrike Holdings, Inc. careers page to find a position that matches your skills and passions.
