Member of Technical Staff, Post-Training, RL Infra

Mirendil

$350K-$500K
Information Technology
Less than 5 years of experience

Job Overview by Ladders

Qualifications

  • 5+ years of experience in engineering, specifically with reinforcement learning (RL) systems
  • Proven expertise in building scalable infrastructure for machine learning models
  • Strong programming skills, preferably in Python or similar languages
  • Familiarity with performance optimization techniques in large-scale systems
  • Experience collaborating with cross-functional teams and translating research into production systems

Responsibilities

  • Design and build robust infrastructure for large-scale RL training
  • Implement performance optimizations across the training stack
  • Develop evaluation and benchmarking systems for model assessment
  • Create data collection and feedback pipelines for iterative training improvements
  • Collaborate with various teams to accelerate the deployment of RL experiments

Benefits

  • Meaningful equity grant based on experience
  • Competitive benefits package

Full Job Description

The Role

We are looking for engineers to help build the post-training stack for frontier reasoning models. This role sits at the intersection of research and infrastructure. You will help push the scale of our RL stack, whether through novel training recipes, reliability work, or performance improvements. Example areas you might work on include, but are not limited to:

  • Design and build reliable infrastructure for large-scale RL training
  • Implement novel performance optimizations across the training stack
  • Develop evaluation and benchmarking infrastructure to measure model progress, throughput, and uptime
  • Build data collection and feedback pipelines that close the loop between human signal, reward modeling, and training
  • Collaborate with multiple teams to rapidly iterate on RL algorithms and get experiments into production training runs


If you're excited about building the infrastructure that makes frontier RL research possible at scale, we'd love to hear from you.

We offer a base salary of $350,000-$500,000 USD and a meaningful equity grant, depending on experience and background, along with competitive benefits.
