Research Scientist: Pretraining

Generalist AI, Inc

$100K — $150K *
Consumer Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 5-7 years of experience in large model training (transformer or diffusion)
  • Demonstrated leadership in multi-node, multi-GPU distributed training
  • Expertise in scaling laws and optimization dynamics
  • Proficient in PyTorch with troubleshooting at all stack layers
  • Strong emphasis on empirical rigor and rapid iterations
  • Passion for developing foundational robotic intelligence

Responsibilities

  • Design and execute large-scale pretraining runs for robot models
  • Define model architectures and training goals using multimodal data
  • Develop scalable datasets and strategic sampling methods
  • Lead data collection initiatives and identify new datasets
  • Conduct ablation studies to analyze scaling laws and data quality
  • Collaborate with ML infrastructure teams to enhance system performance
  • Convert raw robotic data into actionable model capabilities

Benefits

  • Flexible work hours
  • Opportunity to work at the forefront of robotics and AI
  • Collaborative and innovative team environment
  • Access to cutting-edge technology and resources
  • Professional development opportunities

Full Job Description

About the Role

You will build the base intelligence layer for robotics. We train large-scale robot foundation models from massive multimodal datasets spanning video, proprioception, action traces, language, and more. You will design and run the core large-scale training efforts that give our models fundamentally new general capabilities across embodiments, tasks, and environments. You will "live and breathe" all forms of robot data.

You'll be responsible for:
  • Designing and executing large-scale pretraining runs for robot foundation models (transformer- and diffusion-based architectures)
  • Defining model architectures, objectives, and training curricula across multimodal robotic data (vision, action, state, language)
  • Developing scalable data mixtures and sampling strategies across petabyte-scale datasets
  • Steering data collection operations in new directions and sourcing new datasets
  • Running ablations to understand scaling laws, data quality effects, and architecture tradeoffs
  • Collaborating closely with ML Infra and Systems to push cluster utilization, throughput, and reliability
  • Turning raw robotic interaction data into generalizable model capabilities


You might thrive in this role if you:
  • Have deep experience training large transformer or diffusion models at scale (for example, generative language, audio, or video models)
  • Have led or significantly contributed to multi-node, multi-GPU distributed training efforts
  • Have worked on scaling laws, optimization dynamics, and large-model failure modes
  • Have strong PyTorch fundamentals and comfort debugging at every layer of the stack
  • Care about both empirical rigor and raw iteration speed
  • Are excited about building general-purpose robot intelligence from first principles
