AI/ML Scientist - Protein Foundation Models

Manifold Bio

$130K — $180K *
Pharmaceuticals & Biotech
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Experience pretraining or fine-tuning protein foundation models, with published or otherwise demonstrable results.
  • Knowledge of AlphaFold architecture and methodology.
  • 2+ years of hands-on experience with deep learning frameworks such as PyTorch or JAX.
  • Proficient in large-scale model training and optimization techniques.
  • Understanding of deep learning architectures including transformers and attention mechanisms.
  • Familiarity with protein structure data such as PDB and mmCIF.
  • Strong skills in statistical analysis and experimental design.

Responsibilities

  • Advance the training of various foundation models using proprietary data.
  • Enhance model training methodologies and architecture selection.
  • Create and manage scalable training pipelines for multi-GPU and multi-node setups.
  • Integrate model outputs into existing systems to improve design processes.
  • Conduct structured ML experiments and rigorous evaluations.
  • Establish best practices in training efficiency and resource management.
  • Document and analyze architecture and training processes comprehensively.

Benefits

  • Join a cutting-edge AI team focused on advanced protein modeling.
  • Work with unique proprietary datasets in a collaborative environment.
  • Take ownership of significant projects that impact drug discovery.
  • Engage with a talented team pushing the boundaries of AI in biotech.
Full Job Description
Position

Manifold's AI team is actively training protein foundation models on our proprietary experimental datasets. Our generative antibody design model, mBER, has already demonstrated controllable de novo binder design across multiple million-scale screening campaigns, and the team is now scaling foundation model capabilities to push well beyond current performance. We are looking for an AI/ML Scientist to join this effort. You will work alongside our existing model training team to accelerate the development of foundation models fine-tuned on Manifold's data, bringing additional depth in pretraining methodology, architecture development, and large-scale training. Your work will directly improve mBER's design capabilities and unlock new modeling paradigms for the broader team. You'll own foundation model projects end to end, from architecture selection and training infrastructure to evaluation against real experimental outcomes, while contributing to the team's shared research agenda.

This is an on-site role and can be based in either Boston, Massachusetts or San Francisco, California. Please apply only if you live in one of these cities or are open to relocating.

Responsibilities
  • Advance the team's ongoing foundation model training efforts: pretraining, fine-tuning, and evaluating folding, docking, language, and generative design models on Manifold's proprietary experimental data
  • Bring depth in training methodology, architecture selection, and optimization to complement the existing team's expertise
  • Develop and scale training pipelines for distributed, multi-GPU and multi-node training runs
  • Integrate foundation model outputs into mBER to improve binder design success rates and enable new design capabilities
  • Design and execute ML experiments with clear hypotheses, rigorous evaluation frameworks, and systematic analysis
  • Establish best practices for mixed-precision training, gradient checkpointing, and computational efficiency at scale
  • Produce clear documentation and analysis supporting architecture and training decisions

Required Qualifications
  • Demonstrated experience pretraining and/or fine-tuning protein foundation models (folding, docking, language models, or generative design) with published or otherwise demonstrable results
  • Strong familiarity with the AlphaFold architecture and training methodology
  • 2+ years of hands-on experience with PyTorch and/or JAX for deep learning
  • Experience with large-scale model training: distributed training, multi-GPU/multi-node setups, mixed precision, gradient checkpointing
  • Solid understanding of deep learning architectures (transformers, attention mechanisms, diffusion/flow matching) and optimization techniques
  • Experience working with protein structure data (PDB, mmCIF) and/or protein sequence datasets
  • Strong statistical analysis and experimental design skills
  • Proficiency in the Python scientific computing stack (NumPy, Pandas, scikit-learn)
  • Self-directed researcher who can balance guidance with independence
  • Excellent written and verbal communication skills for cross-functional collaboration

Preferred Qualifications
  • Experience with protein generative design methods (e.g., RFdiffusion, ProteinMPNN, flow matching approaches)
  • Experience with protein language models (e.g., ESM family)
  • Published research in computational biology, protein design, or structural biology
  • Experience training on proprietary or domain-specific biological datasets
  • Familiarity with Ray for distributed computing
  • Experience with Kubernetes (EKS) and cloud computing platforms (AWS)
  • Knowledge of protein engineering, directed evolution, or structural biology wet lab techniques
  • Experience working with agentic AI coding tools for fast, parallelized execution of modeling experiments
  • Previous biotech/pharma industry experience

This Role Might Be Perfect For You If:
  • You have deep experience training protein foundation models and want to apply that expertise to some of the richest proprietary experimental datasets in the field
  • You're excited about pushing beyond public model performance by leveraging unique, large-scale in vivo screening data
  • You thrive in high-ownership roles where you can drive research direction while collaborating with a tight-knit, world-class team
  • You want your models to directly impact real drug discovery programs

If you're excited to train the next generation of protein foundation models on uniquely powerful experimental data, please reach out to [redacted].
