Machine Learning Engineer, AI

Biohub

• $214K — $335K *

New York, NY 10025 (Hybrid)

Pharmaceuticals & Biotech

Less than 5 years of experience

Reposted Today

Job Overview by Ladders

Qualifications

Hands-on experience with PyTorch, including custom training loops and distributed training.
Proficiency in GPU-native data I/O and large-scale tensor formats like Zarr, HDF5, and TensorStore.
Familiarity with distributed computing frameworks such as Spark, Dask, or Ray.
Experience with Docker and Kubernetes for container orchestration.
Proven track record of building systems crucial for engineering and research communities.
Bonus: Experience in developing AI agent frameworks.

Responsibilities

Build and maintain pre-training infrastructure spanning thousands of GPUs.
Design and optimize GPU-native data loading pipelines for petabyte-scale scientific workloads.
Develop I/O and pipeline systems tailored for advanced biological data types.
Define key abstractions to facilitate long-term research developments.
Manage the full ML lifecycle including artifact tracking and monitoring.
Create DevOps tooling to enhance productivity for engineers and researchers.
Deploy Biohub's technology within partners' external infrastructure.

Benefits

Generous employer 401(k) match to assist with future financial planning.
Paid time off dedicated to volunteering at chosen organizations.
Funding available for selected family-forming benefits.
Relocation support for employees needing assistance with moving.

Full Job Description

As an ML Engineer, you'll join some of the strongest infrastructure engineers in AI, building the systems that connect everything together. The infrastructure problems you solve directly determine what science becomes possible.

What You'll Do

Work with high-dimensional scientific data formats and contribute to backend compatibility, format evaluation, and I/O performance benchmarking at petabyte scale.
Define and shape the engineering patterns your team and collaborating researchers will build on for years; the abstractions you write today become the foundation others depend on at scale.
Work at the intersection of AI systems and biological discovery, where the infrastructure problems you solve directly determine what science becomes possible.
Deploy models to production and manage artifact tracking across models and datasets.
Design and optimize GPU-native data loading pipelines for large-scale multi-dimensional tensor workloads, including profiling and resolving hardware utilization bottlenecks across multi-backend systems.
Simplify and improve codebase abstractions to accelerate research momentum.
Build and maintain primitives for pre-training infrastructure that ensure the reliability and continuity of large-scale training runs.
Help cultivate best practices in MLOps, and think about the full ML lifecycle, including data, fine-tuning, deployment, reliability and monitoring.
Execute complex modifications to the research pipeline, such as fast data loading and distributed training.
Handle DevOps responsibilities, focused on making all engineers and researchers more productive. This includes tasks like cluster monitoring, unit and integration testing of the research codebase, and continuous integration.
Collaborate with partner researchers and engineers to deploy our technology within external infrastructure.

What You'll Bring

5+ years of industry experience building and deploying machine learning infrastructure at scale.
Hands-on experience with PyTorch, including custom training loops, distributed training, or low-level performance work.
Familiarity with GPU-native data I/O tools and large-scale tensor formats (e.g. Zarr, HDF5, TensorStore, or similar).
Experience with distributed computing frameworks such as Apache Spark, Dask, or Ray.
Familiarity with containerization and orchestration tools such as Docker and Kubernetes.
Experience building or working with AI agent frameworks is a plus.
A track record of building systems that other engineers and researchers depend on. Not just running experiments, but shipping infrastructure that scales.

Compensation

The future anticipated Redwood City, CA, and New York City, NY base pay range for a role in this field is $214,000 to $335,000 annually. Final compensation is based on the level at which you are hired. Actual placement in range is based on job-related skills and experience, as evaluated throughout the interview process.

Benefits for the Whole You

We're thankful to have an incredible team behind our work. To honor their commitment, we offer a wide range of benefits to support the people who make all we do possible.

Generous employer match on employee 401(k) contributions to support planning for the future.
Paid time off to volunteer at an organization of your choice.
Funding for select family-forming benefits.
Relocation support for employees who need assistance moving.

* Ladders Estimates
