Boston Dynamics

Senior Engineering Manager, ML Platform

Boston Dynamics | $198K - $300K *
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 7-12 years of engineering experience, with at least 2-3 years in a formal management or tech lead capacity
  • Experience in building or scaling a platform or ML systems team
  • Technical expertise in GPU/distributed compute infrastructure, data storage, or data pipeline frameworks
  • Proven ability to make foundational architectural decisions in an early-stage environment
  • Strong communication skills to bridge technical and non-technical stakeholders
  • A hands-on approach to coding, design review, and troubleshooting
  • Comfort with ambiguity and with defining roadmaps

Responsibilities

  • Own strategy and execution for scalable GPU compute infrastructure
  • Design and implement infrastructure while the team expands
  • Drive reliability and performance of distributed training clusters
  • Evaluate new hardware and cloud infrastructure for evolving team needs
  • Oversee the design of data storage and retrieval systems
  • Ensure performant and fault-tolerant data pipelines
  • Lead the development of shared libraries for data transformation and enhance developer productivity
  • Communicate technical priorities and roadmap to both engineers and leadership
  • Mentor engineers and establish a strong team culture
  • Define team processes to ensure sustainability as the team grows

Benefits

  • Generous medical, dental, and vision coverage
  • 401(k) plan
  • Paid time off
  • Annual bonus structure

Full Job Description

We're looking for a Senior Engineering Manager to lead our ML Platform team, a growing group responsible for the foundational infrastructure that powers our machine learning work. This is a player-coach role: you'll set technical direction and contribute hands-on while building out the team and establishing the processes that will scale with it.

The platform is in its early stages, with some foundations in place. You'll be joining at a pivotal moment - making architectural decisions that will shape how the team and the platform grow from 4 engineers today to a team of 10-12.

What You'll Work On

Infrastructure Leadership
  • Own the strategy, roadmap, and execution for GPU compute infrastructure, ensuring it scales to meet growing model training and fine-tuning demands
  • Contribute directly to infrastructure design and implementation, particularly in the near term as the team grows
  • Drive reliability, performance, and cost efficiency across distributed training clusters, and optimize existing and new training workloads to run at scale
  • Evaluate and adopt new hardware (GPUs, TPUs, custom accelerators) and cloud/on-prem infrastructure as the team's needs evolve


Data Platform Ownership
  • Oversee the design and operation of data storage, indexing, and retrieval systems that support large-scale dataset generation
  • Ensure data pipelines are performant, fault-tolerant, and meet the quality and freshness requirements of ML teams
  • Establish early-stage standards for data access, lineage, and governance - pragmatic and scalable, not over-engineered


Shared Tooling & Developer Experience
  • Lead the development and maintenance of shared libraries and frameworks for data transformation pipelines
  • Partner with ML researchers and engineers to understand their workflows and translate them into reliable, reusable platform capabilities
  • Champion developer productivity - reduce friction for teams consuming platform services


Technical Strategy & Architecture
  • Lay the architectural foundations of the platform, making decisions that are pragmatic today but designed to scale to a 10-12 person team and beyond
  • Make key architectural decisions around compute orchestration (e.g. Kubernetes, Slurm, Ray), storage systems, and pipeline frameworks
  • Balance short-term delivery with long-term platform health - knowing when to build, buy, or borrow


Cross-functional Collaboration
  • Act as a technical partner to ML research, data engineering, and product teams - translating needs into platform priorities
  • Communicate roadmap, incidents, and technical tradeoffs clearly to both engineers and senior leadership
  • Help ML teams become self-sufficient on the platform, reducing bottlenecks on the platform team itself


Team Building & Management
  • Actively participate in hiring to grow the team from 4 to ~10-12 engineers, including defining roles and leveling
  • Mentor and develop engineers, establishing a team culture early that will hold as headcount scales
  • Define lightweight but durable team processes - on-call rotations, incident response, and engineering standards that won't need to be rebuilt at scale
  • Be comfortable doing IC work yourself while simultaneously building the team's capacity to take it on


What We're Looking For
  • 7-12 years of engineering experience, with at least 2-3 years in a formal management or tech lead capacity
  • Demonstrated experience building or scaling a platform, infrastructure, or ML systems team from the ground up
  • Technical credibility in one or more of: GPU/distributed compute infrastructure, large-scale data storage and retrieval, or data pipeline frameworks
  • Experience making foundational architectural decisions in an early-stage or greenfield environment
  • Strong cross-functional communication skills - able to translate between ML researchers, engineers, and senior leadership
  • Comfortable with ambiguity; able to define the roadmap rather than just execute against one
  • A hands-on mindset - willing and able to write code, review designs, and debug production issues alongside your team


Nice to Have
  • Familiarity with compute orchestration frameworks such as Kubernetes, Slurm, or Ray
  • Experience with ML training workflows, dataset generation pipelines, or feature stores
  • Prior experience growing a team through a hiring ramp (e.g. doubling or tripling headcount)


The base pay range for this position is $198,000.00 to $300,000.00 annually. Base pay will depend on multiple individualized factors, including, but not limited to, internal equity, job-related knowledge, skills, and experience. This range represents a good-faith estimate of compensation at the time of posting. Boston Dynamics offers a generous benefits package including medical, dental, vision, 401(k), paid time off, and an annual bonus structure. Additional details regarding these benefit plans will be provided if an employee receives an offer of employment.

About Boston Dynamics

Boston Dynamics is an American engineering and robotics design company founded in 1992 as a spin-off from the Massachusetts Institute of Technology. The company is best known for the development of BigDog, a quadruped robot designed for the U.S. military. Boston Dynamics has also developed a number of other robots, including Spot, a four-legged robot designed for indoor and outdoor operation, and Atlas, a humanoid robot designed for a variety of search and rescue tasks. In 2013, the company was acquired by Google (later reorganized under Alphabet Inc.), which sold it to SoftBank Group in a deal announced in 2017. In 2020, Hyundai Motor Group agreed to acquire a controlling stake, and the deal closed in 2021. Boston Dynamics is headquartered in Waltham, Massachusetts.
Size: 300 employees
Founded: 1992
