Google

Senior Engineering Manager AI Inference Platform, Distributed Cloud

Google
$262K - $365K *
Enterprise Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • Bachelor's degree or equivalent experience
  • 8 years programming in C or Python
  • 7 years optimizing production-grade systems on GPU accelerators or specialized AI hardware
  • 5 years leading engineering teams in machine learning infrastructure or distributed computing
  • 5 years in people management or team leadership
  • 4 years managing multi-team engineering dependencies

Responsibilities

  • Lead and mentor a high-performing team of systems and ML engineers
  • Define the technical vision for enhancing the LLM serving stack
  • Oversee infrastructure for performance analysis and benchmarking of LLMs
  • Collaborate with Research, SRE, Product, and core library teams for LLM optimization
  • Design and implement advanced serving architectures, minimizing performance bottlenecks

Benefits

  • Career development support and growth opportunities
  • Access to cutting-edge technology and projects
  • Comprehensive health and wellness programs
  • Flexible work environment and work-life balance
  • Collaborative culture focused on learning and innovation

Full Job Description

Minimum qualifications:
  • Bachelor's degree or equivalent practical experience.
  • 8 years of experience programming in C or Python.
  • 7 years of experience optimizing, profiling, and scaling production-grade systems on GPU accelerators or specialized AI hardware.
  • 5 years of experience directly managing and leading engineering teams focused on machine learning infrastructure, AI platforms, or high-performance distributed computing systems.
  • 5 years of experience in a people management or team leadership role.
  • 4 years of experience managing engineering organizations across multi-team infrastructure dependencies.

Preferred qualifications:
  • Master's degree or PhD in Engineering, Computer Science, or a related technical field.
  • 5 years of experience working in a complex, matrixed organization.
  • 5 years of experience implementing advanced LLM serving architectures and optimization techniques, such as disaggregated serving, continuous batching, or specialized compiler technologies (e.g., XLA).
  • 4 years of experience utilizing deep-dive ML profiling tools (e.g., Nsight, xprof) to troubleshoot and resolve low-level bottlenecks within major frameworks like JAX, PyTorch, or TensorFlow.


About the job

In this role, you will be pivotal in architecting and optimizing the serving stack for models like Gemini in an on-prem cloud environment, addressing exciting challenges to improve speed and efficiency. This is a unique opportunity to go deep, leading system-level design and performance profiling, ensuring Google's LLMs run faster and more cost-effectively than ever before.

Individual pay is determined by factors including job-related skills, experience, and relevant education or training.

US: $262,000 - $365,000 (USD) + 25% bonus target + equity + benefits

Learn more about benefits at Google.

Responsibilities
  • Lead, mentor, and grow a high-performing team of systems and ML engineers. Drive a culture of excellence, psychological safety, and continuous learning while guiding career paths and OKRs.
  • Define the technical vision and strategy for enhancing the LLM serving stack, focusing on performance, scalability, and resource efficiency.
  • Oversee the infrastructure and tooling for in-depth performance analysis, profiling, and benchmarking of LLMs on GPU accelerators.
  • Partner closely with Research, SRE, Product, and core library teams to optimize and deploy LLMs globally.
  • Drive the design, implementation, and optimization of advanced serving architectures, including disaggregated serving, while collaborating with core library and kernel partners to eliminate low-level performance bottlenecks, maximize resource utilization, and minimize latency.


About Google

Google is a multinational technology company that specializes in Internet-related services and products. These include online advertising technologies, a search engine, cloud computing, software, and hardware. Google was founded in 1998 by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University. The company has grown tremendously since then and has become one of the most valuable companies in the world. Google's mission is to organize the world's information and make it universally accessible and useful.
Size: 156,500 employees
Market Cap: $1,115.4 billion
Industry:
Net Income: $40.2 billion
Founded: 1998
5 Year Trend: +23.3%
Revenue: $182.5 billion
NASDAQ