Job Description
About the Role
We are seeking a Staff Technical Program Manager (TPM) to lead AV ML Infrastructure programs for our autonomous driving platform. In this role, you will drive strategy and execution for large-scale ML infrastructure - including training pipelines, model lifecycle management, compute orchestration, and operational reliability - that powers next-generation autonomy models. You will operate at the intersection of ML engineering, platform infrastructure, and operations, ensuring our ML systems are scalable, efficient, and production-ready to support end-to-end model development.
Key Responsibilities
Program Leadership
Lead end-to-end strategic planning and execution for AI/ML Infrastructure programs, delivering measurable improvements in training throughput, platform reliability, and model development velocity. Establish clear program objectives, milestones, and success metrics to drive predictable, high-quality delivery across multiple engineering and operations teams.
Cross-Functional Alignment
Collaborate with AI/ML engineering, platform, validation, and product teams to define requirements, prioritize initiatives, and deliver solutions that improve AI development cycle performance and operational efficiency.
Technical Roadmapping
Translate complex MLOps needs - from distributed training orchestration to compute resource management and pipeline scaling - into actionable multi-team execution plans with defined owners and measurable outcomes. Align long-term technical roadmaps with organizational goals, ensuring ML infrastructure evolves to support increasing model complexity, dataset scale, and training workloads.
Risk & Change Management
Identify technical, operational, and program risks early; develop mitigation strategies that protect training timelines, platform stability, and service reliability.
Scalability & Performance
Ensure AI/ML operations processes and infrastructure are designed for long-term scalability, performance, and operational excellence - including monitoring, incident response, and capacity planning.
Metrics & Visibility
Define KPIs for ML platform performance, training system reliability, model training cycle time, and delivery velocity; maintain transparent dashboards and executive-ready reporting. Provide leadership with clear insights into progress, tradeoffs, and program health to support timely decision-making.
Required Qualifications:
- 10+ years of technical program management experience, including leadership of large, complex, multi-disciplinary programs.
- 5+ years working in ML Operations, ML infrastructure, AI platform engineering, or distributed compute environments.
- BS or MS in Engineering, Computer Science, or a related technical field.
- Experience supporting large-scale machine learning training or AI infrastructure programs, including compute orchestration, pipeline reliability, and resource management.
- Proven track record of managing large, complex, cross-functional programs involving infrastructure, software systems, and data pipelines with ambiguous or evolving requirements.
- Ability to analyze system performance metrics, identify bottlenecks, and translate insights into program-level improvements.
- Exceptional communication, collaboration, and stakeholder management skills.
- Deep familiarity with Agile program delivery, task management tools (e.g., Jira), reporting tools, and technical development tooling.
Preferred Qualifications:
- Experience with GPU compute management, cluster orchestration (e.g., Kubernetes, Slurm), or cloud infrastructure (GCP, AWS).
- Familiarity with ML workflow orchestration tools (e.g., Kubeflow, Airflow, or similar).
- Background in SRE, platform engineering, or DevOps practices applied to ML systems.
- Experience with observability, SLO/SLI frameworks, and incident management for production ML platforms.
Relocation: This job is not eligible for relocation benefits. Any relocation costs would be the responsibility of the selected candidate.
GM does not provide immigration-related sponsorship for this role. Do not apply for this role if you will need GM immigration sponsorship now or in the future. This includes direct company sponsorship, entry of GM as the immigration employer of record on a government form, and any work authorization requiring a written submission or other immigration support from the company (e.g., H-1B, OPT, STEM OPT, CPT, TN, J-1, etc.).