OpenAI

CPU/Storage/PoP-WAN Program Manager

$130K - $180K *
Information Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • 8+ years in technical program management, infrastructure deployment, network deployment, or data center operations
  • Experience with programs involving compute, storage, networking, or large-scale infrastructure systems
  • Knowledge of servers, clusters, storage arrays, routers, switches, and structured cabling
  • Proven track record managing cross-functional programs across various teams and external vendors
  • Understanding of deployment lifecycles from planning to production handoff
  • Ability to navigate physical and logical systems architecture dependencies
  • Strong skills in building schedules and driving accountability among stakeholders
  • Executive communication skills with experience in managing escalations and updates
  • Comfortable in fast-paced environments with shifting priorities
  • Highly analytical with strong problem-solving skills

Responsibilities

  • Lead execution of CPU/GPU cluster activation across OpenAI's infrastructure
  • Drive readiness for turning contracted compute capacity into operational clusters
  • Own deployment programs for new PoPs, backbone nodes, and WAN expansion
  • Build integrated schedules covering procurement, logistics, and system testing
  • Coordinate readiness of BOM, server deliveries, racks, and vendor milestones
  • Align compute, storage, and networking dependencies with engineering teams
  • Manage deployment of storage systems supporting training workloads and scaling plans
  • Coordinate backbone capacity expansion and cloud interconnect readiness
  • Lead execution of physical deployments including hardware configuration and site acceptance
  • Create repeatable deployment playbooks and governance frameworks
  • Identify risks early and develop mitigation plans

Benefits

  • Comprehensive health insurance packages
  • Generous paid time off policy
  • Retirement savings plan options
  • Opportunities for professional development and training
  • Flexible work environment including remote work options

Full Job Description

About the Role

We are seeking a highly technical Program Manager to lead execution across CPU, Storage, PoP, and WAN infrastructure programs that directly unlock OpenAI's next-generation compute capacity.

In this role, you will own complex cross-functional programs spanning compute cluster activation, storage deployment, PoP bring-up, and backbone expansion. You will coordinate hardware readiness, site readiness, network pathing, storage availability, vendor execution, and engineering dependencies required to turn contracted infrastructure into live training and inference capacity.

This role requires strong technical fluency across hardware systems, network infrastructure, storage architecture, and deployment execution. You should be comfortable operating from rack-level implementation details through executive-level capacity planning discussions.

This role is based in San Francisco, CA, with travel as needed.

Key Responsibilities
  • Lead end-to-end execution of CPU/GPU cluster activation programs across OpenAI's global infrastructure footprint
  • Drive readiness to convert contracted compute capacity into schedulable production clusters
  • Own deployment programs for new PoPs, backbone nodes, WAN expansion, and interconnection initiatives
  • Build integrated schedules spanning procurement, logistics, installation, storage readiness, network turn-up, testing, and production handoff
  • Coordinate BOM readiness, server delivery, racks, optics, cabling, storage hardware, and vendor milestones
  • Partner with engineering teams to align compute, storage, and networking dependencies before cluster activation
  • Manage deployment of storage systems supporting training and inference workloads, including readiness, validation, performance checks, and scaling plans
  • Coordinate backbone capacity expansion, cross-connects, inter-region pathing, and cloud interconnect readiness with Azure and third-party providers
  • Lead physical deployment execution including rack-and-stack, hardware bring-up, L1 validation, and site acceptance criteria
  • Build repeatable deployment playbooks, dashboards, governance cadences, and operating mechanisms for scale
  • Identify risks early across supply chain, site readiness, technical constraints, and vendor execution, then drive mitigation plans
  • Communicate milestones, escalations, and capacity forecasts to senior leadership

Qualifications
  • 8+ years of experience in technical program management, infrastructure deployment, network deployment, or data center operations
  • Strong experience delivering programs involving compute, storage, networking, or large-scale infrastructure systems
  • Working knowledge of servers, clusters, storage arrays, routers, switches, optics, and structured cabling
  • Experience owning cross-functional programs across engineering, operations, supply chain, and external vendors
  • Strong understanding of deployment lifecycles from planning and procurement through production handoff
  • Ability to reason across physical infrastructure execution and logical systems architecture dependencies
  • Proven ability to build integrated schedules and drive accountability across multiple stakeholders
  • Strong executive communication skills with experience managing critical escalations and leadership updates
  • Comfortable operating in fast-moving environments with aggressive timelines and evolving priorities
  • Highly analytical with strong problem-solving and execution instincts

Preferred Skills
  • Experience at a hyperscaler, cloud provider, AI infrastructure company, or global network operator
  • Experience deploying GPU clusters, HPC systems, or large training environments
  • Familiarity with distributed storage systems and high-performance data infrastructure
  • Experience with PoP deployments, WAN backbone expansion, or global network buildouts
  • Experience working across first-party, colo, and cloud environments
  • Experience building repeatable infrastructure deployment systems in high-growth environments

About OpenAI

OpenAI is an artificial intelligence research laboratory consisting of the for-profit corporation OpenAI LP and its parent company, the non-profit OpenAI Inc. The company was founded in 2015 by a group of technology leaders, including Elon Musk, Sam Altman, Greg Brockman, Ilya Sutskever, and John Schulman. OpenAI's stated mission is to ensure that artificial general intelligence benefits all of humanity. The company has developed a number of AI technologies, including GPT-3, a large language model that can generate human-like text. OpenAI's early backers included LinkedIn co-founder Reid Hoffman and venture capitalist Peter Thiel.
Size: 100 employees
Founded: 2015