We are looking for a Lead Technical Program Manager to run the data center build-outs for our high-density GPUs. You will be the driving force connecting hardware vendors and Data Center Architects, with end-to-end oversight of GPU deployments. You'll join a fast-paced team dedicated to scaling our Agentic AI Cloud, making sure our physical hardware is deployed quickly and correctly to meet our customers' heavy compute needs.
What You'll Be Doing:
- Support hardware deployment projects for high-density GPU clusters, coordinating the steps from early site planning all the way through the final rack-and-stack.
- Manage hardware delivery schedules, tracking vendor lead times and working closely with the supply chain team to ensure servers and equipment arrive at the data center exactly when needed.
- Coordinate physical installations by partnering with colocation providers, remote hands, and internal networking teams to make sure space, power, and cables are prepped before hardware arrives.
- Act as the central hub of communication, keeping supply chain, facilities, hardware engineers, and outside vendors aligned so deployment schedules stay on track.
- Help translate customer AI needs, such as the difference between building for training versus inference, into clear deployment plans for our hardware and facility teams.
- Build repeatable deployment processes, identifying logistical roadblocks and fixing them so we can consistently get new GPUs online faster.
What We'll Expect From You:
- 10+ years of experience as a Technical Program Manager, successfully coordinating hardware, facilities, or data center infrastructure projects.
- Strong understanding of hardware deployment workflows, ideally from working at a cloud provider, colocation facility, or specialized AI company.
- Familiarity with basic data center concepts, meaning you understand the fundamentals of rack space, power limits, and cooling requirements needed for dense AI hardware.
- A solid grasp of the procurement lifecycle, including tracking orders, forecasting demand, and managing long lead times to ensure hardware shows up on time.
- Basic familiarity with the AI/ML hardware space, including why customers use GPUs and how, at a high level, the physical requirements differ between training and inference workloads.
- Experience with external vendor coordination, effectively working with colocation facility managers or remote hands to guide physical work happening off-site.
- Strong risk management skills, with a proven ability to spot deployment risks early, whether it is a delayed hardware shipment or a facility constraint, and build plans to keep the project moving.
- Cross-functional teamwork capabilities, acting as a reliable bridge between deeply technical teams (like network and data center engineering) and operational teams (like supply chain).
- Proficiency in project management tooling, including comfort with standard tracking and ticketing systems to document deployment steps, track hardware inventory, and report on progress.
- Excellent communication and organization skills, with a proven ability to bring order to fast-moving environments and keep diverse teams focused on a shared timeline.
Compensation Range:
*This is a hybrid role.
Application Limit: You may apply to a maximum of 3 positions within any 180-day period. This policy promotes better role-candidate matching and encourages thoughtful applications where your qualifications align most strongly.