Technical Program Manager, Data Center Operations

TensorWave

$120K — $150K *
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • Bachelor's degree in a technical field such as Computer Science or Engineering.
  • 5+ years of technical program management experience, particularly in data center operations.
  • 3+ years of direct management experience with infrastructure programs at scale.
  • Proven ability to collaborate across facilities and engineering teams in a live environment.
  • Experience with operational systems in data centers including power and cooling.
  • Strong communication skills for translating technical information to non-technical audiences.
  • Proficiency with program management tools such as Jira.

Responsibilities

  • Manage the full lifecycle of data center operations programs across multiple sites.
  • Coordinate efforts between facilities, networking, hardware, and software teams to ensure SLA adherence.
  • Track program milestones and resources for concurrent multi-site operations.
  • Report operational status and risk to engineering and executive leadership.
  • Identify and mitigate potential risks to site reliability and uptime.
  • Oversee hardware deployment and maintenance sequencing.
  • Conduct post-incident reviews to implement process improvements.

Benefits

  • Stock Options
  • 100% paid Medical, Dental, and Vision insurance for Employees
  • Contributions to Company Health Savings Account
  • 100% paid Short Term and Long Term Disability Insurance for Employees
  • Life and Voluntary Supplemental Insurance Options
  • Various Supplementary Health Benefits
  • Flexible Spending Account
  • 401(k)
  • Employee Assistance Program
  • Flexible PTO
  • Paid Holidays
  • Parental Leave
  • Other In-Office Perks

Full Job Description

About the Role

TensorWave is seeking an experienced Technical Program Manager with a strong data center operations background to lead and scale the operational programs that keep our next-generation AI infrastructure running at peak performance.

In this role, you'll own the full lifecycle of data center operations programs, from hardware deployment and capacity management through incident response, change management, and continuous reliability improvement. You'll be the operational spine connecting facilities, network, hardware, DevOps, and SRE teams with executive leadership to ensure our AMD-powered AI clusters deliver on customer commitments at scale, with zero tolerance for preventable downtime.

This is a high-visibility, high-impact role for someone who brings operational discipline to complex, fast-moving environments and maintains clear communication and structured execution under pressure.

What You'll Do
  • Own end-to-end program management for data center operations across multiple sites, covering hardware lifecycle, capacity planning, change management, incident response, and operational readiness
  • Serve as the primary coordination point across facilities, networking, hardware, and software teams, driving accountability to operational SLAs, program schedules, and customer commitments
  • Define and track program milestones, critical path dependencies, and resource requirements across concurrent multi-site operational programs
  • Translate operational status and risk into clear reporting for engineering, product, and executive leadership audiences
  • Identify, escalate, and mitigate risks to site reliability, capacity availability, and customer-facing uptime before they become incidents
  • Coordinate hardware deployment, node lifecycle, and maintenance sequencing across sites in alignment with capacity and customer commitments
  • Partner with network, power, facilities, and infrastructure engineering teams to drive operational readiness for high-density GPU compute clusters
  • Own post-incident retrospectives and corrective action tracking, driving lessons learned into durable process improvements
  • Maintain program documentation including operational runbooks, risk registers, vendor trackers, capacity plans, and change management records


Who You Are

Required Qualifications
  • Bachelor's degree in Computer Science, Electrical/Mechanical Engineering, Information Technology, or a related technical field (or equivalent practical experience)
  • 5+ years of technical program management experience, with at least 3 years directly managing data center operations, infrastructure programs, or critical facilities at scale
  • Demonstrated experience coordinating across facilities, network, and hardware engineering teams in a live production environment
  • Familiarity with data center operational systems and infrastructure: power distribution, cooling, structured cabling, and physical layer dependencies
  • Proven track record managing complex, multi-site operational programs on compressed timelines in a high-growth environment
  • Strong technical communication skills: able to translate operational status and risk for both field teams and executive stakeholders
  • Experience with Jira or equivalent program management tooling for milestone tracking, incident management, and cross-team coordination

Preferred Qualifications
  • Experience with high-density GPU or AI compute deployments and their operational demands
  • Background managing colocation or multi-tenant infrastructure environments
  • Familiarity with network infrastructure in data center environments: top-of-rack switching, structured fiber, spine-leaf topology
  • Experience with observability tooling (Grafana, Prometheus, or equivalent) for operational visibility
  • Prior experience at a hyperscaler, cloud provider, or high-growth AI infrastructure company
  • PMP or equivalent project management certification


What We Offer
  • Stock Options
  • 100% paid Medical, Dental, and Vision insurance for Employees
  • Company Health Savings Account Contributions
  • 100% paid Short Term and Long Term Disability Insurance for Employees
  • Life and Voluntary Supplemental Insurance Options
  • Other Insurance Options, such as Pet & Legal Insurance
  • Various Supplementary Health Benefits, such as discounted Virtual Healthcare Appointments and Serious Illness Support
  • Flexible Spending Account
  • 401(k)
  • Employee Assistance Program
  • Flexible PTO
  • Paid Holidays
  • Parental Leave
  • Other In-Office Perks
