NVIDIA has a rapidly expanding ecosystem of data center platform and node designs, from single-node HGX/DGX systems all the way up to large multi-node NVLink domain rack architectures. These designs have become core to NVIDIA's rapidly growing enterprise and cloud provider businesses, each bringing together the full power of NVIDIA GPUs, NVIDIA NVLink, NVIDIA InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. We're searching for a highly motivated technical leader to design, drive, and operationalize rack-scale factory and deployment flows for next-generation data center products. The ideal candidate will combine deep systems expertise, decisive technical leadership, and a passion for building reliable, debuggable, and scalable manufacturing and deployment solutions.
What you'll be doing:
- Lead and drive rack-scale/L11 flows for factory and initial data center deployment.
- Design and implement end-to-end factory workflows, including firmware flashing sequences, security provisioning, and deployment of software mitigations.
- Collaborate with data center architects, ODMs, and OEMs to define factory and data center requirements that ensure efficient and reliable production ramp.
- Champion reliability, debuggability and optimization in firmware, diagnostic and deployment tool design.
- Drive pre-silicon readiness for factory and manufacturing workflows for rack-scale products, using NVIDIA's industry-leading simulation and emulation technology.
- Mentor architects and engineering teams to grow them into future leaders.
- Make key technical decisions, even when faced with ambiguity.
What we need to see:
- BS or MS degree in Computer Engineering, Computer Science, or a related field, or equivalent experience.
- 8+ years of experience in system architecture and design.
- Deep experience in designing architecture for scalable and performant server systems, particularly at the SW/HW interface.
- Strong understanding of networking technologies and protocols (e.g., Ethernet, InfiniBand).
- Previous experience working with complex system software for accelerators such as GPUs, DPUs, or FPGAs.
- Expertise in out-of-band and in-band management architectures.
- Knowledge of system management protocols such as Redfish and IPMI.
- Demonstrable experience implementing shift-left strategies to de-risk program execution.
- Excellent written and verbal communication skills.
Ways to stand out from the crowd:
- Knowledge of large-scale cloud and cluster-level deployment and management systems.
- Demonstrated track record of leading data center products across the entire lifecycle, spanning inception, pre-silicon development, post-silicon bring-up, manufacturing, and deployment.
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.
You will also be eligible for equity and benefits.
Applications for this job will be accepted at least until June 29, 2026.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.