NVIDIA Corporation

Principal Software Engineer, E2E Performance and Goodput - CSP Engagements

NVIDIA Corporation, $272K - $431K
US-Anywhere, Remote in United States
Information Technology
11 - 15 years of experience
Job Overview by Ladders

Qualifications

  • 15+ years in systems performance engineering, ideally in GPU/HPC/ML infrastructure
  • BS or MS in Computer Science, Computer Engineering, or equivalent experience
  • Proficient in GPU workload profiling tools like Nsight Systems
  • Understanding of distributed training performance dynamics
  • Knowledge of statistical methods for performance analysis
  • Strong data analysis and visualization skills using Python
  • Ability to communicate performance findings to technical and executive audiences

Responsibilities

  • Drive performance characterization work streams with key CSP customers
  • Gather and synthesize CSP performance feedback for optimization priorities
  • Update and validate performance tools for NVIDIA systems
  • Ensure CSP performance tooling reflects latest GPU capabilities
  • Conduct cross-CSP performance comparisons and analysis
  • Collaborate with CSPs to fulfill performance-related integration needs
  • Define testing strategies and tool requirements for performance validation

Benefits

  • Equity participation
  • Comprehensive benefits package
  • Opportunity to work with cutting-edge NVIDIA technology
  • Collaboration with leading CSP and hyperscale customers

Full Job Description
We're looking for a Principal Engineer to join our CSP Engagements team as the technical focal point for end-to-end performance, working directly with engineering teams of key CSP/hyperscale customers to ensure they achieve various performance targets on NVIDIA platforms. In this role, you will augment NVIDIA's performance and benchmark teams with a dedicated CSP-facing focus. You will drive work streams with CSP engineering teams to build shared understanding of platform performance characteristics, gather and incorporate their workload-specific feedback into NVIDIA's optimization priorities, and validate that performance targets are met in customer-representative configurations. Your cross-CSP visibility enables you to identify patterns and drive systemic improvements in documentation, configuration guidance, and tooling.

What you'll be doing:
  • Drive performance characterization work streams with engineering teams of key CSP/hyperscale customers - ensuring they understand platform performance expectations, profiling methodology, and tuning options for their specific workloads
  • Gather and synthesize CSP performance feedback - identify gaps between expected and actual throughput, and champion optimization priorities back into NVIDIA's CUDA, NCCL, driver, and firmware teams
  • Ensure key open-source performance and stress tools (e.g., STREAM, GPU Burn, GPU BLAST) are updated and validated for the latest NVIDIA rack-scale systems, GPU architectures, and CPU platforms - so customers and internal teams have reliable baseline measurements from day one
  • Work closely with CSPs to ensure their own performance and validation tooling reflects the latest GPU capabilities, memory hierarchy changes, and platform-specific tuning parameters
  • Conduct cross-CSP performance comparison and pattern analysis - identify configuration, software, or workload differences that explain performance gaps between deployments
  • Collaborate with CSPs to ensure performance-related integration work (profiling infrastructure, benchmark harnesses, config validation) is ready ahead of deployment milestones
  • Define test strategies and tooling requirements for performance validation - both for NVIDIA internal certification and customer acceptance


What we need to see:
  • 15+ years of experience in systems performance engineering, ideally in GPU/HPC/ML infrastructure
  • BS or MS in Computer Science, Computer Engineering, or related field (or equivalent experience)
  • Proficiency in GPU workload profiling: Nsight Systems, Nsight Compute, DCGM metrics, or equivalent instrumentation
  • Understanding of distributed training performance dynamics: computation/communication overlap, pipeline bubbles, memory bandwidth utilization, collective efficiency
  • Statistical methods for performance analysis: regression detection, confidence intervals, A/B comparison at scale
  • Understanding of how the full software stack impacts performance: driver overhead, collective algorithm selection, memory allocation, scheduling, firmware power management
  • Strong data analysis and visualization skills (Python, pandas, dashboards)
  • Customer obsession - genuine passion for understanding why customers aren't achieving expected performance and driving solutions
  • Ability to communicate performance findings to both deep technical audiences and executive leadership
  • Demonstrated success influencing multiple engineering teams to prioritize performance improvements


Ways to stand out from the crowd:
  • Experience profiling and optimizing distributed training at 1000+ GPU scale (Megatron-LM, DeepSpeed, FSDP)
  • Background in ML infrastructure performance at a CSP/hyperscaler
  • Familiarity with NVIDIA platforms (DGX, HGX, NVLink topology) and profiling tools
  • Experience building automated performance regression detection systems for production environments
  • Understanding of inference workload performance dynamics (vLLM, TensorRT-LLM, SGLang, continuous batching)


Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 272,000 USD - 431,250 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until June 30, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

About NVIDIA Corporation

Nvidia, a global leader in graphics, gaming, and AI technology, offers careers and internship opportunities for those passionate about driving innovation in the tech industry. You'll find a company committed to growth, teamwork, and leadership in computer science and machine learning domains.

About Nvidia

A Pioneer in Technology and Innovation

Nvidia has cemented its reputation as a powerhouse in developing advanced graphics processing units (GPUs) and has significantly contributed to the gaming industry's evolution. Moreover, its foray into AI and machine learning has opened new frontiers in technology, making Nvidia a beacon of innovation and a desirable workplace for ambitious tech professionals.

Job Opportunities

Diverse Positions in a Dynamic Field

Nvidia is continuously on the lookout for talented individuals across various domains, including hardware and software engineering, product design, marketing, and sales. Employment opportunities at Nvidia are vast, catering to a wide range of expertise and career aspirations.

Employment in Hardware and Graphics

For those fascinated by the intricacies of hardware and graphics technology, Nvidia offers positions that sit at the forefront of gaming and computing advancements.

Growth in Machine Learning and AI

Nvidia's leadership in AI and machine learning has created numerous vacancies for specialists eager to contribute to groundbreaking projects.

Recruitment in Computer Science

With the constant demand for innovation, Nvidia's recruitment efforts focus on computer science experts capable of pushing the boundaries of what's possible.

Internship Program

Opening Doors to Future Innovators

Nvidia's internship program is designed to nurture the next generation of technology leaders, offering hands-on experience in a culture that celebrates creativity and teamwork.

Benefits and Culture

Interns at Nvidia enjoy a plethora of benefits, from competitive stipends to mentorship opportunities, all within an environment that values growth and learning.

Opportunities for Students

Whether you're an undergraduate, a master's student, or a Ph.D. candidate, Nvidia's internships provide a real-world glimpse into the tech industry, offering valuable experience in various technology fields.

Pathways to Full-Time Employment

Many interns have transitioned into full-time positions, marking the start of successful careers at Nvidia. The internship program is more than a stepping stone into the company; it’s an investment in the professional development of interns. The goal is to ensure that interns are well-equipped for future challenges.

Nvidia Careers: More Than Just a Job

Nvidia offers more than just a job to its employees; it provides a front-row seat on the journey into the future of technology. Nvidia stands as a pillar of innovation with its vast opportunities in hardware, graphics, gaming, machine learning, and computer science. Nvidia careers serve as a launching pad for talented workers who aim to redefine the technological landscape. Whether through full-time positions or internships, joining Nvidia means contributing to a legacy of breakthroughs and becoming part of a global community dedicated to pushing the boundaries of what's possible.

Learn more about NVIDIA Corporation

Size: 22,473 employees
Market Cap: $350.4 billion
Net Income: $4.3 billion
Founded: 1993
5 Year Trend: +31.3%
Revenue: $16.6 billion
Listed on: NASDAQ
