NVIDIA Corporation

Senior System Software Engineer - DevOps and Infrastructure Automation

$184K - $287K *
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • BS/MS in Computer Science/Computer Engineering or equivalent experience, plus 7+ years of operating production distributed systems (SRE/DevOps/Platform Ops).
  • Deep Kubernetes expertise, including components, subsystems, and hands-on debugging.
  • Strong skills in CI/CD (GitLab CI, GitHub Actions) and proficiency with Linux systems programming and scripting in Python and Bash.
  • Fluency in Infrastructure as Code (Terraform, Ansible, Helm, Crossplane) and containerization (Docker, containerd, OCI).
  • Proven experience in reliability ownership, including SLOs/SLIs and incident response.

Responsibilities

  • Design, build, and operate the infrastructure backbone for AI inference products.
  • Own end-to-end Kubernetes deployments across cloud and on-prem environments.
  • Architect CI/CD pipelines for automated build and release of inference libraries and container-based software stacks.
  • Implement observability solutions for platform health, including dashboards and automated checks.
  • Manage environments using infrastructure-as-code tools and reduce toil through automation.
  • Oversee the security posture of infrastructure components including vulnerability scans and compliance.
  • Collaborate with deep learning framework engineers to streamline deployment processes.

Benefits

  • Equity participation in addition to salary.
  • Comprehensive benefits package.
  • Opportunities for professional growth and development.
  • Work in a leading technology company with innovative projects.
  • Access to cutting-edge AI tools and technology.

Full Job Description

Become a Senior System Software Engineer on NVIDIA's AI Inference Operations Team, focusing on DevOps and Infrastructure Automation. You will work alongside a team of passionate and skilled engineers who are continuously building better tools to deploy and manage AI inference infrastructure. With your help, we will forge the next generation of compute infrastructure. If you thrive at the intersection of systems programming, cloud-native infrastructure, and developer productivity, this is your opportunity to make a lasting impact at a leading technology company.

What you'll be doing:
  • Design, build, and operate the infrastructure backbone powering AI inference products - reliable, performant, and scalable at every layer!
  • Own Kubernetes deployments end-to-end across cloud and on-prem: runbooks, canary checks, post-deploy validation, and rollbacks when needed.
  • Architect CI/CD pipelines for automated build, test, packaging, and release of inference libraries and their container-based software stacks.
  • Build observability that actually tells the truth about platform health - dashboards, logs, metrics, automated checks - and lead first-level incident triage with clean, actionable handoffs to engineering.
  • Manage cloud and on-prem environments with infrastructure-as-code (Terraform, Ansible, Helm, Crossplane), and chip away at toil using GitHub Actions, GitLab CI, and custom tooling.
  • Own the security posture for infrastructure components: vulnerability scans, CVE remediation, and compliance with internal policies.
  • Collaborate closely with deep learning framework engineers, compiler teams, and platform architects to streamline end-to-end deployment!


What we need to see:
  • BS/MS in CS/CE or equivalent experience, plus 7+ years operating production distributed systems (SRE / DevOps / Platform Ops).
  • Deep Kubernetes expertise - components, subsystems, on-prem setup, and hands-on debugging of telemetry-heavy microservices across AWS, Azure, GCP, and on-prem.
  • Strong CI/CD chops (GitLab CI, GitHub Actions), Git-based workflows, Linux systems programming, and scripting in Python and Bash.
  • IaC fluency (Terraform, Ansible, Helm, Crossplane) and containerization depth (Docker, containerd, OCI).
  • Proven reliability ownership - SLOs/SLIs, on-call, incident response, and post-incident reviews that drive measurable improvements - backed by hands-on experience with observability stacks like Prometheus, Grafana, and Loki.
  • A clear communicator who writes runbooks people actually use!


Ways to stand out from the crowd:
  • MLOps experience - crafting, deploying, and operating machine learning pipelines end to end.
  • Experience in open-source development workflows and community engagement on projects like Triton Inference Server or ONNX Runtime.
  • Familiarity with GPU software stacks - CUDA, cuDNN, TensorRT, and inference serving frameworks.
  • Experience building custom test automation frameworks and using data-driven metrics to improve platform health and developer efficiency.
  • Demonstrated ability to debug complex issues spanning kernel modules, container runtimes, and distributed networking.


Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until June 12, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

About NVIDIA Corporation

Nvidia, a global leader in graphics, gaming, and AI technology, offers careers and internship opportunities for those passionate about driving innovation in the tech industry. At Nvidia, you'll find a company committed to growth, teamwork, and leadership in computer science and machine learning.

About Nvidia

A Pioneer in Technology and Innovation

Nvidia has cemented its reputation as a powerhouse in developing advanced graphics processing units (GPUs) and has significantly contributed to the gaming industry's evolution. Moreover, its foray into AI and machine learning has opened new frontiers in technology, making Nvidia a beacon of innovation and a desirable workplace for ambitious tech professionals.

Job Opportunities

Diverse Positions in a Dynamic Field

Nvidia is continuously on the lookout for talented individuals across various domains, including hardware and software engineering, product design, marketing, and sales. Employment opportunities at Nvidia are vast, catering to a wide range of expertise and career aspirations.

Employment in Hardware and Graphics

For those fascinated by the intricacies of hardware and graphics technology, Nvidia offers positions that sit at the forefront of gaming and computing advancements.

Growth in Machine Learning and AI

Nvidia's leadership in AI and machine learning has created numerous vacancies for specialists eager to contribute to groundbreaking projects.

Recruitment in Computer Science

With the constant demand for innovation, Nvidia's recruitment efforts focus on computer science experts capable of pushing the boundaries of what's possible.

Internship Program

Opening Doors to Future Innovators

Nvidia's internship program is designed to nurture the next generation of technology leaders, offering hands-on experience in a culture that celebrates creativity and teamwork.

Benefits and Culture

Interns at Nvidia enjoy a plethora of benefits, from competitive stipends to mentorship opportunities, all within an environment that values growth and learning.

Opportunities for Students

Whether you're an undergraduate, a master's student, or a Ph.D. candidate, Nvidia's internships provide a real-world glimpse into the tech industry, offering valuable experience in various technology fields.

Pathways to Full-Time Employment

Many interns have transitioned into full-time positions, marking the start of successful careers at Nvidia. The internship program is more than a stepping stone into the company; it’s an investment in the professional development of interns. The goal is to ensure that interns are well-equipped for future challenges.

Nvidia Careers: More Than Just a Job

Nvidia offers more than just a job to its employees; it provides a front-row seat on the journey into the future of technology. Nvidia stands as a pillar of innovation with its vast opportunities in hardware, graphics, gaming, machine learning, and computer science. Nvidia careers serve as a launching pad for talented workers who aim to redefine the technological landscape. Whether through full-time positions or internships, joining Nvidia means contributing to a legacy of breakthroughs and becoming part of a global community dedicated to pushing the boundaries of what's possible.

Learn more about NVIDIA Corporation

Size: 22,473 employees
Market Cap: $350.4 billion
Industry:
Net Income: $4.3 billion
Founded: 1993
5 Year Trend: +31.3%
Revenue: $16.6 billion
NASDAQ
