Staff AI/ML Infrastructure Engineer

Vultr

$145K - $160K
US-Anywhere, Remote in United States
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 5+ years with bare metal infrastructure and hardware automation
  • Hands-on experience with modern NVIDIA/AMD GPU platforms
  • Deep knowledge of BIOS, BMC, firmware, NICs, and PCIe systems
  • Strong Linux systems experience including device drivers
  • Experience building automation using Python and Bash
  • Familiarity with GPU drivers and vendor collaboration
  • Experience designing complex infrastructure products
  • Proven project leadership and mentoring capabilities
  • Experience optimizing multi-cluster GPU environments
  • Exposure to Machine Learning stacks and GPU workloads

Responsibilities

  • Design and maintain GPU and bare metal infrastructure
  • Build scalable GPU clusters with networking teams
  • Ensure reliable provisioning of GPU infrastructure
  • Develop automated testing systems for GPU platforms
  • Implement solutions for diverse AI/ML workloads
  • Benchmark and troubleshoot GPU performance
  • Collaborate with hardware vendors on drivers and support
  • Optimize performance across architectures
  • Lead technical direction and mentor engineers

Benefits

  • Opportunity to build next-gen AI infrastructure
  • Be part of a high-growth technology company
  • Hands-on technical leadership role
  • Collaboration with diverse teams and vendors
  • Focus on operational excellence and innovation

Full Job Description

Join Vultr

Vultr is seeking a highly skilled and experienced Staff AI/ML Infrastructure Engineer to drive the design, performance, and reliability of our AI infrastructure platform. The ideal candidate is a hands-on infrastructure expert with deep GPU systems knowledge, strong automation experience, and a track record of technical leadership in high-performance environments. This is a highly visible role in a high-growth technology company, requiring ownership of complex hardware and software systems, collaboration across engineering and vendor partners, and a relentless focus on operational excellence. This is your opportunity to build the foundation powering next-generation AI workloads and leave a lasting mark on Vultr and the future of cloud infrastructure.

Key Responsibilities
  • Design and maintain GPU and bare metal infrastructure in containerized and physical environments
  • Build scalable GPU clusters in partnership with networking and provisioning teams
  • Ensure reliable, high-performance provisioning of GPU infrastructure
  • Develop automated testing systems for GPU-based platforms
  • Implement infrastructure solutions for diverse AI/ML workloads
  • Benchmark, test, and troubleshoot GPU performance at scale
  • Collaborate with hardware vendors on drivers, firmware, and support
  • Resolve hardware, software, and performance issues across environments
  • Optimize rail and cluster performance across architectures
  • Lead technical direction and mentor engineers on infrastructure best practices


Qualifications
  • 5+ years of experience working with bare metal infrastructure and hardware automation
  • Hands-on experience with modern NVIDIA/AMD GPU platforms and high-performance networking (RoCE, InfiniBand)
  • Deep knowledge of BIOS, BMC, firmware, NICs, Redfish/IPMI, and PCIe systems
  • Strong Linux systems experience including device drivers and package management
  • Experience building infrastructure automation using Python and Bash
  • Familiarity with GPU drivers, firmware ecosystems, and vendor collaboration
  • Experience designing and delivering complex infrastructure products
  • Proven ability to lead projects and mentor engineers
  • Experience optimizing multi-cluster GPU environments
  • Exposure to Machine Learning software stacks and GPU workloads


Compensation

$145,000 - $160,000

This salary can vary based on location, years of experience, background and skill set.
