Infrastructure Engineer - Virtualization

TensorWave

$90K — $130K *
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 4-7+ years in infrastructure or platform operations
  • Hands-on experience with Linux-based virtualization platforms like KVM/QEMU and Proxmox
  • Strong knowledge of Linux systems including process management and networking
  • Ability to troubleshoot resource contention and I/O bottlenecks
  • Familiar with VM lifecycle and resource allocation
  • Experience with automation tools like Ansible
  • Capability to manage incidents effectively during production issues

Responsibilities

  • Operate and maintain large-scale virtualization environments
  • Manage the full lifecycle of virtual machines
  • Monitor and respond to platform health issues
  • Troubleshoot and resolve issues across various layers
  • Execute infrastructure changes safely and effectively
  • Utilize automation tools for standardizing deployments
  • Collaborate with cross-functional teams for optimal performance

Benefits

  • Stock Options
  • 100% paid Medical, Dental, and Vision insurance
  • Company Health Savings Account Contributions
  • 100% paid Short and Long Term Disability Insurance
  • Life and Voluntary Supplemental Insurance Options
  • Various Supplementary Health Benefits
  • Flexible Spending Account
  • 401(k)
  • Employee Assistance Program
  • Flexible PTO
  • Paid Holidays
  • Parental Leave
  • Other In-Office Perks

Full Job Description

About the Role

We are building and operating large-scale infrastructure platforms to support high-performance AI workloads across multiple data centers. Our environment includes GPU-intensive systems, high-throughput networking, and rapidly scaling compute clusters.

We are looking for a Virtualization Operations Engineer to focus on the day-to-day operation, stability, and performance of our virtualization platforms. This role is responsible for ensuring that our hypervisor environments are reliable, performant, and scalable as we continue to grow.

This is a hands-on operations role working across hypervisors, virtual machines, and underlying infrastructure systems.

What You'll Do
  • Operate and maintain large-scale virtualization environments (Proxmox and/or KVM-based systems)
  • Manage the full lifecycle of virtual machines: provisioning, configuration, migration, and decommissioning
  • Monitor and respond to platform health issues, including host failures, VM performance degradation, and resource contention (CPU, memory, disk, network)
  • Troubleshoot and resolve issues across hypervisors, guest operating systems, and storage and networking layers
  • Execute infrastructure changes safely, including cluster expansions, host maintenance and upgrades, and configuration updates
  • Work with automation tools to standardize deployments, reduce manual intervention, and improve operational consistency
  • Collaborate with DevOps (automation and platform tooling), Network Engineering (connectivity and performance), and Storage Engineering (I/O performance and reliability)
  • Participate in incident response and root cause analysis
  • Contribute to runbooks, documentation, and operational best practices


Who You Are

Required Qualifications
  • 4-7+ years of experience in infrastructure, systems, or platform operations
  • Hands-on experience operating Linux-based virtualization platforms, such as KVM/QEMU, Proxmox, or VMware (with strong Linux fundamentals)
  • Strong Linux systems knowledge, including process management, networking, and disk and filesystem management
  • Experience troubleshooting CPU and memory contention, disk I/O bottlenecks, and network performance issues
  • Familiarity with virtualization concepts: VM lifecycle, resource allocation, and live migration
  • Experience with infrastructure automation tools (e.g., Ansible or similar)
  • Ability to work effectively during incidents and production issues

Preferred Qualifications
  • Experience operating infrastructure at scale (100+ hosts)
  • Familiarity with GPU-based systems or high-performance workloads, NUMA awareness, and performance tuning
  • Exposure to high-throughput networking (bonding, VLANs, SR-IOV) and distributed or high-performance storage systems
  • Experience working alongside Kubernetes or container platforms
  • Experience in cloud or CSP environments


What We Offer
  • Stock Options
  • 100% paid Medical, Dental, and Vision insurance for Employees
  • Company Health Savings Account Contributions
  • 100% paid Short Term and Long Term Disability Insurance for Employees
  • Life and Voluntary Supplemental Insurance Options
  • Other Insurance Options, such as Pet & Legal Insurance
  • Various Supplementary Health Benefits, such as discounted Virtual Healthcare Appointments and Serious Illness Support
  • Flexible Spending Account
  • 401(k)
  • Employee Assistance Program
  • Flexible PTO
  • Paid Holidays
  • Parental Leave
  • Other In-Office Perks

