Strategic Technical Account Manager GPU

Vultr

$115K - $140K
US-Anywhere, Remote in United States
Technical Services
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 2-5+ years in AI/ML Engineering, Technical Account Management, or relevant technical roles.
  • Strong understanding of GPU hardware (NVIDIA/AMD) and ML frameworks.
  • Experience with Linux tuning and advanced networking fabrics (InfiniBand, RoCE).
  • Familiarity with high-performance storage systems (DDN, NetApp, etc.).
  • Excellent communication skills for explaining technical concepts to varied audiences.
  • Prior experience with hyperscale deployments is advantageous.
  • Cloud Native Computing Foundation Certified Kubernetes Administrator certification is preferred.

Responsibilities

  • Lead onboarding for customers using GPU clusters in various configurations.
  • Advise on optimal cluster design and networking/storage requirements.
  • Guide customers in GPU selection based on specific workloads.
  • Support various distributed training frameworks and tools.
  • Identify and resolve performance bottlenecks across systems.
  • Manage technical relationships and strategies with key accounts.
  • Coordinate incident management and resolve technical issues.

Benefits

  • Opportunity to engage with cutting-edge GPU technology and AI applications.
  • Collaborative work environment with technical experts.
  • Direct influence on customer success in high-performance workloads.
  • Participation in shaping product features and offerings.
  • Exposure to diverse AI and cloud-native technologies.
Full Job Description
Join Vultr

The GPU-focused Technical Account Manager (TAM) leads the post-sales technical success of customers deploying large-scale AI training, inference, and high-performance GPU workloads on the company's platform. This includes customers using NVIDIA GPU clusters, AMD GPU clusters, GPU VMs, and rack-scale bare-metal environments.

You will act as a trusted advisor across LLM training, fine-tuning, RAG workloads, distributed training frameworks, storage throughput requirements, multi-GPU scaling, and performance tuning. This role requires deep technical fluency and exceptional customer management skills to help AI/ML teams achieve predictable, cost-efficient, high-performance outcomes.

Key Responsibilities

AI/GPU Onboarding & Workload Architecture
  • Lead onboarding for customers deploying GPU clusters (bare metal, VMs, or hybrid).
  • Advise on cluster design: multi-GPU topology, NVLink/NVSwitch considerations, RDMA fabrics (InfiniBand and RoCE over Ethernet), networking throughput, and storage IOPS requirements.
  • Guide customers in selecting GPU types and configurations based on workload (training, fine-tuning, inference, embeddings, RAG pipelines).
  • Support distributed frameworks: PyTorch, TensorFlow, DeepSpeed, Megatron, JAX, Ray, Mosaic, Hugging Face, etc.
  • Advanced hands-on Kubernetes skills.
  • Advanced hands-on SLURM skills.

Performance Optimization & Scaling
  • Identify bottlenecks (network, storage, memory bandwidth).
  • Provide tuning recommendations for batch size, mixed precision, parallelization strategies, and checkpointing.
  • Help customers evaluate cost vs. performance tradeoffs (GPU mix, CPU pairing, instance types, cluster sizing).

Technical Relationship Ownership
  • Own the long-term technical strategy across assigned GPU/AI accounts, including hyperscalers, labs, and high-growth AI startups.
  • Host recurring technical review meetings, roadmap reviews, and optimization sessions.
  • Define scaling plans, future GPU reservation needs, and capacity forecasting.

Incident & Escalation Management
  • Partner with Support, SRE, Networking, NOC, and Product Management & Engineering to resolve high-urgency incidents.
  • Manage outage communications, corrective action plans, and postmortem reviews with customers.
  • Advocate for GPU reliability improvements and influence roadmap priorities.

Account Growth & Expansion
  • Identify opportunities for expanded clusters, high-speed storage, or networking upgrades.
  • Support Sales with technical validation and architecture diagrams needed for expansion.

Customer Advocacy & Product Feedback
  • Provide structured feedback on existing and future GPU offerings, networking fabrics, storage platforms, and upcoming AI/ML platform features.
  • Partner with Product on early access programs (new GPUs, pipelines, orchestration, etc.).


Qualifications
  • 2-5+ years as an AI/ML Engineer, AI/ML Ops Engineer, Technical Account Manager, HPC Engineer, Sales/Solutions Engineer, or in a relevant technical role.
  • Strong knowledge of GPU hardware architectures (NVIDIA/AMD), CUDA/ROCm, distributed training, and ML frameworks.
  • Experience with Linux tuning and networking fabrics (InfiniBand, RoCE).
  • Experience with high-performance storage systems (DDN, NetApp, Vast, Weka, etc.).
  • Ability to communicate complex concepts clearly to both executives and engineering teams.
  • Prior experience supporting hyperscale, AI labs, or large cluster deployments is a plus.
  • Cloud Native Computing Foundation Certified Kubernetes Administrator (CKA) certification is a plus.

Compensation

$115,000 - $140,000

This salary can vary based on location, years of experience, background and skill set.
