Amazon

AI Inference Engineer - Model Optimization & Deployment

Amazon$242K — $290K *
Transportation
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Deep expertise in model quantization (PTQ, QAT) and mixed-precision workflows.
  • Proven experience optimizing large-scale models (LLMs, VLMs) for efficiency.
  • Extensive experience with TensorRT for model conversion and benchmarking.
  • Proficiency in low-level programming for AI accelerators, including CUDA and TensorRT Plugins.
  • Production-level skills in C++ and Python for real-time inference code.

Responsibilities

  • Optimize large-scale models using advanced quantization and fine-tuning techniques.
  • Architect model conversion and compilation pipelines for edge deployment.
  • Conduct parity checking and latency benchmarking between frameworks and edge binaries.
  • Write and optimize custom CUDA kernels for AI accelerators.
  • Develop production-level concurrent C++ and Python code for vehicle SOCs.

Benefits

  • Paid time off including sick leave and vacation.
  • Health insurance and long-term care insurance.
  • Long-term and short-term disability insurance.
  • Life insurance and unpaid time off.
  • Zoox Stock Appreciation Rights and Amazon RSUs.
Full Job Description
The Perception team is pioneering the development of a multi-modality foundation model to drive the next generation of autonomous system intelligence.

As a Model Optimization & Deployment Engineer, you will focus on bringing highly efficient, production-ready large-scale models to our on-vehicle stack. We are looking for experts with hands-on experience in compressing, accelerating, and deploying complex models (LLMs, VLMs, or FMs) for power- and thermal-constrained vehicle SOCs. You will optimize the ML models, write custom CUDA kernels, and build highly concurrent inference code to ensure real-time, deterministic execution on edge devices.

In this role, you will:

  • Optimize large-scale models (LLMs, VLMs) using advanced quantization (PTQ, QAT), mixed-precision inference workflows, and parameter-efficient fine-tuning (LoRA, QLoRA).
  • Architect and implement model conversion and compilation pipelines using TensorRT and TensorRT-LLM for edge deployment.
  • >
  • Perform rigorous parity checking, accuracy recovery, and latency benchmarking between PyTorch frameworks and compiled edge binaries.
  • >
  • Write and optimize custom CUDA kernels and TensorRT Plugins to maximize memory bandwidth and minimize latency on AI accelerators.
  • >
  • Write production-level, highly concurrent, and memory-safe C++ and Python code for real-time inference on vehicle SOCs.
  • >


Qualifications:

  • Deep expertise in model quantization (PTQ, QAT) and mixed-precision inference workflows (INT8, FP8, INT4, BF16/FP16).
  • >
  • Proven experience optimizing large-scale models (LLMs, VLMs, or VLAs) utilizing KV-cache optimization (e.g., PagedAttention), Speculative Decoding, and Efficient Attention mechanisms (FlashAttention, Linear Attention).
  • >
  • Extensive experience with model conversion/compilation pipelines (TensorRT, TensorRT-LLM) and performing rigorous parity/latency benchmarking.
  • >
  • Proficiency in low-level programming for AI accelerators, specifically writing and optimizing custom CUDA kernels and TensorRT Plugins.
  • >
  • Production-level C++ (14/17/20) and Python programming skills, with experience writing concurrent, memory-safe, real-time inference code for edge devices.


Bonus Qualifications:

  • Experience with distributed training pipelines and model/tensor parallelism (PyTorch Distributed, Ray, DeepSpeed, Megatron-LM) and runtime efficiency optimization for GPU clusters.
  • >
  • Familiarity with autonomous driving perception stacks (temporal 3D object detection, BEV, 3D Occupancy Networks) and processing multi-modal sensor streams (Vision, LiDAR, Radar).
  • >
  • Understanding of end-to-end autonomous driving paradigms (VLA models, closed-loop simulation validation).


$242,000 - $290,000 a year

Base Salary Range

There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. A sign-on bonus may be offered as part of the compensation package. The listed range applies only to the base salary. Compensation will vary based on geographic location and level. Leveling, as well as positioning within a level, is determined by a range of factors, including, but not limited to, a candidate's relevant years of experience, domain knowledge, and interview performance. The salary range listed in this posting is representative of the range of levels Zoox is considering for this position.

Zoox also offers a comprehensive package of benefits, including paid time off (e.g. sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.

About Amazon

Audible is a provider of spoken audio information and entertainment , on the Internet. They provide premium spoken audio content, such as audio versions of books and newspapers and radio programs, that is delivered over the Internet and played back on personal computers and hand-held electronic devices. The Audible service allows consumers to purchase and download their content from their Website, store it in digital files and play it back on personal computers and electronic devices. More than 15,000 hours of audio content are available on their Web site, including audio versions of books, periodicals and radio programs. Several manufacturers have agreed to support and promote the playback of their content on their hand-held audio-enabled electronic devices.

Amazon Careers

Joining Amazon presents an unparalleled opportunity to become part of a vibrant team pushing the boundaries of innovation and growth in the global marketplace. As a leader in e-commerce, technology, and logistics, Amazon offers a variety of job opportunities that cater to a range of skills and professional interests. Work You’ll Do At Amazon, every day is an opportunity to collaborate with the brightest minds in technology and business to redefine what’s possible. Whether you’re interested in software development, marketing, human resources, or customer service, Amazon has a position waiting for you. Transform the way the world shops and innovates with our diverse and inclusive team. Amazon is not just a company; it’s a community where you can drive real change and contribute to projects impacting millions globally. Lead with Innovation and Leadership Amazon is the perfect place to enhance your leadership and innovation skills. Our culture encourages pushing the envelope and imagining the unimaginable. Here, you will lead projects that challenge the status quo and define new industry standards. Work with a team that values diversity and is committed to creating an inclusive environment. Our leadership is focused on harnessing the collective power of unique perspectives to foster growth and innovation. Explore Amazon’s Employment Benefits Amazon’s commitment to its employees extends beyond just career growth. We offer competitive benefits, including health care, parental leave, and diversity training, ensuring that our team not only excels professionally but also enjoys well-being and security. Internship and Networking Opportunities Start your career with an Amazon internship and gain hands-on experience that matters. Our internships provide a gateway to full-time employment and an opportunity to network with professionals across various sectors of the company. Future-Proof Your Career With Amazon, your career path is filled with numerous opportunities for advancement. Our learning and development programs are designed to nurture your professional growth and keep you at the forefront of industry trends. Stay Connected Join Our Team Discover the job opportunities at Amazon that match your skills and interests. We are constantly on the lookout for passionate, curious, and innovative team players ready to make a difference. Keep Up to Date Stay ahead with career tips, insider perspectives, and industry-leading insights you can put to use today—all from the people who work here. Job Alert Emails Customize your subscription to receive job alerts, the latest news, and insider tips tailored to your preferences. Explore the exciting and rewarding career opportunities that await at Amazon. Amazon is more than just a company—it’s a platform for building a promising future. Whether you’re starting or looking to advance your career, Amazon offers the resources, support, and network you need to succeed. Join us, and be a part of our continuing mission to be Earth's most customer-centric company.
Learn more about Amazon
Size
1,608 employees
Market Cap
$832.6 billion
Industry
Net Income
$21.3 billion
Founded
1994
5 Year Trend
+28.1%
Revenue
$386 billion
NASDAQ

Similar Jobs

More Jobs at Amazon

More Transportation Jobs

Find similar AI Inference Engineer - Model Optimization & Deployment jobs: