Staff AI Inference and Acceleration Engineer

Figure AI

• $180K — $275K *

San Jose, CA 95123 (In-Person)

Information Technology

8 - 10 years of experience

Today

Job Overview by Ladders

Qualifications

M.S. or Ph.D. in relevant fields or equivalent experience
8+ years in hardware acceleration or ML systems
Expertise in AI/ML inference models and pipelines
Experience optimizing models for edge hardware
Solid understanding of computer architecture and data movement
Familiar with low-level toolchains and compilation frameworks such as TensorRT and JAX
Strong programming skills in C++ and Python

Responsibilities

Own the on-board inference architecture for humanoid robots
Map models to compute accelerators based on performance constraints
Partition workloads across various compute resources
Define compute budgets for inference tasks
Evaluate new acceleration hardware and roadmap
Optimize inference pipelines from model export to execution
Identify and resolve performance bottlenecks in inference
Collaborate with AI/ML teams to ensure hardware-friendly model design
Integrate runtime, scheduling, and power management into platform software
Engage with hardware vendors to influence future designs

Benefits

Comprehensive health benefits
401(k) plan with company matching
Flexible working hours
Professional development opportunities
Work in an innovative, cutting-edge field

Full Job Description

We are looking for a Staff AI Inference & Acceleration Engineer to join the Platform Software team and own the on-board inference architecture for Figure's humanoid robots. You will be the technical authority on how AI workloads are mapped, optimized, and executed across the robot's compute hardware - driving down power consumption and cost while meeting the strict latency and reliability demands of a real-time autonomous system.

Responsibilities:

• Own the on-board inference architecture - mapping models to available accelerators (NPU, GPU, DSP, CPU) based on latency, power, and memory budgets.
• Partition inference workloads across heterogeneous compute resources, balancing real-time performance with power and thermal constraints.
• Define and maintain a system-level compute budget across all inference tasks running on the robot.
• Evaluate next-generation acceleration hardware and contribute to the definition of future compute platform requirements.
• Optimize inference toolchains end-to-end - from model export through runtime execution - for target hardware.
• Apply quantization (INT8, INT4, mixed-precision), pruning, operator fusion, and other compression techniques to reduce compute, memory, and power footprint.
• Profile inference pipelines to identify and eliminate bottlenecks in latency, memory bandwidth, and power consumption.
• Optimize kernel scheduling, memory layout, and data movement across the compute hierarchy.
• Partner closely with the AI/ML team to define model architecture constraints that are hardware-friendly from the outset.
• Work with the Platform Software team on runtime integration, scheduling, and power management.
• Engage with silicon vendors and research teams to track the accelerator landscape and influence hardware roadmaps.

Requirements:

• M.S. or Ph.D. in Computer Engineering, Electrical Engineering, Computer Science, or a related field - or equivalent industry experience.
• At least 8 years of industry experience in hardware acceleration, ML systems, or compute architecture.
• Deep understanding of AI/ML inference - model formats (ONNX, TFLite, etc.), inference runtimes, and deployment pipelines.
• Hands-on experience optimizing models for edge or embedded hardware using quantization, pruning, and operator-level tuning.
• Strong understanding of computer architecture - memory hierarchies, data movement, and heterogeneous compute.
• Experience profiling and benchmarking inference workloads across CPU, GPU, NPU, DSP.
• Familiarity with low-level toolchains and compilation frameworks (e.g. TVM, MLIR, TensorRT, Torch, SNPE/QNN, JAX, CUDA, ROCm).
• Solid software engineering skills in C++ and Python.
• Strong cross-functional communication skills - able to work effectively across hardware, software, and AI/ML teams.

Bonus Qualifications:

• Knowledge of real-time operating constraints and their impact on inference scheduling.
• Track record of co-designing model architectures with ML teams to meet hardware constraints.

The US base salary range for this full-time position is between $180,000 - $275,000 annually. The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.

* Ladders Estimates
