Member of Technical Staff, Kernel Engineering

Inferact

• $200K — $400K *

San Francisco, CA 94112 (In-Person)

Consumer Technology

Less than 5 years of experience

Today

Job Overview by Ladders

Responsibilities

Write kernels and low-level optimizations for the vLLM inference engine.
Optimize code to achieve maximum performance across a wide array of hardware accelerators.
Collaborate directly with hardware vendors on performance extraction for new chip generations.
Analyze how kernels use the GPU architecture and memory, and optimize them for efficiency.
Utilize profiling tools to identify bottlenecks and implement performance improvements.

Benefits

Generous health, dental, and vision benefits.
401(k) company match.

Full Job Description

About the Role

We're looking for a performance engineer to squeeze every FLOP out of modern accelerators. You'll write the kernels and low-level optimizations that make vLLM the fastest inference engine in the world. Your code will run on hundreds of accelerator types, from NVIDIA GPUs to emerging silicon. When hardware vendors develop new chips, they integrate with vLLM. You'll work directly with these teams to ensure we're extracting maximum performance from every generation of hardware.

Skills and Qualifications

Minimum qualifications:

Bachelor's degree or equivalent experience in computer science, engineering, or similar.
Deep experience writing CUDA kernels or equivalent (CuTeDSL, Triton, TileLang, Pallas).
Strong understanding of GPU architecture: memory hierarchy, warp scheduling, tiling, tensor cores.
Proficiency in C++ and Python with demonstrated ability to write high-performance code.
Experience with profiling tools (Nsight, rocprof) and performance optimization methodologies.
Obsession with benchmarks and squeezing every percentage point of speedup.

Preferred qualifications:

Experience with ML-specific kernel optimization (FlashAttention, fused kernels).
Knowledge of quantization techniques (INT8, FP8, mixed-precision).
Familiarity with multiple accelerator platforms (NVIDIA, AMD, TPU, Intel).
Experience with compiler technologies (LLVM, MLIR, XLA).

Bonus points if you have:

Kernel-related contributions to vLLM or other inference engine projects.
Contributions to open-source GPU, ML systems, or compiler optimization projects.
Written in-depth technical blog posts on GPU optimization.

Logistics

Location: This role is based in San Francisco, California. We will consider remote work within the US for exceptional candidates.
Compensation: Depending on background, skills, and experience, the expected annual salary range for this position is $200,000 - $400,000 USD + equity.
Visa sponsorship: We sponsor visas on a case-by-case basis.
Benefits: Inferact offers generous health, dental, and vision benefits as well as 401(k) company match.

* Ladders Estimates
