Runtime Engineer

MatX

$120K — $475K *
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Strong systems programming skills in Rust, C, C++, or Go
  • Experience building production-level Python interop layers (e.g., PyO3, pybind11)
  • Proven track record designing and managing API/ABI contracts
  • Knowledge of accelerator programming models like CUDA or TPU
  • Understanding of ML systems and concepts like tensor layouts and collectives

Responsibilities

  • Build host-side interface library including memory management and sync primitives
  • Own the executable format's versioning and evolution
  • Design the custom-kernel ABI for interaction with host-side components
  • Create Python bindings, including C-ABI shim options
  • Develop LLM inference serving stack with request scheduling and token streaming
  • Manage interconnect topology and failure detection across racks
  • Expose chip performance metrics to profilers and meet performance targets

Benefits

  • Generous equity options and potential for early exercise
  • Comprehensive health, dental, vision, and life insurance
  • Four weeks paid time off plus additional flexible work options
  • Up to 12 weeks of paid parental leave for all paths to parenthood
  • Annual budget for professional development
  • Team-building activities like lunches and quarterly off-sites
  • Robust 401K plan with company contribution
  • Pre-tax flexible spending accounts for various expenses
  • Rideshare cost reimbursement for commuting
  • Monthly allowance for personal perks
  • Support for remote work with tech setup costs covered
Full Job Description
What You'll Do Here
  • Build the host-side interface library - device memory management, DMA, streams and events, sync primitives - that every compiler-emitted program runs on top of
  • Own and extend the executable format: the compiler-runtime contract, its versioning, and the weight and quantization layouts that let the compiler and runtime evolve independently
  • Design the custom-kernel ABI - calling convention, sync semantics, lifecycle - and the host-side marshaling layer (DLPack, the buffer protocol, numpy) that gets Python tensors to the device
  • Build Python bindings via PyO3, with a C-ABI shim as the alternative integration path for downstream consumers
  • Build the LLM inference serving stack - paged KV cache, continuous batching, request scheduling, token streaming - and the cluster orchestration primitives underneath it
  • Bring up interconnect topology from the host and own the failure-detection and clean-teardown path for stop-restructure-resume recovery across racks
  • Design what the chip exposes to host-side profilers and debuggers - perf counters, traces, and the Python surfaces ML engineers actually use - and hit measurable performance targets on runtime overhead and serving throughput
Who You Are
  • Strong experience in a systems programming language - Rust, C, C++, or Go - including memory management, allocator design, and FFI/ABI work
  • Have built Python interop layers in production (PyO3, ctypes, pybind11, or equivalent C-ABI bridging)
  • Have designed and maintained API or ABI contracts between teams - versioning, evolution, breaking-change discipline - not just consumed someone else's
  • Hands-on with at least one accelerator programming model (CUDA, ROCm, oneAPI Level Zero, TPU, or comparable) - enough to reason about device memory, async execution, and kernel launch
  • ML-systems literate - comfortable with the training and inference loop, what collectives do, what a tensor layout is. Research depth not required.
Bonus Points If You Have
  • LLM inference internals - vLLM, TensorRT-LLM, or SGLang (paged attention, scheduler design)
  • Rust at depth, including proc macros, unsafe with soundness reasoning, and complex lifetime/trait work
  • Custom allocator design (slab, paged, arena) or other low-level memory work
  • ML framework integration experience (PyTorch custom backends, JAX/XLA, ONNX runtime)
  • Profiler or tracing infrastructure work (perfetto, Nsight, or a custom stack)
  • Driver-adjacent or kernel-bypass work, or prior new-silicon bring-up
Compensation

The US base salary for this full-time position is determined by a variety of factors, including role, experience, location, job-related skills, and relevant education and training. Career length is only a guideline for compensation.
  • Early Career - $120,000 - $250,000 + equity
  • Mid Career - $175,000 - $362,500 + equity
  • Senior Career - $250,000 - $475,000 + equity
What We Offer
  • A Stake in Our Success: Generous equity, with an optional cash/equity swap at offer and the option for employees to exercise early
  • Health & Wellness: Company-subsidized health, dental, vision, and life insurance; pre-tax Health Savings Accounts with a generous company contribution (even if you don't contribute)
  • Time To Recharge: 4 weeks paid time off (accrued), 12 company holidays, and 3 weeks remote/flexible work per year
  • Support to Parents: Up to 12 weeks of paid parental leave, regardless of your path to parenthood
  • Learning & Development: $1,500 yearly toward your professional development, e.g., conferences, courses, and other learning opportunities
  • Team Connection: Team lunches, quarterly off-sites, and regular town halls
  • Financial Wellbeing: 401K and/or Roth IRA, with 5% company contribution, even if you don't contribute
  • Flexible Spending Accounts: Pre-tax spending accounts for medical, dental/vision, dependent care, parking, and transit expenses
  • Commute On Us: For those commuting up to 1 hour, put your rideshare cost on our company card and reclaim the drive time to get work done
  • MatX E[x]tras: $50 per month to use on the perks you care about most
  • Remote Perks: We work remotely Monday & Friday, supported by home-tech setup and remote Wi-Fi expense reimbursement
