Software Engineer, ML Serving

Rime Labs

$130K - $180K *

San Francisco, CA 94112

In-Person

Information Technology

Less than 5 years of experience

Today

Job Overview by Ladders

Qualifications

Hands-on experience with real-time multinode ML serving infrastructure.
Proficient in ML serving frameworks like NVIDIA Dynamo/Triton or equivalent.
Strong understanding of distributed model serving techniques.
Solid background in cloud infrastructure including Linux, Docker, and Kubernetes.
Experience with IaC tools like Terraform or Packer.

Responsibilities

Architect and implement TTS serving infrastructure supporting GPU-backed inference.
Optimize models for both single-node and distributed fleet serving.
Ensure compatibility with various NVIDIA hardware for cloud and on-prem solutions.
Develop CI/CD workflows for the model serving pipeline.
Maintain site reliability through monitoring and observability practices.
Manage resources and costs across the GPU fleet.

Benefits

Opportunity to build infrastructure for a leading voice AI company.
Collaborate directly with the inference, platform, and ML teams, with no handoff culture.
Significant impact on customer experience through the systems you design.
Ownership and influence over company direction and technology vision.
Equity options offered at an early-stage company.
Flexible work environment with minimal bureaucracy.
Located in the SF / Bay Area, a hub for tech innovation.

Full Job Description

Role Overview

We're hiring a Software Engineer to own the serving infrastructure that connects Rime's inference engines to the world. This role sits at the intersection of ML systems and cloud infrastructure: you'll work directly on model inference to build, harden, and scale the systems that stream voice at real-time latency. As Rime moves toward its next-generation architecture, you'll be a core architect of how our models get served.

What You'll Own

Architecture and implementation of Rime's TTS serving infrastructure, from GPU-backed inference engines to the API surface.
Model optimization from single-node to disaggregated fleet serving.
Compatibility with different NVIDIA hardware generations, from Hopper to Blackwell and beyond, for on-prem and cloud deployments.
Continuous integration and deployment workflows for the model serving pipeline.
Site reliability: on-call rotation, monitoring, alerting, and observability across the serving stack.
Resource provisioning and cost management across our GPU fleet.

What We're Looking For

Hands-on experience with real-time multinode ML serving infrastructure.
ML serving framework experience: NVIDIA Dynamo/Triton, vLLM, SGLang, or equivalent.
Experience with distributed or disaggregated model serving (tensor parallelism, pipeline parallelism, or equivalent).
Strong cloud infrastructure fundamentals: Linux internals, networking, containerization (Docker, Kubernetes).
IaC experience - Terraform, Packer, or comparable. You should have opinions about how to do this right.
On-call is part of the job. You treat production reliability as a shared responsibility.

Nice to Have

Experience with multinode training (DDP, FSDP, etc.).
Experience with gRPC or other bidirectional binary streaming protocols.
Experience with audio streaming and related technologies (WebRTC, WebSockets, etc.).
Experience with a multilingual monorepo where you pick the best language on merit rather than personal familiarity.
Experience with multi-cloud infrastructure (AWS, GCP, OCI, etc.).
Comfort with configuration management tooling (Ansible, Chef, Puppet, or similar).
SRE, DevOps, or platform engineering background at a startup.
Experience at an early-stage company.

Why Join Rime

Build the serving infrastructure behind a category-defining voice AI company from the ground up.
You will bring experience that no one else at the company currently has, and you can help us set the vision.
Direct collaboration with the inference, platform, and ML teams - no handoff culture.
The systems you build determine what experiences our customers can deploy at scale.
Meaningful equity upside at an early stage.
High ownership, high standards, low bureaucracy.
SF / Bay Area.

* Ladders Estimates
