Staff Software Engineer (AI Infrastructure)

DEEPREC.AI

• $130K — $180K *

Palo Alto, CA 94303

In-Person

Enterprise Technology

5 - 7 years of experience

Today

Job Overview by Ladders

Qualifications

5 years of experience in software engineering focusing on systems infrastructure.
Deep knowledge of distributed systems and cloud-native infrastructure (AWS/GCP/Azure).
Proven expertise in building and optimizing APIs for large-scale AI model serving.
Familiarity with managing high-throughput GPU resources and scheduling.
Proficiency in backend programming languages like Python, Go, or C++.
Strong problem-solving skills and a proactive ownership mentality.
Exceptional communication and mentorship abilities.

Responsibilities

Design and maintain scalable GPU infrastructure for AI models.
Architect and optimize APIs for efficient AI model serving.
Lead orchestration and scheduling of GPU resources across clusters.
Support robust systems for model deployment and monitoring in production.
Collaborate with cross-functional teams to enhance AI-powered features.
Drive technical direction and conduct code reviews for the team.

Benefits

Opportunity to shape the technical vision for AI infrastructure.
Mentorship role to guide and develop junior engineers.
Direct contribution to user-facing generative AI experiences.
Cross-team collaboration with ML, backend, and platform engineering.
High-impact position within a rapidly growing and innovative company.

Full Job Description

Staff/Lead Software Engineer, AI Infrastructure

About the Role

This is a critical hire to build and scale the infrastructure behind the company's AI capabilities. You'll lead the design and implementation of GPU infrastructure and AI model serving APIs, and drive execution across AI infrastructure more broadly, enabling the machine learning features that drive the product.

You'll architect robust, distributed systems optimized for high-performance AI workloads, large-scale GPU orchestration, and low-latency, reliable API serving. Your work will directly shape how users experience generative AI at scale. As a senior technical leader, you'll also mentor engineers, drive best practices, and set the technical vision for AI infrastructure.

What You'll Do

Design, develop, and maintain scalable GPU infrastructure for training and serving state-of-the-art AI models.
Architect and optimize high-throughput, low-latency APIs for AI model serving and inference.
Lead the orchestration, scheduling, and efficient utilization of heterogeneous GPU resources across clusters.
Build and support robust systems for model deployment, monitoring, scaling, and reliability in production.
Collaborate with ML, backend, and platform engineering teams to deliver seamless AI-powered product features.
Drive technical direction, code reviews, and mentorship across the AI Infrastructure team.

What We're Looking For

5 years as a software engineer working on systems infrastructure, including hands-on ML serving and GPU orchestration.
Deep knowledge of distributed systems, Kubernetes (or similar orchestration frameworks), and cloud-native infrastructure (AWS/GCP/Azure).
Proven expertise building and optimizing APIs for large-scale AI model serving (TensorFlow Serving, Triton, TorchServe, or similar).
Familiarity with the challenges of high-throughput, scalable GPU fleet management, scheduling, and efficient model execution.
Proficiency in backend languages such as Python, Go, or C++, with experience optimizing for performance and reliability.
Ownership mentality and the drive to solve complex problems independently in ambiguous, high-growth environments.
Excellent communication, collaboration, and mentorship skills.

Nice to Have

Experience with multi-modal AI model infrastructure (LLMs, generative models, video/image/speech models).
Background building infra for multi-tenant SaaS, enterprise AI/ML platforms, or operational automation at scale.
Previous startup experience, or a track record leading high-impact projects through ambiguity and rapid iteration.
Experience with competitive coding or large-scale distributed computing environments.

* Ladders Estimates
