Sr. Staff Software Engineer, Systems Infrastructure

$198K - $326K *

Mountain View, CA 94040

In-Person

Information Technology

8 - 10 years of experience

Job Overview by Ladders

Qualifications

BA/BS degree in Computer Science or related field, or equivalent experience
8+ years of software engineering experience, including distributed systems and infrastructure
Experience optimizing large-scale production ML systems or AI infrastructure
Experience with GPU-based systems, CUDA, kernel optimization, or hardware-aware performance tuning
Experience with large-scale inference systems and cost optimization
Proficiency in deep learning frameworks like PyTorch or TensorFlow
Programming experience in one or more systems languages such as C++, Go, Python, or Java

Responsibilities

Lead design and optimization of large-scale LLM serving infrastructure
Drive performance improvements in AI inference systems
Build and scale inference systems for LLMs and AI models
Optimize model execution across architecture, runtime, compiler, and hardware
Implement model optimization techniques including quantization and compression
Enhance GPU efficiency through low-level system work
Collaborate with ML and infrastructure teams to improve model performance
Contribute to open-source LLM serving frameworks
Set technical direction for AI infrastructure design
Mentor engineers and influence technical strategy

Benefits

Flexible hybrid work arrangement
Opportunity to work with cutting-edge AI infrastructure
Contributions to leading open-source technologies
Collaborative work environment with ML and product teams
Potential for an annual performance bonus and stock

Full Job Description

This role will be based in Sunnyvale or Mountain View, CA.

At LinkedIn, our approach to flexible work is centered on trust and optimized for culture, connection, clarity, and the evolving needs of our business. The work location of this role is hybrid, meaning it will be performed both from home and from a LinkedIn office on select days, as determined by the business needs of the team.

LinkedIn's AI Infrastructure organization is responsible for building the foundational platforms that power AI across LinkedIn. The LLM Serving team builds the critical infrastructure that enables efficient, reliable, and large-scale deployment of large language models and other advanced AI models in production.

This team sits at the center of LinkedIn's AI platform, owning the layer between model training and production serving. The work focuses on making large-scale models run faster, cheaper, and more efficiently on GPUs at LinkedIn scale. The team builds and extends high-performance serving infrastructure and contributes to leading open-source technologies such as SGLang, vLLM, and related model serving frameworks.

We are looking for a Senior Staff Software Engineer with deep expertise at the intersection of systems, machine learning, GPU infrastructure, and large-scale inference. This is a highly technical, high-leverage role for someone who enjoys going deep into how models interact with runtimes, compilers, and hardware, and who wants to drive meaningful improvements in performance, cost, latency, and scalability across LinkedIn's AI systems.

Responsibilities

Lead the design, development, and optimization of LinkedIn's large-scale LLM serving infrastructure
Drive performance improvements across AI inference systems, including latency, throughput, GPU utilization, and cost efficiency
Build and scale online and offline inference systems for LLMs and other AI models
Optimize model execution across the full stack, including model architecture, runtime, compiler, kernel, and hardware layers
Drive model optimization techniques such as quantization, pruning, compression, batching, and memory optimization
Improve GPU efficiency through low-level systems work, including kernel-level optimization, runtime tuning, and hardware-aware performance improvements
Partner closely with ML, infrastructure, and product teams to identify serving bottlenecks and improve end-to-end model performance
Contribute to and/or extend open-source LLM serving frameworks such as SGLang, vLLM, Triton, or similar technologies
Set technical direction for model serving, inference performance, and next-generation AI infrastructure design
Mentor engineers and influence technical strategy across AI Infrastructure

Qualifications

Basic Qualifications

BA/BS degree in Computer Science or related technical field, or equivalent practical experience
8+ years of experience in software engineering, distributed systems, infrastructure, or machine learning systems
Experience building or optimizing large-scale production ML systems, model serving platforms, or AI infrastructure
Experience with GPU-based systems, CUDA, kernel optimization, or hardware-aware performance tuning
Experience with large-scale inference systems, including latency, throughput, reliability, and cost optimization
Experience with deep learning frameworks such as PyTorch, TensorFlow, or similar
Experience programming in one or more systems languages such as C++, Go, Python, or Java

Preferred Qualifications

Deep experience with LLM serving infrastructure, AI inference platforms, or large-scale model deployment systems
Familiarity with or contributions to open-source serving frameworks such as vLLM, SGLang, Triton, TensorRT, Ray, or similar technologies
Experience with ML compilers, runtimes, or graph optimization frameworks such as XLA, TVM, TensorRT, Triton, or similar
An understanding of model optimization techniques such as quantization, pruning, compression, batching, caching, and memory optimization
Experience improving GPU utilization and cost/performance efficiency for large-scale ML workloads
Experience building high-performance online or offline inference pipelines
An understanding of distributed systems, scheduling, resource management, and large-scale infrastructure operations
Experience operating across the stack from model-level optimization to runtime, compiler, kernel, and hardware-level performance improvements
Experience influencing technical direction across teams and partnering effectively with ML researchers, infrastructure engineers, and product teams

Suggested Skills

AI/ML Systems and Infrastructure
GPU and Performance Optimization
Model Serving and Inference Systems
Distributed Systems
Technical Leadership

LinkedIn is committed to fair and equitable compensation practices.

The pay range for this role is $198,000 to $326,000. Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to skill set, depth of experience, certifications, and specific work location. The pay range may differ in other locations due to differences in the cost of labor.

The total compensation package for this position may also include an annual performance bonus, stock, benefits, and/or other applicable incentive compensation plans. For more information, visit https://careers.linkedin.com/benefits.

* Ladders Estimates
