MLOps Engineer

Stanford Health Care

$165K - $218K *

Newark, CA 94560

Hybrid

Information Technology

Less than 5 years of experience

Today

Job Overview by Ladders

Qualifications

Bachelor's or higher degree in Computer Science, Engineering or a related field.
Three or more years of directly related experience as an MLOps Engineer.
Strong knowledge of cloud platforms (AWS, Azure, Google Cloud) and infrastructure-as-code tools (Terraform, CloudFormation).
Proficiency in Docker and Kubernetes for container orchestration.
Solid programming skills in Python, Rust or Go, with experience in scripting and automation.
Familiarity with machine learning frameworks (PyTorch, TensorFlow, scikit-learn).
Deep understanding of DevOps principles, agile methodologies, and the software development lifecycle.

Responsibilities

Design, build and maintain scalable infrastructure for AI/ML systems.
Develop and implement CI/CD pipelines for AI/ML models and applications.
Collaborate with cross-disciplinary teams to optimize model training and deployment.
Monitor and troubleshoot AI/ML systems for performance and reliability.
Maintain training and inference pipelines across multi-cloud environments.
Manage Kubernetes pods and model registries.
Implement security best practices in AI/ML workflows.

Benefits

Mentorship and technical guidance opportunities for junior team members.
Access to cutting-edge AI and ML technologies in healthcare.
Collaboration with leaders in clinical specialties and AI research.
Opportunities to shape the infrastructure for innovative healthcare solutions.

Full Job Description

Day - 08 Hour (United States of America)

We are seeking a high-caliber Senior AI Platform & MLOps Engineer to architect the "layered" infrastructure required for autonomous, agentic systems within Stanford Health Care. In this role, you will be the "Master Chef" of our AI ecosystem, seamlessly folding expert-level DevOps (Kubernetes, Terraform, DevOps orchestration) together with agentic application development (LangGraph, CrewAI, tool-calling logic). You won't just manage servers; you will build the robust, full-stack "factory" where multi-agent frameworks interact with healthcare APIs, ensuring every autonomous action is governed by strict MLOps observability (LangSmith, Arize) and safety guardrails. If you have the "crispy" coding skills to build RAG pipelines in Python, the "rich" architectural depth to deploy scalable microservices, and extensive full-stack software development expertise, we want you to lead the integration of reasoning-based AI into the future of clinical and business workflow automation.

This is a Stanford Health Care job.

A Brief Overview
The MLOps Engineer will play an integral role in incorporating Artificial Intelligence (AI) within Stanford Health Care. The solutions will impact patient care, medical research, and operational services. This group is tasked with innovating, building, deploying and monitoring production-grade AI, machine learning (ML) and predictive algorithms in healthcare. The role will partner closely with lead researchers in the AI field and leaders across various clinical specialties and operations.

This role will report to the Infrastructure group and have a dotted-line relationship to the Data Science team. The role will be responsible for maintaining cloud-based infrastructure-as-code repositories, infrastructure and deployment pipelines, and for designing the security landscape for the team and objects. The role will set the standards for the full SDLC of projects for the Data Science team.

Locations
Stanford Health Care

What you will do

Design, build and maintain scalable and robust infrastructure for AI/ML systems, including cloud-based environments, containerization and orchestration platforms.
Develop and implement CI/CD pipelines to automate the deployment, testing and monitoring of AI/ML models and applications.
Collaborate with data scientists, data engineers and software engineers to optimize model training, deployment and inference pipelines.
Monitor and troubleshoot AI/ML systems to ensure high availability, performance and reliability.
Maintain and monitor model training and inference pipelines across multi-cloud tenants, especially for Large Language Models (LLMs).
Maintain Kubernetes pods, the container registry, the virtual machine image library and the model registry.
Monitor infrastructure utilization and costs pertaining to model training, inference and GPU utilization.
Implement best practices for security, data privacy and compliance in AI/ML workflows and infrastructure.
Evaluate and integrate new tools, technologies and frameworks to improve the efficiency and effectiveness of our MLOps processes.
Mentor and provide technical guidance to junior members of the organization.
Stay up to date with the latest advancements and trends in MLOps, DevOps and cloud technologies and share them with the team.

Education Qualifications

Bachelor's or higher degree in Computer Science, Engineering or a related field

Experience Qualifications

Three (3) or more years of directly related experience

Required Knowledge, Skills and Abilities

Proven experience as an MLOps Engineer.
Strong knowledge of cloud platforms such as AWS, Azure or Google Cloud and experience with infrastructure-as-code tools like Terraform or CloudFormation.
Proficiency in containerization technologies such as Docker and container orchestration platforms like Kubernetes.
Experience with CI/CD tools such as GitLab CI/CD, GitHub Actions or CircleCI.
Solid programming skills in languages such as Python, Rust or Go and experience in scripting and automation.
Familiarity with machine learning frameworks and libraries such as PyTorch, TensorFlow and scikit-learn.
Deep understanding of DevOps principles, agile methodologies and the software development lifecycle.
Strong problem-solving and troubleshooting skills, with the ability to analyze and resolve complex technical issues.
Excellent communication and collaboration skills with the ability to work effectively in cross-functional teams.

Physical Demands and Work Conditions
Blood Borne Pathogens

Category III - Tasks that involve NO exposure to blood, body fluids or tissues, and Category I tasks that are not a condition of employment

These principles apply to ALL employees:

SHC Commitment to Providing an Exceptional Patient & Family Experience

Stanford Health Care sets a high standard for delivering value and an exceptional experience for our patients and families. Candidates for employment and existing employees must adopt and execute C-I-CARE standards for all patients and families and towards each other. C-I-CARE is the foundation of Stanford's patient experience and represents a framework for patient-centered interactions. Simply put, we do what it takes to enable and empower patients and families to focus on health, healing and recovery.

You will do this by executing against our three experience pillars, from the patient and family's perspective:

Know Me: Anticipate my needs and status to deliver effective care
Show Me the Way: Guide and prompt my actions to arrive at better outcomes and better health
Coordinate for Me: Own the complexity of my care through coordination

Base Pay Scale: Generally starting at $79.21 - $104.97 per hour

The salary of the finalist selected for this role will be set based on a variety of factors, including but not limited to, internal equity, experience, education, specialty and training. This pay scale is not a promise of a particular wage.

* Ladders Estimates
