Staff, Site Reliability Engineer (SRE)

Sprinter Health

$130K — $180K *
Healthcare
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • 8+ years of experience in site reliability, platform, or security engineering roles
  • Proven ability to lead infrastructure and security projects with minimal oversight
  • Expertise in production systems within cloud environments, especially AWS and GCP
  • Strong experience with infrastructure as code, preferably using Terraform
  • Demonstrated improvement of observability and incident response processes
  • Skilled in automating workflows using scripting languages like Python and Bash
  • Familiarity with cloud security practices and troubleshooting production issues

Responsibilities

  • Design and optimize the infrastructure supporting healthcare logistics and patient care
  • Enhance reliability in cloud infrastructure and incident response practices
  • Establish and maintain a secure cloud environment and operational workflows
  • Develop and implement infrastructure as code with Terraform and similar tools
  • Automate operational processes to reduce manual workload
  • Collaborate with engineering teams on architecture and monitoring improvements
  • Guide troubleshooting efforts across application and infrastructure boundaries

Benefits

  • Meaningful pre-IPO equity
  • 100% medical, dental, and vision coverage for you and dependents
  • Flexible PTO plus 10 paid holidays annually
  • 401(k) plan with employer match
  • 16-week parental leave for birthing parents, 8 weeks for others
  • Contributions for HSA and FSA
  • Life insurance and disability coverage
  • Complimentary daily lunch in the office
  • Annual stipend for professional development
  • Relocation assistance if needed

Full Job Description

About the Role

We're looking for a Staff Site Reliability Engineer who wants to build the reliability, infrastructure, and security foundations that power last-mile healthcare delivery at scale.

At Sprinter, you'll work on the operational backbone behind products that blend logistics, patient experience, safety, and medical operations. Our systems help determine whether patients get access to care, whether clinicians are routed efficiently, whether internal teams can operate effectively, and whether our platform can scale securely and reliably as the business grows.

This role is ideal for someone who wants broad ownership across reliability, cloud infrastructure, security, observability, automation, and platform design. You'll help raise the operational bar across engineering, reduce toil through infrastructure as code and scripting, strengthen our security posture, and guide architectural decisions that make our systems more resilient over time.

If you want to make meaningful technical decisions, work across engineering and operations, and help shape the foundation of how a high-growth healthcare company scales, this is that role.

Office Location

We are a hybrid company based in the Bay Area with offices in both San Francisco and Menlo Park. For this requisition, we are open to remote candidates but will prioritize candidates who are local. We care about work-life balance and understand that there will be times when flexibility is needed.

What you will do

  • Design, build, and improve the infrastructure that powers Sprinter's patient care, clinician operations, internal tooling, and partner-facing systems
  • Improve reliability across distributed systems, cloud infrastructure, CI/CD, observability, and incident response
  • Raise the security baseline across cloud infrastructure, access controls, secrets management, identity, and operational workflows
  • Build and maintain infrastructure as code using Terraform and related tooling
  • Automate manual infrastructure and operational processes through scripting, tooling, and platform improvements
  • Partner with engineering teams to improve system architecture, deployment practices, monitoring, logging, and alerting
  • Troubleshoot complex issues across infrastructure, application, data, and operational boundaries
  • Help define reliability, security, and infrastructure standards that allow Sprinter to scale without creating brittle systems
  • Support incident response practices, postmortems, operational readiness, and continuous improvement across engineering
  • Make pragmatic tradeoffs between reliability, security, speed, and simplicity in a fast-moving startup environment

What you have done

  • Spent 8+ years in site reliability engineering, platform engineering, infrastructure engineering, security engineering, or related technical roles
  • Led high-impact infrastructure, reliability, platform, or security projects end to end with minimal oversight
  • Built and operated production systems in cloud environments, ideally AWS and/or GCP
  • Worked deeply with infrastructure as code, ideally Terraform
  • Improved observability, monitoring, logging, alerting, and incident response practices across engineering teams
  • Automated infrastructure, deployment, or operational workflows using scripting languages such as Python, Bash, or TypeScript
  • Improved cloud security, access management, secrets management, networking, or operational controls
  • Troubleshot production issues across application, infrastructure, networking, and deployment layers
  • Worked in environments where reliability, security, ambiguity, and speed all matter
  • Made technical decisions that balanced immediate business needs with long-term scalability, reliability, and maintainability

What gives you an edge

  • You've built or scaled infrastructure in health tech, logistics, marketplace, fintech, or other operationally complex environments
  • You've worked in mid- or growth-stage startups where speed, ambiguity, and pragmatic decision-making were required
  • You have experience improving security posture in a practical, engineering-friendly way
  • You've helped establish reliability standards, incident response practices, or platform patterns across an engineering org
  • You're comfortable working directly with product engineers, data teams, operations, security stakeholders, and technical leadership
  • You have experience mentoring engineers and raising the operational bar across a broader engineering team
  • You've worked in regulated environments and understand the importance of privacy, security, and compliance best practices
  • You have people management experience or interest in growing into broader technical leadership over time

The Interview Process

We aim to complete the interview process within 2-3 weeks. It will usually consist of:
  • Recruiter Screen: Background fit, motivation, and compensation alignment
  • Hiring Manager Interview: Experience and technical depth
  • Technical Interview: SRE fundamentals, observability, incident response, and disaster recovery
  • Soft Skills Interview: Collaboration style and compatibility with the teams this person will support
  • Reference Checks: Validation of performance and working style

What we offer

  • Meaningful pre-IPO equity
  • Medical, dental, and vision plans 100% paid for you and your dependents
  • Flexible PTO + 10 paid holidays per year
  • 401(k) with match
  • 16-week parental leave for the birthing parent, 8 weeks for all other parents
  • HSA + FSA contributions
  • Life insurance, plus short- and long-term disability coverage
  • Free daily lunch in-office
  • Annual learning stipend
  • Relocation assistance

Our Technology Stack

  • Terraform and infrastructure-as-code tooling
  • AWS
  • GCP
  • TypeScript
  • Python
  • Bash
  • CI/CD systems
  • Monitoring, logging, and observability platforms
  • Identity, access, and secrets management systems
  • Cloud networking and infrastructure tooling
  • Container and deployment systems
  • Serverless AWS, including AppSync, DynamoDB, Lambda, Amplify, CloudFormation, and Node.js
  • GraphQL
  • React Native and React Native for Web
