Senior Site Reliability Engineer

Lantern

• $120K — $150K *

Dallas, TX 75217 (Hybrid)

Healthcare

Less than 5 years of experience

Reposted Today


Job Overview by Ladders

Qualifications

Bachelor's degree in Computer Science, IT, Engineering, or related field, or equivalent experience
4+ years in SRE, DevOps, or production operations
3+ years of experience with Microsoft Azure
Strong experience with observability tools (e.g., Datadog, Azure Monitor)
Proven incident management experience with tools like Rootly
Hands-on experience with Infrastructure as Code (Terraform)
Scripting skills in Python, Bash, or PowerShell

Responsibilities

Define and track SLOs/SLIs/error budgets for healthcare services
Build and maintain observability platforms using Datadog and Azure Monitor
Lead incident management processes, including on-call rotations and post-incident reviews
Automate operational tasks through Infrastructure-as-Code and custom tooling
Design disaster recovery and business continuity strategies
Collaborate with development teams to enhance service reliability
Optimize performance, capacity planning, and cost efficiency for Azure infrastructure

Benefits

Medical Insurance
Dental Insurance
Vision Insurance
Short & Long Term Disability
Life Insurance
401k with company match
Flexible Time Off
Paid Parental Leave

Full Job Description

Lantern is seeking an experienced Senior Site Reliability Engineer to champion the reliability, availability, and performance of our Azure-based healthcare platform. In this pivotal role, you will define and implement SRE practices, drive incident management processes, build observability frameworks, and ensure our systems meet stringent uptime and compliance requirements. You will collaborate with platform engineers, application developers, and security teams to embed reliability into every layer of our infrastructure. This role is ideal for an SRE expert with deep experience in production operations, monitoring, incident response, and automation in cloud environments.

You will work on the Platform Engineering team, partnering with application developers, infrastructure engineers, and security teams to establish SRE best practices across Lantern. Your focus will be on building resilience, reducing toil through automation, and creating a culture of reliability that ensures our healthcare platform delivers consistent, high-quality service to our users.

Location: Hybrid, at least 3 days per week in our Dallas, TX offices

On-Call: This position requires being on-call 1 week per month

Responsibilities:

Define and track SLOs/SLIs/error budgets for critical healthcare services
Build and maintain observability platforms (monitoring, logging, alerting, tracing) using Datadog and Azure Monitor
Lead incident management processes using Rootly, including on-call rotations, runbooks, and post-incident reviews
Automate operational toil through Infrastructure-as-Code (Terraform) and custom tooling
Design and implement disaster recovery and business continuity strategies
Collaborate with development teams to improve service reliability through architecture reviews and chaos engineering
Optimize system performance, capacity planning, and cost efficiency for Azure infrastructure
Ensure production systems meet HIPAA, SOC 2, and other regulatory requirements
Maintain and improve CI/CD pipelines to support safe, rapid deployments
Mentor junior engineers and foster a culture of reliability and operational excellence

Requirements:

Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field, or equivalent practical experience
4+ years in SRE, DevOps, or production operations roles
3+ years with Microsoft Azure (AWS/GCP a plus)
Strong experience with observability tools (Datadog, Azure Monitor, Prometheus, Grafana, or similar)
Experience defining and managing SLOs/SLIs and error budgets
Proven incident management and on-call experience (Rootly or similar incident management platforms)
Hands-on with Infrastructure as Code (Terraform) and CI/CD (Azure DevOps, GitHub Actions)
Experience in regulated environments (healthcare/HIPAA preferred)
Strong scripting skills (Python, Bash, PowerShell)
Excellent communication and collaboration skills
If you don't meet every requirement listed, we still encourage you to apply.

Strong Candidates Will Have:

Deep experience with chaos engineering and reliability testing
Experience with Azure Kubernetes Service and containerized workloads
Relevant certifications (Azure, SRE, Kubernetes)

* Ladders Estimates
