Sanity

Senior Site Reliability Engineer

Sanity
$120K - $160K *
US-Anywhere
Remote in United States
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 5+ years of experience in an SRE on-call rotation
  • Proficient with SRE/DevOps tools, processes, and culture
  • Strong analytical skills for infrastructure design and optimization
  • Experience managing scalable, cloud-based applications
  • Familiarity with Kubernetes for orchestration and management of containerized apps
  • Expertise in building CI/CD pipelines
  • Knowledgeable about observability stacks such as Prometheus
  • Experience with CDNs, edge, gateways, and caching layers

Responsibilities

  • Design and operate foundational platform infrastructure for engineers
  • Diagnose and troubleshoot complex distributed systems
  • Ensure platform observability and behavior analysis
  • Contribute to modernizing edge and caching layers
  • Improve reliability through better monitoring and incident response
  • Create automation for efficient deployment and production readiness
  • Mentor engineers through code and design reviews
  • Participate in on-call rotation to support developers

Benefits

  • Supportive and skilled team environment
  • Real-world infrastructure scale and impactful work
  • Flexible and trust-based workplace culture
  • Diverse global team and customer base
  • Comprehensive health plans and perks
  • Good work-life balance
  • Competitive stock options and salary

Full Job Description

What you would do:
  • Design, build, and operate the shared platform foundations engineers ship on every day: GCP infrastructure, Kubernetes, networking, routing, CI/CD, and observability.
  • Diagnose and troubleshoot complex distributed systems running at high request volume.
  • Ensure observability and analyze the behavior of our stack.
  • Contribute to in-flight work like modernizing our edge, caching, and gateway layers onto Fastly and tightening observability across the platform.
  • Raise the reliability bar through better dashboards, alert severity, paging standards, on-call readiness, and incident response.
  • Make deployment boring in the best way: build golden paths, production readiness checks, safe rollouts, and useful automation so engineers have fewer places to look before they ship.
  • Mentor engineers and raise the technical bar through code review, design review, and pairing.
  • Participate in our on-call rotation and help our developer on-call rollout land well.


About you:
  • Based in the United States, with reasonable overlap with European engineering hours.
  • Experience with SRE/DevOps tools, processes, and culture.
  • 5+ years of experience as part of an SRE on-call rotation.
  • Analytical approach to designing, diagnosing, and optimizing infrastructure.
  • Experience with managing scalable, highly available, cloud-based applications, ideally with high request volume and customer-facing uptime expectations.
  • Experience with Kubernetes for orchestrating, scaling, and managing containerized applications in cloud-based environments.
  • Experience building CI/CD pipelines.
  • Experience with an observability stack (Prometheus or similar).
  • Comfortable working across CDNs, edge, gateways, and caching layers, or eager to go deep there.
  • You improve on-call and reliability by building systems, standards, and feedback loops that make production healthier over time.
  • You are comfortable dealing with incidents and outages and have built a practical, thoughtful communication style for handling high-pressure situations.
  • An open but considered approach to new technologies.


There are many roads leading up to being an SRE. Our team is already a mix of self-taught and formally educated people. Don't self-select out!

What we can offer:
  • A highly skilled, inspiring, and supportive team
  • Real infrastructure scale and meaningful, hands-on work changing how it runs
  • Positive, flexible, and trust-based work environment that encourages long-term professional and personal growth
  • A global, culturally diverse group of colleagues and customers
  • Comprehensive health plans and perks
  • A healthy work-life balance that accommodates individual and family needs
  • Competitive stock options program and location-based salary
