Senior Site Reliability Engineer

Block, Inc

$189K - $283K
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 5+ years of software development experience
  • Strong incident management skills
  • Experience running production oncall for high-availability systems
  • Familiarity with AI-driven tooling for observability
  • Fluency with CI/CD pipelines and rollback automation
  • Monitoring and observability expertise using various technologies
  • Comfort with vendor management and escalation contacts

Responsibilities

  • Build and extend platforms to improve system reliability
  • Drive platform-wide reliability improvements
  • Standardize reliability tools across multiple platforms
  • Lead stabilization of sev 0-1 incidents
  • Serve as primary oncall for critical services
  • Use AI to enhance observability and incident analysis
  • Design safe deployment patterns for system changes

Benefits

  • Comprehensive healthcare coverage (Medical, Vision, Dental)
  • Retirement plans with company match
  • Employee Stock Purchase Program
  • Wellness programs including mental health support
  • Paid parental and caregiving leave
  • Flexible time off including 12 paid holidays
  • Learning and development resources
  • Life, AD&D, and disability insurance
  • Health Savings Account and Flexible Spending Account

Full Job Description

The Role

As a member of the SRE team, you will proactively and reactively improve the reliability of Block's platform and critical infrastructure. You are metrics-driven, systems-oriented, and focused on building distributed platforms that enable safe, scalable product development.

You will leverage and continuously improve AI-driven tooling and automation to enhance observability, accelerate incident detection and response, and reduce operational toil. This includes applying AI to incident analysis, alert tuning, and operational workflows.

You will participate in primary platform oncall (12 hours per day, one week every few weeks, depending on team size), supporting Block's most critical (Tier 0) services. In this role, you will lead incident command, coordinate mitigation, and drive effective escalation during high-severity events.

You Will
  • Build and extend platforms to improve system reliability
  • Work on team goals that encompass reliability for the entire company
  • Standardize reliability tools across multiple platforms and organizations
  • Triage, coordinate, and lead stabilization of sev 0-1 incidents
  • Serve as primary oncall, maintaining structured escalation paths and exercising leadership escalation
  • Drive platform-wide reliability improvements, shared operational tooling, and deploy-safety patterns
  • Use AI-driven systems to improve signal detection, reduce noise, and accelerate root cause analysis
  • Design and implement safe deployment patterns (progressive delivery, automated rollback, guardrails)

You Have
  • Drive to find the root cause of problems in systems with many moving parts and take the necessary steps to fix them
  • Demonstrated technical initiative and leadership on previous projects, especially those with a backend/platform focus
  • Familiarity with AI-driven tooling for observability, incident analysis, or automation
  • A mindset that naturally reaches for AI to accelerate problem-solving and reduce toil
  • Experience running production oncall for high-availability systems
  • Strong incident management skills - structured triage, mitigation under pressure, blameless postmortems
  • Fluency with CI/CD pipelines, progressive rollout strategies, and rollback automation
  • Monitoring & observability expertise - building/tuning alerts for uptime, error rates, latency regression, and resource exhaustion
  • Ability to create and maintain evidence-based maturity assessments using trailing 90-day data windows
  • Comfort with vendor/dependency management - maintaining validated escalation contacts reachable within 5 minutes
  • Boundless curiosity, autonomy, and a strong sense of accountability
  • A strong desire to perform and grow as an engineer
  • 5+ years of software development experience

Technologies We Use and Teach
  • Kotlin, Modern Java (11+)
  • HTTP, JSON, gRPC, and Protocol Buffers
  • MySQL / Vitess / DynamoDB
  • Event driven architectures
  • Datadog
  • LaunchDarkly
  • Terraform, Kubernetes, Istio/Envoy
  • Amazon Web Services

This program shifts Block from reactive incident handling to repeatable, system-wide reliability gains - fewer customer-visible incidents, faster response, higher product velocity, and lower burnout across the organization.

Full-time employee benefits include the following:
  • Healthcare coverage (Medical, Vision and Dental insurance)
  • Health Savings Account and Flexible Spending Account
  • Retirement Plans including company match
  • Employee Stock Purchase Program
  • Wellness programs, including access to mental health support, 1:1 financial planners, and a monthly wellness allowance
  • Paid parental and caregiving leave
  • Paid time off (including 12 paid holidays)
  • Paid sick leave (for non-exempt employees, 1 hour per 26 hours worked, up to 80 hours per calendar year to the extent legally permissible; for exempt employees, covered by our Flexible Time Off policy)
  • Learning and Development resources
  • Paid Life insurance, AD&D, and disability benefits

These benefits are further detailed in Block's policies. This role is also eligible to participate in Block's equity plan subject to the terms of the applicable plans and policies, and may be eligible for a sign-on bonus. Sales roles may be eligible to participate in a commission plan subject to the terms of the applicable plans and policies. Pay and benefits are subject to change at any time, consistent with the terms of any applicable compensation or benefit plans.

Block takes a market-based approach to pay, and pay may vary depending on your location. U.S. locations are categorized into one of four zones based on a cost of labor index for that geographic area. The successful candidate's starting pay will be determined based on job-related skills, experience, qualifications, work location, and market conditions. These ranges may be modified in the future.

Zone A: USD $189,000 - USD $283,600
Zone B: USD $179,600 - USD $269,400
Zone C: USD $170,100 - USD $255,100
Zone D: USD $160,700 - USD $241,100

Application Guidelines

Candidates may submit up to 9 active applications within a 60-day period. Reapplications to the same role are accepted 90 days after a previous application has been reviewed.

Use of AI in Our Hiring Process

We may use automated AI tools to evaluate job applications for efficiency and consistency. These tools comply with local regulations, including bias audits, and we handle all personal data in accordance with state and local privacy laws.

Contact us here with hiring practice or data usage questions.

Every benefit we offer is designed with one goal: empowering you to do the best work of your career while building the life you want. Remote work, medical insurance, flexible time off, retirement savings plans, and modern family planning are just some of our offerings. Check out our other benefits at Block.
