Mozilla

Senior Site Reliability Engineer

Mozilla
$108K - $125K *
US-Anywhere
Remote in Canada
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 7+ years in infrastructure, platform engineering, or site reliability roles with production Kubernetes experience
  • Hands-on experience with AWS infrastructure-as-code using Terraform, OpenTofu, or Pulumi
  • Security awareness including identity, least privilege, and secrets hygiene
  • Strong ownership mindset: proactively identifying issues and driving work to completion
  • Excellent async written communication skills for distributed teamwork
  • Ability to collaborate with technical and non-technical stakeholders to enhance platform reliability
  • Willingness to learn and use emerging technologies responsibly

Responsibilities

  • Operate and evolve an EKS-based Kubernetes platform for service migrations and reliability
  • Design and develop CI/CD systems for Thunderbird websites and services
  • Write and maintain infrastructure using Pulumi or Terraform/OpenTofu across AWS accounts
  • Evolve the observability stack and work with engineering teams for monitoring
  • Implement security practices like least-privilege IAM and secrets management
  • Diagnose production incidents and lead root-cause analysis
  • Contribute to documentation and improve team processes

Benefits

  • Fully remote work & schedule flexibility
  • Annual bonus program
  • Monthly remote work stipend
  • Annual professional development stipend
  • 24 days PTO per year plus your birthday and year-end shutdown
  • Health, dental, and vision insurance
  • Paid parental leave and sick days
  • RRSP contributions and life insurance

Full Job Description

The opportunity

The Senior Site Reliability Engineer establishes and maintains the infrastructure and operational systems that Thunderbird users and teams depend on every day. You'll design and develop CI/CD systems for MZLA websites, services, and release workflows, diagnose and debug production incidents, and implement improvements to enhance system reliability. We believe that good infrastructure work is invisible when it's going well and invaluable when it isn't.

This role is for someone who treats production as something to be understood, not just kept running. You write things down, flag problems before they become fires, and leave documentation better than you found it. You bring production instincts, infrastructure-as-code fluency, and security awareness that's baked in, not bolted on.

You'll work closely with Software Development Engineers, team members, and community contributors, reporting to the Sr Manager, Platform Infrastructure. This is a great opportunity for someone who thrives with ambiguity, makes good decisions without a complete picture, and cares about Thunderbird's mission: open-source software used by millions who choose privacy and ownership over convenience.

This role requires consistent overlap with Pacific Time working hours, so that you can collaborate and share context regularly with Pacific Time colleagues.

What you'll do
  • Operate and evolve our EKS-based Kubernetes platform, supporting service migrations, platform improvements, and reliability initiatives.
  • Design and develop CI/CD systems supporting websites, services, and Thunderbird desktop releases, contributing to pipeline reliability and OIDC-based authentication across GitHub Actions workflows.
  • Write and maintain infrastructure in Pulumi and/or Terraform/OpenTofu across multiple AWS accounts.
  • Operate and evolve our observability stack (VictoriaMetrics, VictoriaLogs, Grafana, Vector) and partner with engineering teams to incorporate instrumentation and monitoring into service design.
  • Apply security-conscious infrastructure practices, including least-privilege IAM, secrets management via AWS Secrets Manager and External Secrets Operator, and network segmentation.
  • Diagnose and debug production incidents; drive root-cause analysis and post-incident improvements to prevent recurring problems.
  • Participate in on-call rotation and collaborate with SDEs and fellow SREs to ship, maintain, and monitor new builds and support service onboarding.
  • Contribute to runbooks, architecture documentation, and team processes.

What you bring
  • 7+ years of experience in infrastructure, platform engineering, or site reliability roles, including hands-on production Kubernetes experience in workload operations, troubleshooting, and cluster management.
  • Hands-on experience with infrastructure-as-code on AWS using Terraform, OpenTofu, or Pulumi.
  • Security awareness in day-to-day infrastructure work: identity, least privilege, secrets hygiene, and network controls.
  • Demonstrated ownership mindset with the ability to proactively identify issues, drive work to completion, and communicate risks early.
  • Excellent async written communication skills; comfortable working with a geographically distributed team.
  • Ability to collaborate effectively with software engineers and non-engineering stakeholders to improve platform reliability and operational efficiency.
  • Ability to learn, evaluate, and responsibly use emerging technologies, including AI-enabled tools, to improve work processes.

Bonus points for
  • Experience with GitOps workflows (ArgoCD or Flux).
  • Familiarity with Keycloak or similar identity platforms (OIDC, SAML, federation).
  • Knowledge of email protocols and/or experience operating email infrastructure (SMTP, IMAP).
  • Prior work in or alongside an open-source community.
  • French, German, Japanese, or other language proficiency in addition to English.

What success looks like

You'll be successful in this role if you treat production as something to be understood, not just kept running. You write things down, flag problems before they become fires, and leave documentation better than you found it.

You bring production instincts. You've been paged at 2am, you know what good alerting looks like, and you've done the post-mortem work to make sure it doesn't happen the same way twice. You think in code, not in consoles. Your security awareness is baked in, not bolted on. You default to least privilege and ask "what's the blast radius?" before you ship.

You're comfortable with ambiguity. We're a small team building toward something, and you can make good decisions without a complete picture. Thunderbird is open-source software used by millions who choose privacy and ownership over convenience. That matters to you.

Work environment

This is a full-time, fully remote position. You'll join a distributed team of Thunderbird staff, open-source community members, and contributors from around the world.

We rely on clear communication, thoughtful documentation, and collaborative decision-making to work effectively across time zones and disciplines.

Compensation & benefits

We benchmark our base salaries to local markets and target the 60th percentile of the peer market. The salary ranges for this role are:
  • Canada: $108,000 - $125,000 CAD

We may consider candidates with strong skills but less than the required experience. Title, level and compensation will be determined based on qualifications and experience.

In addition to competitive salaries, we offer a comprehensive benefits package designed to support your whole self.

Work & career
  • Fully remote work & schedule flexibility
  • Company-provided laptop
  • Annual bonus program
  • Monthly remote work stipend
  • Annual professional development stipend
  • Industry conferences
  • Company all-hands and team gatherings

Rest & play
  • 24 days PTO per year (prorated)
  • Your birthday
  • Year-end company shutdown
  • 9 wellbeing days
  • Public holidays
  • Other paid leave
  • Quarterly wellbeing stipend for personal / family activities

Health & family
  • RRSP contributions
  • Health, dental, & vision insurance
  • Disability insurance
  • Life insurance
  • Employee assistance program
  • Paid parental leave
  • Paid sick days

Work eligibility

Applicants must reside in and have permanent work authorization for the country location(s) specified in the posting. We are unable to consider applicants outside of these markets at this time, and we do not provide visa sponsorship.

How to apply

Please apply directly through our career page. We carefully review every cover letter and screening question, so take the time to answer each one fully. We value authentic, thoughtful responses that reflect your own experience and perspective. It is fine to use AI tools to polish your writing, but your answers should be your own. Candidates who submit generic or unoriginal AI-generated responses may be disqualified from further consideration.

About Mozilla

Mozilla is a global community of technologists, thinkers, and builders working together to keep the internet open and accessible to all. The company is best known for its flagship product, the Firefox web browser, which is used by millions of people around the world. In addition to its browser, Mozilla also develops a range of other products and services, including a mobile operating system, a password manager, and a virtual private network (VPN) service.
Size: 1,000 employees
Founded: 1998
