Pharmavite

Site Reliability Engineer II

Pharmavite | $148K - $222K
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 5+ years of experience in Site Reliability Engineering or related fields.
  • Hands-on experience with Kubernetes for deploying and managing workloads.
  • Proficient in Terraform for infrastructure as code, ideally with reusable modules.
  • Solid understanding of CI/CD automation; experience with GitHub Actions preferred.
  • Practical knowledge of networking concepts relevant to cloud environments.
  • Experience with observability tools and concepts for effective incident response.
  • Familiarity with relational databases; PostgreSQL experience is a plus.

Responsibilities

  • Evolve AWS infrastructure using Terraform with an emphasis on reusable patterns.
  • Manage and optimize Kubernetes workloads for reliability and performance.
  • Develop and maintain CI/CD pipelines for efficient code deployment.
  • Collaborate with product teams to enable self-service infrastructure access.
  • Debug production issues across the stack, from networking, DNS, and certificates to CI pipelines and application-level behaviour.
  • Implement observability standards and instrumentation to monitor services.
  • Contribute to the creation of automation and runbooks to streamline operations.

Benefits

  • Flexible hybrid working model for better work-life balance.
  • Encouragement of in-person collaboration at least twice weekly.
  • Culture built on collaboration, innovation, and performance.
  • Opportunities for team learning and community engagement.
Full Job Description
Senior Site Reliability Engineer

The Team

You will join our Site Reliability Engineering function within the Governance Compliance and Insights (GCI) pillar, a team that sits at the centre of how Mimecast builds, ships, and operates its cloud platform. We do three things, and we do them at scale:
  • Cloud infrastructure - design, provision, and evolve the AWS foundations our product teams build on.
  • CI/CD - build and maintain the pipelines that move code safely and frequently from commit to production.
  • Observability - provide the logs, metrics, and traces that make our platform debuggable and trustworthy.


We hold a few beliefs about how that work should be done:
  • Continuous deployment. Smaller chunks of work, shipped more frequently, with the safety rails to make that the default - not the exception.
  • Builders own what they build. We work relentlessly to enable product teams to safely and auditably access their own infrastructure, databases, and pipelines. Our job is to help teams help themselves, not to become a bottleneck.
  • Process and standardization are paramount. We avoid one-offs and special cases at almost any upfront cost, because repeatability compounds.


AI-First Engineering at Mimecast

Mimecast is an AI-First engineering organization. Our teams actively leverage AI-powered development tools across all facets of engineering, from code development to testing, documentation, and operations. We're looking for leaders who don't just use AI tools but champion their adoption and establish new ways of working.

Our AI leadership extends beyond how we build to what we build. Our Mihra AI agent delivers 7x faster threat response for customers, and we're recognized as "Agents of Change" in Human Risk Management. Engineers here work at the intersection of cutting-edge AI tooling and AI-powered security products that protect organizations worldwide.

The Role

As a Site Reliability Engineer, you will be a hands-on engineer who stands up infrastructure, debugs it when it misbehaves, and turns both activities into repeatable, self-service patterns for the engineering teams we support. The ability to provision infrastructure and debug complex distributed systems is our highest priority for this role. CI/CD expertise sits close behind. Observability knowledge is useful, but that work is increasingly centralized via shared platforms.

You will work in a small, focused team and in close partnership with the product engineering squads you enable. Expect to rotate between greenfield infrastructure build-outs, pipeline and tooling improvements, and the occasional deep incident investigation.

What You'll Do:
  • Stand up and evolve AWS infrastructure using Terraform, with a strong bias toward reusable modules and paved-road patterns over bespoke solutions.
  • Operate and improve Kubernetes-based workloads - deployments, scaling, networking, and the platform glue that makes them boring to run.
  • Build and maintain CI/CD pipelines (GitHub Actions preferred) that give engineering teams fast, safe, auditable paths to production.
  • Partner with product squads to enable self-service access to their own infrastructure, databases, and pipelines - with the guardrails, auditability, and standards that make that access safe.
  • Debug production issues across the stack: networking, DNS, certificates, container orchestration, CI pipelines, and application-level behaviour.
  • Instrument services with appropriate logs, metrics, and traces, and help teams adopt the observability standards our platform teams define.
  • Contribute to runbooks, automation, and standards that reduce toil and one-off work - if you fix it twice, codify it the third time.
  • Use AI tooling pragmatically and at a professional level: to accelerate code generation, infrastructure design, debugging, documentation, and review. We expect engineers to be AI-first in how they approach their craft.


What You'll Bring:

Core Technical Skills
  • Hands-on Kubernetes experience - you can deploy, operate, and debug workloads on Kubernetes. You do not need to be an expert, but you should be comfortable in it.
  • Terraform experience - you have written and maintained non-trivial Terraform. Again, expertise is not required; pragmatic competence is.
  • Familiarity with setting up CI and deployment automation. Experience with GitHub Actions is strongly preferred; experience with Jenkins, GitLab CI, or similar is transferable.
  • A working understanding of observability fundamentals - logs, metrics, and traces - and how they are used during incident response.
  • Networking knowledge at a practical level: security groups, TLS certificates, DNS, load balancing, and how traffic actually flows through a cloud environment.
  • Experience working with relational databases is helpful - PostgreSQL is ideal but not required.
  • Experience with AWS or similar cloud platforms.

AI-First Engineering

Mimecast is an AI-first engineering organization. We expect every engineer to use modern AI tools (coding assistants, reasoning models, agentic tools) as a core part of their daily workflow - for design, implementation, debugging, review, and documentation. For this role specifically, we are looking for:
  • Demonstrable experience prompting and working with AI coding and reasoning tools to produce real, shipped work.
  • Good judgement about when AI output is trustworthy and when it needs verification - particularly for infrastructure code and production-impacting changes.
  • An interest in applying AI tooling to SRE problems: incident triage, runbook generation, log analysis, Terraform authoring, and similar.

Ways of Working
  • You collaborate well across teams - you can work with product engineers, platform teams, and security peers without friction.
  • You have a bias for action and problem-solving, and you prefer shipping small, frequent improvements over big-bang changes.
  • You are comfortable saying "let's standardize this" rather than building a one-off, even when the one-off would be faster today.
  • You communicate clearly in writing - our teams are distributed, and async clarity matters.


Our Hybrid Model:

We provide you with the flexibility to live balanced, healthy lives through our hybrid working model that champions both collaborative teamwork and individual flexibility. Employees are expected to come to the office at least two days per week, because working together in person:
  • Fosters a culture of collaboration, communication, performance and learning.
  • Drives innovation and creativity within and between teams.
  • Introduces employees to priorities outside of their immediate realm.
  • Builds important interpersonal relationships and connections with one another and our community.


The base salary range for this position is $148,000-$222,000 plus benefits. This range represents the minimum and maximum new hire compensation for this role. The position may also be eligible for incentive plans and additional benefits, in accordance with company policy and local regulations. Our salary ranges are determined by role, level, and location with individual compensation also dependent on factors such as qualifications, experience, and skills. Final offers will reflect these considerations and may vary accordingly.


About Pharmavite

Pharmavite is a dietary supplements company that produces vitamins, minerals, and other nutritional supplements. The company was founded in 1971 and is headquartered in Northridge, California. Pharmavite's products are sold under several brand names, including Nature Made, a leading brand of vitamins and supplements in the United States. The company is committed to sustainability and has implemented several initiatives to reduce its environmental impact, such as using renewable energy and reducing waste. Pharmavite is a subsidiary of Otsuka Pharmaceutical, a Japanese pharmaceutical company.
Size: 2,000 employees
Founded: 1971
