Principal Production Engineer

Zscaler • $164K — $235K *

US-Anywhere • Remote in San Jose, CA

Enterprise Technology

8 - 10 years of experience

1 month ago

Job Overview by Ladders

Qualifications

10+ years managing reliability and scalability for large-scale production services
Deep expertise in programming languages such as Python, Go, or C/C++
Strong background in networking protocols and Linux/RHEL systems
Experience with high-stakes incident management and 24/7 on-call rotations
Proficiency in ITIL frameworks and systematic problem management

Responsibilities

Design and implement scalable infrastructure across AWS, GCP, and bare-metal environments
Drive an 'automation-first' culture by coding to eliminate manual processes
Implement and maintain sophisticated observability metrics and error budgets
Lead incident response efforts and develop response playbooks
Collaborate on operability reviews with engineering teams

Benefits

Various health plans
Time off for vacation and sick leave
Parental leave options
Retirement plans
Education reimbursement
In-office perks, and more!

Full Job Description

Role

We are looking for a Principal Production Engineer to join our team. This role is available either as a hybrid opportunity, 3 days a week in San Jose, CA, or as a remote position, reporting to Production Engineering in the Cloud Infrastructure & Operations department. Join Zscaler to be a force multiplier for the reliability of a global platform processing 200+ billion transactions daily across tens of millions of enterprise users.

In this role, you will provide the technical vision and hands-on execution to drive an "automation-first" culture across the company. By maturing our observability and architectural standards, you will directly reduce our Mean Time to Mitigate (MTTM) and shape the scalability of our globally distributed, multi-cloud infrastructure.

What You'll Do (Role Expectations)

Design and implement highly available, scalable infrastructure across AWS, GCP, and bare-metal environments
Drive an "automation-first" culture by writing code (Python/Go) to eliminate manual toil and build self-healing systems
Implement and maintain sophisticated observability (Prometheus, Grafana, OpenTelemetry), define SLIs/SLOs, and establish error budgets
Act as a lead Incident Commander (TDO on-call), develop response playbooks, and conduct deep-dive post-incident analyses
Work with Engineering and partner teams to conduct operability reviews

Who You Are (Success Profile)

You act like an owner with a bias for action and integrity.
You are a pragmatic builder obsessed with creating, iterating, and shipping.
You champion simplicity by distilling complex problems into clear, actionable plans.
You are data-driven, valuing evidence over assumptions.
You think at scale, creating solutions and processes built to last in a high-growth global organization.

What We're Looking For (Minimum Qualifications)

10+ years of experience managing reliability, scalability, and availability for large-scale production services
Deep expertise in programming (e.g., Python, Go, or C/C++)
Strong background in networking protocols, Linux/RHEL systems, and distributed architecture
Experience in high-stakes incident management and participation in a 24/7 on-call rotation
Proficiency in leveraging ITIL frameworks and incident data to drive service maturity through systematic problem management and technical operability reviews

What Will Make You Stand Out (Preferred Qualifications)

Extensive experience with public cloud (AWS, Azure, GCP) and Infrastructure-as-Code (Ansible, Terraform, Helm, Temporal)
Experience with chaos engineering and disaster recovery planning at scale
Expertise in global routing (BGP) and traffic tunneling (GRE, IPSec) with a deep understanding of L7 proxy architectures (HAProxy), DNS at scale, and OS networking stack internals

Zscaler's salary ranges are benchmarked and are determined by role and level. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position across all US locations and could be higher or lower based on a multitude of factors, including job-related skills, experience, and relevant education or training.

The base salary range listed for this full-time position excludes commission/bonus/equity (if applicable) + benefits.

Base Pay Range

$164,500-$235,000 USD

Our Benefits program is one of the most important ways we support our employees. Zscaler proudly offers comprehensive and inclusive benefits to meet the diverse needs of our employees and their families throughout their life stages, including:

Various health plans
Time off plans for vacation and sick time
Parental leave options
Retirement options
Education reimbursement
In-office perks, and more!

About Zscaler

Zscaler is a cloud-based information security company that provides Internet security, web security, firewalls, sandboxing, SSL inspection, antivirus, vulnerability management, and granular control of user activity in cloud computing, mobile, and Internet of Things environments. The company is headquartered in San Jose, California, and has offices in Australia, India, Japan, Singapore, the United Kingdom, and the United States.

Size: 3,153 employees
Market Cap: $15.5 billion
Industry: Information Technology
Net Income: -$191.4 million
Founded: 2008
5 Year Trend: +54.1%
Revenue: $536 million

NASDAQ

* Ladders Estimates
