Staff Site Reliability Engineer

Zscaler • $119K — $170K *

San Jose, CA 95123 • Remote in United States

Information Technology

5 - 7 years of experience

Reposted 1 month ago

Job Overview by Ladders

Qualifications

US Citizenship required due to nature of assigned customers
5+ years of experience in a 24/7 NOC or Cloud Operations team
Proficient in Python or Bash programming languages
Strong understanding of networking protocols (HTTP, DNS, TCP/IP, ICMP, OSI Model)
Experience with monitoring tools (e.g. Nagios, Grafana, Prometheus) and network principles (Firewalls, Load Balancing)

Responsibilities

Design, code, and deploy software solutions and automation
Create and deploy scalable monitoring systems for global infrastructure
Monitor applications and participate in on-call rotation
Resolve escalated issues and document processes for automation
Collaborate with teams on integration strategies for continuous improvement

Benefits

Various health plans
Time off for vacation and sick leave
Parental leave options
Retirement savings plans
Education reimbursement
In-office perks

Full Job Description

Role

We are looking for a Staff Site Reliability Engineer to join our team. This role reports to the Senior Manager, Site Reliability Engineering, and can be performed either on a hybrid schedule (3 days a week) based out of San Jose, CA, or fully remotely.

As a key member of the Zero Trust Exchange team, you will be responsible for all aspects of the Zscaler production data center services, including servers, operating systems, storage, and supporting systems. You will be an instrumental part of the Site Reliability Engineering team, ensuring the availability, latency, performance, efficiency, and scalability of a cloud that processes tens of billions of transactions daily.

What you'll do (Role Expectations)

Own the reliability of a large-scale cloud service (Linux/BSD, bare metal, Kubernetes, custom load balancing, SD-WAN) by partnering with Engineering and Network teams to define requirements early, conduct operability reviews, and contribute code/design docs for platform resilience
Develop and operate end-to-end observability (metrics/logs/traces, dashboards, alerting) and incident tooling to manage SLOs/error budgets, reduce noise, and improve system detection and diagnosis
Participate in an on-call rotation to lead full-cycle incident response; perform deep cross-stack troubleshooting (OS, networking, distributed systems, packet captures, core dumps) to drive permanent software fixes and codify learnings into runbooks and tests
Build and maintain everything-as-code for fleet and service lifecycle, driving provisioning, configuration, release automation, canary deployments, and complex rollout/rollback workflows
Continuously improve platform hygiene through consistent OS/app upgrades, dependency/vulnerability patching, capacity and performance tuning, and strict CI/CD validation prior to production rollouts

Who You Are (Success Profile)

You act like an owner. Your passion for the mission fuels your bias for action. You operate with integrity because you genuinely care about the outcome. You adapt to what's needed, navigating seamlessly between high-level strategy and hands-on execution.
You are a problem-solver. You seek out challenges because you are energized by finding solutions, knowing that solving the hard problems delivers the biggest impact.
You are a high-trust collaborator. You are ambitious for the team, not just yourself. You embrace our challenge culture by giving and receiving ongoing feedback, knowing that candor delivered with clarity and respect is the truest form of teamwork and the fastest way to earn trust.
You operate with urgency. You understand that in a high-growth environment, speed and quality are not mutually exclusive. You have a relentless focus on execution and a bias for action, delivering high-impact results quickly to win for the customer and the team.
You think at scale. You connect your day-to-day work to the larger company mission and think globally. You build solutions, processes, and teams that are not just effective today but are built to last and support a high-growth, global organization.

What We're Looking for (Minimum Qualifications)

US Citizenship is required (due to the nature of assigned customers) and 5+ years industry experience in software engineering, infrastructure software, and/or platform engineering
Proficiency in at least one programming language (such as Python, Bash, or Go) with demonstrated ability to write production-quality code (testing, code reviews, CI, maintainable design, scripting for diagnostics)
Strong Linux/Unix systems fundamentals (process/memory, filesystems, networking stack basics, debugging/perf troubleshooting) and solid understanding of networking protocols and components (e.g., HTTP, DNS, TCP/IP, ICMP, OSI model, subnetting, and load balancing/traffic concepts)
Proven experience operating production services (including incident response, troubleshooting, reducing toil) and ability to participate in on-call rotations and support occasional after-hours or weekend deployments
Experience managing BSD in production, with a focus on driving systemic fixes through platform engineering

What Will Make You Stand Out (Preferred Qualifications)

Proven expertise in operating Kubernetes at scale
Deep experience with the Prometheus/OpenTelemetry ecosystems, including instrumenting golden signals, defining SLOs, and performing alert tuning to ensure high-availability environments

Zscaler's salary ranges are benchmarked and are determined by role and level. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position across all US locations and could be higher or lower based on a multitude of factors, including job-related skills, experience, and relevant education or training.

The base salary range listed for this full-time position excludes commission/bonus/equity (if applicable) + benefits.

Base Pay Range

$119,000-$170,000 USD

Our Benefits program is one of the most important ways we support our employees. Zscaler proudly offers comprehensive and inclusive benefits to meet the diverse needs of our employees and their families throughout their life stages, including:

Various health plans
Time off plans for vacation and sick time
Parental leave options
Retirement options
Education reimbursement
In-office perks, and more!

Learn more about Zscaler's Future of Work strategy, hybrid working model, and benefits here.

About Zscaler

Zscaler is a cloud-based information security company that provides Internet security, web security, firewalls, sandboxing, SSL inspection, antivirus, vulnerability management and granular control of user activity in cloud computing, mobile and Internet of things environments. The company is headquartered in San Jose, California, and has offices in Australia, India, Japan, Singapore, the United Kingdom, and the United States.

Size: 3,153 employees
Market Cap: $15.5 billion
Industry: Information Technology
Net Income: -$191.4 million
Founded: 2008
5 Year Trend: +54.1%
Revenue: $536 million

NASDAQ

* Ladders Estimates
