Role
We are looking for a Sr. Staff Software Engineer to join our Service Platform Automation team. This role offers flexibility to work a hybrid schedule (three days a week onsite) in San Jose, CA, reporting to the VP of Engineering. In this high-ownership position, you will build and operate the orchestration and reliability automation that manages ZIA's fleet lifecycle at massive scale. You will initially focus on leading the architectural transformation of legacy scripts into a safe, deterministic, Temporal-based orchestration platform to achieve "one-touch" provisioning. As you scale the platform, you will expand the team's mission into AI SRE practices, applying software engineering to identify and solve systemic inefficiencies and build self-healing capabilities across our global fleet.
What You'll Do (Role Expectations)
- Drive the migration from legacy scripts to a Temporal-based platform, engineering replay-safe workflows with built-in retries, idempotency, and safe rollback designs for one-touch fleet operations
- Identify and solve systemic inefficiencies across our global fleet, engineering technical solutions needed to make our operations more autonomous
- Build systems that leverage LLMs and ML for intelligent triage, global signal correlation, and automated runbooks to eliminate manual toil
- Develop framework-type services for feature teams, ensuring all new products are delivered "automation-ready" with reliability hooks built directly into the code
- Ensure every fleet-wide action is fully explainable, replayable, and auditable by implementing comprehensive metrics, traces, and event logging
Who You Are (Success Profile)
- You thrive in ambiguity. You're comfortable building the path as you walk it. You thrive in a dynamic environment, seeing ambiguity not as a hindrance, but as the raw material to build something meaningful.
- You act like an owner. Your passion for the mission fuels your bias for action. You operate with integrity because you genuinely care about the outcome. True ownership involves leveraging dynamic range: the ability to navigate seamlessly between high-level strategy and hands-on execution.
- You are a problem-solver. You love running towards challenges because you are laser-focused on finding the solution, knowing that solving the hard problems delivers the biggest impact.
- You are a high-trust collaborator. You are ambitious for the team, not just yourself. You embrace our challenge culture by giving and receiving ongoing feedback, knowing that candor delivered with clarity and respect is the truest form of teamwork and the fastest way to earn trust.
- You are a learner. You have a true growth mindset and are obsessed with your own development, actively seeking feedback to become a better partner and a stronger teammate. You love what you do and you do it with purpose.
What We're Looking for (Minimum Qualifications)
- BS or MS in Computer Science or a related technical field with 10+ years of experience in hyperscale systems, with a deep understanding of the unique failure modes and technical hurdles that only emerge at massive scale
- Mastery of backend systems languages (Go, Java, Python, or others) with a proven ability to set the bar for code quality, maintainability, and distributed system correctness
- Experience designing and operating complex distributed systems, with a focus on solving systemic challenges in concurrency, failure handling, and performance optimization
- Expertise in building automation using REST APIs and Swagger with strong guarantees for idempotency, verification, and safe rollout patterns
- Expertise in engineering and operating hybrid infrastructure across cloud platforms (AWS/GCP, GKE) and on-premise environments, ensuring consistent container orchestration and CI/CD safety
What Will Make You Stand Out (Preferred Qualifications)
- Experience building or operating AI-enabled developer/ops tooling with measurable improvements in triage speed and operational efficiency
- Experience in testing orchestration systems, including determinism verification, fault injection, and chaos engineering
- Proficiency in PostgreSQL, including SQL development and schema management, to power high-scale, stateful management-plane services
Zscaler's salary ranges are benchmarked and are determined by role and level. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position across all US locations and could be higher or lower based on a multitude of factors, including job-related skills, experience, and relevant education or training.
The base salary range listed for this full-time position excludes commission, bonus, and equity (if applicable), as well as benefits.
Base Pay Range
$176,000-$220,000 USD
Our Benefits program is one of the most important ways we support our employees. Zscaler proudly offers comprehensive and inclusive benefits to meet the diverse needs of our employees and their families throughout their life stages, including:
- Various health plans
- Time off plans for vacation and sick time
- Parental leave options
- Retirement options
- Education reimbursement
- In-office perks, and more!
Learn more about Zscaler's Future of Work strategy, hybrid working model, and benefits here.