Role
We are looking for a Principal Software Engineer (Service Platform & Orchestration) to join our team. This is a Hybrid (3 days in office) role, reporting to the VP, Engineering in the Zero Trust Exchange department. In this high-ownership position, you will engineer the management plane and reliability systems that govern ZIA's fleet lifecycle at massive scale.
This is a hands-on building role where you will lead the transformation of our infrastructure from legacy automation into a stateful, durable management plane (built on Temporal) to achieve deterministic "one-touch" provisioning and lifecycle operations. You will treat "infrastructure as a distributed system," developing self-healing capabilities and AI-driven SRE practices for a global fleet of 100k+ instances.
What you'll do (Role Expectations)
- Lead the hands-on development of, and migration to, a workflow-as-code platform (Temporal), building replay-safe, idempotent workflows that ensure deterministic operations at global scale
- Move the organization beyond "scripted automation" toward a robust management plane that treats the entire global fleet as a single, eventually-consistent distributed system
- Design and implement services that leverage LLMs and ML for intelligent signal correlation, automated triage, and "self-correcting" fleet operations
- Develop framework-level services and internal APIs that ensure all new products are delivered "orchestration-ready" with reliability hooks built directly into the code
- Build deep telemetry (metrics, traces, and events) into the management plane so that every fleet-wide action is fully explainable, auditable, and replayable
Who You Are (Success Profile)
- You thrive in ambiguity. You're comfortable building the path as you walk it. You thrive in a dynamic environment, seeing ambiguity not as a hindrance, but as the raw material to build something meaningful.
- You act like an owner. Your passion for the mission fuels your bias for action. You operate with integrity because you genuinely care about the outcome. You adapt to what's needed, navigating seamlessly between high-level strategy and hands-on execution.
- You operate with urgency. You understand that in a high-growth environment, speed and quality are not mutually exclusive. You have a relentless focus on execution and a bias for action, delivering high-impact results quickly to win for the customer and the team.
- You think at scale. You connect your day-to-day work to the larger company mission and think globally. You build solutions, processes, and teams that are not just effective today but are built to last and support a high-growth, global organization.
- You are resilient and adaptable. You view change as an opportunity and setbacks as temporary. You maintain composure and focus in high-pressure situations, guiding yourself and your team through complexity with a steady, positive hand.
What We're Looking for (Minimum Qualifications)
- Foundational understanding of AI/ML technologies and experience leveraging, securing, or positioning AI-driven solutions to optimize outcomes within your functional domain
- Demonstrated curiosity and active exploration of AI tools, with a proven history of integrating new technologies to enhance daily workflows and augment problem-solving
- BS/MS in Computer Science or a related technical field with 10+ years of experience in hyperscale systems, with a deep understanding of the unique failure modes and technical hurdles that only emerge at massive scale
- Mastery of backend systems languages (Go, Java, Python, or others) with a proven ability to set the bar for code quality, maintainability, and distributed system correctness
- Strong experience designing and operating complex distributed systems, with a focus on solving systemic challenges in concurrency, failure handling, and performance optimization
- Proven track record of developing Platform APIs (REST/gRPC) with strong guarantees for idempotency, verification, and safe rollout patterns
What Will Make You Stand Out (Preferred Qualifications)
- Proficiency with AI code-assistance tools (e.g., Cursor, Windsurf) to accelerate legacy refactoring and system development
- Proficiency in PostgreSQL or other relational stores used for high-scale, stateful management-plane services
- Direct experience building or operating systems with Temporal.io, Cadence, or similar workflow engines
Zscaler's salary ranges are benchmarked and are determined by role and level. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position across all US locations and could be higher or lower based on a multitude of factors, including job-related skills, experience, and relevant education or training.
The base salary range listed for this full-time position excludes commission, bonus, and equity (if applicable), as well as benefits.
Base Pay Range
$212,000-$265,000 USD
Our Benefits program is one of the most important ways we support our employees. Zscaler proudly offers comprehensive and inclusive benefits to meet the diverse needs of our employees and their families throughout their life stages, including:
- Various health plans
- Time off plans for vacation and sick time
- Parental leave options
- Retirement options
- Education reimbursement
- In-office perks, and more!