Principal Site Reliability Engineer, Google Cloud

Saviynt • $240K — $250K *

Milpitas, CA 95035 • Hybrid

Information Technology

8 - 10 years of experience

Today

Job Overview by Ladders

Qualifications

9+ years in Infrastructure Development, Platform Engineering, or Site Reliability Engineering
Deep expertise in Kubernetes, both single-tenant and multi-tenant
Strong programming skills in Go (Golang) and Python
Experience with major Cloud Providers, especially GCP
Proven track record in designing Event-Driven Architecture and message queuing systems
Solid knowledge of CI/CD tools, particularly GitLab CI
Experience in designing and operating Distributed Systems

Responsibilities

Design and maintain shared infrastructure services for product teams
Create scalable, reusable solutions in a multi-cloud environment
Architect and manage highly available Kubernetes platforms
Develop tools for infrastructure provisioning using Go
Implement centralized Observability platforms for insights
Optimize our multi-region cloud infrastructure for performance
Collaborate with product teams to address infrastructure needs

Benefits

Comprehensive security and privacy training upon onboarding and annually
Adherence to Information Security/Privacy Policies and Procedures

Full Job Description

Why This Role Matters

Saviynt's platform is mission-critical for our customers. As we scale globally, reliability, availability, and performance are not optional; they are core product features.

As a Principal Engineer, you will define and drive the reliability strategy for our SaaS platform. This is a high-impact, hands-on engineering role with broad influence across infrastructure, platform, and application teams. You will shape how Saviynt designs, operates, and measures reliability at scale.

This role is ideal for engineers who want to work on hard reliability problems, influence architecture across teams, and leave a lasting mark on a growing SaaS platform.

What You Will Do

In this pivotal role, you will be instrumental in designing, building, and maintaining the shared infrastructure services and platforms that our product and application teams depend on. You will focus on creating reusable, reliable, and scalable solutions that abstract away complexity, enabling other teams to focus on their core business logic and deliver features faster in a multi-cloud environment.

• Design and build core platform components and shared infrastructure services that other development teams will integrate with and leverage to deploy and operate their applications
• Architect, implement, and manage highly available and scalable Kubernetes platforms as a service for internal consumers
• Develop robust, internal-facing tools and automation for infrastructure provisioning and management, primarily using Go (Golang)
• Architect and optimize foundational solutions within Cloud environments (AWS, Azure, etc.), focusing on creating reusable patterns and modules for other teams
• Design and implement shared Event-Driven Architecture components and messaging platforms using technologies like Kafka or Google Pub/Sub that product teams can easily utilize
• Develop and maintain robust CI/CD pipelines (e.g., GitLab CI and ArgoCD) as a service, providing standardized and automated deployment workflows for various development teams
• Design and build resilient Distributed Systems components that serve as building blocks for other applications, focusing on reliability, fault tolerance, and performance
• Manage and optimize our shared infrastructure across Multi-Region Cloud Environments, ensuring that platform services are globally available and performant for all consumers
• Establish and enhance centralized Observability and Monitoring platforms and tools that provide self-service insights for consuming teams
• Define and implement clear, well-documented RESTful API designs for the infrastructure services you build, ensuring ease of integration for internal clients
• Implement and manage Service Mesh (e.g., Envoy, Istio) capabilities, providing traffic management, security, and policy enforcement as a shared platform for services
• Design, implement, and optimize highly available Relational Database services or shared data platforms for broad organizational use
• Collaborate closely with product development teams to understand their infrastructure needs and pain points, providing technical guidance and support
• Participate in on-call rotations to support the critical shared infrastructure you build

What We Are Looking For
• 9+ years of experience in an Infrastructure Development, Platform Engineering, or Site Reliability Engineering role, with a strong focus on building tools and services for other engineers
• Deep expertise with Kubernetes in production environments, particularly in providing it as a platform (i.e., single-tenant and multi-tenant deployment architectures)
• Strong programming skills in Go (Golang) and Python, with experience building robust, maintainable backend services and automation
• Extensive hands-on experience with at least one major Cloud Provider (GCP is a must); multi-cloud experience is a strong plus, especially in building abstractions across providers
• Proven experience designing and implementing Event-Driven Architecture and message queuing systems (e.g., Kafka, RabbitMQ, NATS) as shared services
• Solid understanding and practical experience with CI/CD pipeline tools (especially GitLab CI) and experience establishing automated delivery processes for other teams
• Demonstrable experience designing and operating Distributed Systems, with an understanding of patterns for creating reliable, shared components
• Familiarity with Multi-Region Cloud Environments and strategies for building globally distributed and highly available platforms
• Proficiency in establishing and utilizing comprehensive Observability and Monitoring platforms (e.g., Prometheus, Grafana, ELK stack, Datadog) for shared infrastructure
• Strong experience with RESTful API design principles and building well-documented, consumable APIs
• Knowledge of Service Mesh concepts and practical experience with solutions like Istio in a platform context
• Hands-on experience with Relational Databases (e.g., MySQL, PostgreSQL), ideally in managing them as a service
• Excellent communication skills and the ability to clearly articulate complex technical concepts to both technical and non-technical audiences
• A strong customer-centric mindset, treating internal development teams as your primary customers
• Advanced Professional GCP Certification is required.
• Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical or military experience, is required

$240,000 - $250,000 a year

If required for this role, you will:

- Complete security & privacy literacy and awareness training during onboarding and annually thereafter

- Review (initially and annually thereafter), understand, and adhere to Information Security/Privacy Policies and Procedures such as (but not limited to):

> Data Classification, Retention & Handling Policy

> Incident Response Policy/Procedures

> Business Continuity/Disaster Recovery Policy/Procedures

> Mobile Device Policy

> Account Management Policy

> Access Control Policy

> Personnel Security Policy

> Privacy Policy

About Saviynt

Saviynt is a leading provider of cloud identity and access governance solutions. Saviynt enables enterprises to secure applications, data, and infrastructure in a single platform for Cloud (Office 365, AWS, Azure, Salesforce, Workday) and Enterprise (SAP, Oracle EBS). Saviynt is pioneering Identity 3.0 by integrating advanced risk analytics and intelligence with fine-grained privilege management. Top global brands leverage Saviynt technology. Saviynt is headquartered in Irvine, California with offices in Chicago, New York, Toronto, London, and Hyderabad, India.

Size

500 employees

Industry

Information Technology

Founded

2010

* Ladders Estimates
