Principal Software Engineer, AI Platform Engineering

Saviynt • $140K — $180K *

El Segundo, CA 90245 • Hybrid

Enterprise Technology

8 - 10 years of experience

Today

Job Overview by Ladders

Qualifications

8+ years of production-scale data engineering experience across multiple companies
Experience defining platform standards adopted organization-wide
Ownership of a production data lake - design and operation
Proficient in Spark (PySpark/Scala) with executor tuning and Iceberg maintenance
Hands-on experience with Apache Beam and Dataflow for batch and streaming pipelines
Experienced with schema registry formats like Protobuf/Avro
Familiar with multi-tenant data architecture and isolation requirements
Operational experience with feature stores and vector databases

Responsibilities

Set architectural standards for training data governance
Design and manage the AI Data Lake with tiered storage and lifecycle rules
Implement batch and streaming data pipelines using Spark and Beam
Establish and maintain a schema registry for data evolution
Oversee orchestration using Flyte and evaluate related tools
Develop multi-tenant isolation strategies for data access
Create microservices for data anonymization and labeling

Benefits

Opportunity to work on a large-scale Kubernetes-based SaaS platform
Engage with complex cloud and reliability challenges
Collaborate with skilled engineers in a reliability-focused culture
Access to competitive benefits and growth opportunities

Full Job Description

ABOUT THE ROLE

You set the architectural direction for how training data flows, evolves, and is governed across the AI Platform. You define the standards ML engineers and scientists build on, and ensure every training signal is tenant-isolated, PII-free, and traceable from source to model.

WHAT YOU'LL OWN

AI Data Lake on GCS: bucket layout, raw - silver - gold tier separation, CMEK encryption, lifecycle rules
Batch pipelines: Spark on Dataproc for TB-scale feature backfills, Iceberg compaction, and daily S3-to-GCS incremental sync
Streaming pipelines: Apache Beam on Dataflow for sub-5-min CDC ingestion with exactly-once semantics and PII assertion gates
Schema registry: Avro / Protobuf schema versioning, compatibility modes, and migration playbooks for safe schema evolution
Orchestration: Flyte as primary DAG layer - task authoring standards, domain isolation, retry policies, DataCatalog memoization; evaluate Kubeflow Pipelines where relevant
Multi-tenancy: strict per-tenant GCS prefix isolation, quota policies, and cross-tenant contamination validation
Data Anonymizer and Data Labeler microservices: strip PII and attach ML labels before signals leave each customer environment
Feature store: Feast offline (GCS Parquet) and online (Redis) with point-in-time correctness and a < 0.1% online/offline inconsistency SLA
Vector database: operate pgvector (Cloud SQL) for POC and Qdrant on GKE for production-scale embedding storage; design index strategies (IVFFlat, HNSW) and manage ANN query latency SLAs
RAG data pipeline: build embedding generation pipelines that chunk, encode, and upsert document embeddings into the vector store; own the data refresh cadence and staleness SLAs for retrieval context
Service APIs: expose data platform services (feature serving, embedding upsert, schema validation) over HTTPS with mTLS and gRPC where low-latency streaming is required
Synthetic data pipelines for dev/staging where real customer data is not permitted
Data quality gates: Great Expectations / dbt checks as Flyte tasks, blocking on schema and PII-absence failures

YOU'LL THRIVE HERE IF YOU HAVE

8+ years of data engineering at production scale across multiple companies
Demonstrated principal impact: platform standards you defined adopted org-wide, or major cross-team pipeline/schema migrations you led
Data lake ownership (essential): you have designed and operated a production data lake end-to-end - storage layout, partitioning strategy, tiered retention (hot/warm/cold), table format (Iceberg or Delta Lake), compaction, and access control; not just consumed one
Deep Spark (PySpark / Scala): executor tuning, shuffle diagnosis, Iceberg table maintenance
Hands-on Beam / Dataflow: windowing, exactly-once, side inputs, autoscaling
Schema registry experience: Protobuf / Avro compatibility rules, breaking-change migrations in production
Orchestration at scale: Flyte, Kubeflow Pipelines, Airflow, or Prefect - operated in production, ideally with two benchmarked against each other
Multi-tenant data architecture: per-tenant isolation as a hard requirement, not a post-hoc concern
Feature store operations: Feast or Tecton, point-in-time joins, online/offline consistency
Vector databases: pgvector or Qdrant in production - index tuning, ANN search, embedding upsert pipelines
RAG data fundamentals: chunking strategies, embedding model selection, retrieval quality evaluation, and context freshness management
API transport: gRPC and HTTPS/mTLS for service-to-service communication; comfortable defining proto contracts and managing certificate lifecycle
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical or military experience

NICE TO HAVE

Differential privacy or k-anonymity for ML training datasets
Open source contributions: Feast, Great Expectations, Apache Beam, or dbt
Familiarity with IAM / access governance data: entitlements, provisioning events, access graphs
Iceberg or Delta Lake at petabyte scale

WHY JOIN SAVIYNT

Work on a large-scale, Kubernetes-based SaaS platform
Solve challenging cloud and reliability problems at scale
Collaborate with strong engineers in a reliability-focused culture
Competitive compensation, benefits, and growth opportunities

SECURITY & COMPLIANCE

This role requires adherence to Saviynt's information security and privacy policies, including annual security training.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are made by humans. If you would like more information about how your data is processed, please contact us.

About Saviynt

Saviynt is a leading provider of cloud identity and access governance solutions. Saviynt enables enterprises to secure applications, data, and infrastructure in a single platform for Cloud (Office 365, AWS, Azure, Salesforce, Workday) and Enterprise (SAP, Oracle EBS). Saviynt is pioneering Identity 3.0 by integrating advanced risk analytics and intelligence with fine-grained privilege management. Top global brands leverage Saviynt technology. Saviynt is headquartered in Irvine, California with offices in Chicago, New York, Toronto, London, and Hyderabad, India.

Size

500 employees

Industry

Information Technology

Founded

2010

* Ladders Estimates
