Principal Site Reliability Engineer

Varda Space Industries

$153K - $185K *

El Segundo, CA 90245

In-Person

Aerospace & Defense

8 - 10 years of experience

Reposted 1 week ago

Job Overview by Ladders

Qualifications

Bachelor's degree in computer science, information systems, or engineering; experience in place of a degree acceptable.
Professional experience in building and troubleshooting IT systems.
Strong verbal and written communication skills.
Ability to adapt in a fast-paced, detail-oriented environment.
Experience with Linux and Windows servers in enterprise-level settings.
Familiarity with on-premises and cloud networking technologies.
Scripting skills in Python, Bash, PowerShell, or similar languages.

Responsibilities

Lead deployment, maintenance, and operations of mission-critical applications and IT infrastructure.
Design and maintain Infrastructure as Code (IaC) frameworks using Terraform and Ansible.
Architect and manage cloud services ensuring compliance with security standards.
Oversee migration and modernization of enterprise software into scalable platforms.
Implement GitOps practices for secure automated deployments to Kubernetes clusters.
Collaborate with engineers to ensure reliable and scalable systems and pipelines.
Identify and resolve system bottlenecks and reliability risks through tuning initiatives.
Participate in on-call rotation to provide technical leadership during incidents.

Benefits

Work alongside a team of professionals at the forefront of the space industry.
Equity opportunities in a well-funded startup with significant growth potential.
401(k) matching contributions to support your future.
Unlimited paid time off for work-life balance.
Comprehensive health insurance, including vision and dental.
Daily lunch and snacks provided, with dinners twice a week.
Generous maternity and paternity leave.

Full Job Description

About This Role

As a Principal Site Reliability Engineer, you will help set the technical vision and strategy for reliability across spacecraft, ground systems, and enterprise platforms. You'll define standards, mentor senior engineers, and drive cross-organizational initiatives to ensure systems are highly operable, secure, and mission-ready. This role combines deep technical expertise with the ability to influence architectural direction at the company level.

Responsibilities

Lead and contribute hands-on to the deployment, maintenance, and operations of mission-critical applications and infrastructure supporting spacecraft, ground systems, and company-wide platforms.

Design, execute, and manage highly scalable, reliable, and operable software and infrastructure platforms, applying Infrastructure as Code (IaC) principles to drive automation, consistency, and repeatability across Kubernetes environments.

Collaborate closely with software and hardware teams to align reliability best practices, CI/CD pipelines, and compliance with their workflows, enabling faster, more secure deployments for mission-critical systems.

Anticipate and address reliability risks, capacity challenges, and performance bottlenecks; develop long-term strategies in partnership with leadership.

Rotate through the team's on-call schedule to keep critical systems healthy and responsive.

Occasionally travel to customer sites and other Varda locations to troubleshoot, deploy, or test critical infrastructure.

Basic Qualifications

10+ years of experience in SRE, DevOps, or systems engineering, including leadership of large-scale, mission-critical systems.

Experience leading technical direction and architecture for large-scale systems.

Hands-on experience with observability stacks and telemetry pipelines, including metrics collection, alerting, and dashboards, for Linux systems and Kubernetes workloads (e.g., Prometheus and Grafana).

Strong background in systems architecture and software-defined networking (VPC, subnets, firewalls, VPNs, etc.).

Proficiency in automation and scripting with Python, Bash, or similar languages.

Strong, positive communication skills, both written and oral.

Preferred Skills and Experience

Expertise in time-series databases (e.g., InfluxDB) for large-scale telemetry pipelines.

Expertise in provisioning and managing scalable Azure cloud infrastructure using native tools and best practices (Azure GCC High preferred).

Experience with IaC tools like Terraform and Ansible, and CI/CD systems like Git and ArgoCD.

Experience building and maintaining dynamic system configurations with templating frameworks such as YAML and Helm.

Strong understanding of Linux systems, containerization technologies, and Kubernetes internals.

Pay Range

Senior Site Reliability Engineer: $153,000.00 - $185,000.00 per year
This role is on-site in El Segundo, CA
Leveling and base salary are determined by job-related skills, education level, experience level, and job performance
You will be eligible for long-term incentives in the form of stock options and/or long-term cash awards

Benefits

Exciting team of professionals at the top of their field working by your side
Equity in a fully funded space startup with potential for significant growth (interns excluded)
401(k) matching (interns excluded)
Unlimited PTO (interns excluded)
Health insurance, including Vision and Dental
Lunch and snacks provided on site every day. Dinners provided twice a week.
Maternity / Paternity leave (interns excluded)

* Ladders Estimates
