Principal Cloud Engineer - Infrastructure (Automation & BCDR)

ForeFlight

• $208K — $244K *

Austin, TX 78745 (In-Person)

Information Technology

11 - 15 years of experience

Today

Job Overview by Ladders

Qualifications

12+ years of engineering experience, with 7+ years as primary architect for infrastructure automation and resilience programs.
Expert in infrastructure as code (IaC) with deep knowledge of Terraform or equivalent tools.
Strong command of CI/CD processes for infrastructure with experience in pipeline design and drift detection.
Proven track record in end-to-end BCDR strategy including setting RTO/RPO targets and real DR exercises.
Hands-on chaos engineering experience in production environments.
Deep observability expertise, focusing on SLOs and infrastructure visibility across services.
Experience working in major cloud environments, particularly AWS, with some breadth across Azure.

Responsibilities

Own and enhance infrastructure automation platforms and CI/CD pipelines.
Lead design and validation of Business Continuity and Disaster Recovery (BCDR) strategies.
Build and operate observability and resilience tools to monitor infrastructure state.
Define and enforce infrastructure as code (IaC) standards across cloud environments.
Establish service level objectives (SLOs) for core infrastructure services and improve incident response.
Translate business continuity requirements from various stakeholders into actionable infrastructure designs.
Manage coordination and unification of divergent automation stacks in a complex organizational context.

Benefits

Medical, dental, and vision insurance with employer-paid health premiums.
Open paid time off (PTO) policy.
401(k) plan with up to 10% company matching and immediate vesting.
12 weeks of paid parental leave.
Flight training rewards.

Full Job Description

Principal Cloud Engineer - Infrastructure (Automation & BCDR)

The preferred arrangement is hybrid, based in Austin, TX, Houston, TX, or Denver, CO; we will consider virtual for the right candidate.

Key Responsibilities:
• Own and evolve infrastructure automation platforms: CI/CD pipelines for infrastructure and
self-service provisioning workflows, serving engineering teams across a distributed,
multi-region environment
• Lead the design and continuous validation of Business Continuity and Disaster Recovery
strategy, including RTO/RPO target-setting, failover design, chaos engineering, and
recovery runbook ownership
• Build and operate observability and resilience tooling to ensure infrastructure state is fully
instrumented, drift is detected proactively, and failure scenarios are exercised before they're
encountered in production
• Define and govern IaC standards (Terraform, CDK, or equivalent), including module
strategy, state management, and guardrail enforcement across cloud accounts and
environments
• Own platform reliability outcomes, establish SLOs for core infrastructure services, drive
down toil through systematic automation, and maintain high standards for incident response
quality
• Operate effectively across a complex organizational context, translating business continuity
requirements from engineering, security, and compliance stakeholders into concrete
infrastructure design and validated recovery capability

Basic Qualifications:
• 12+ years of engineering experience, with at least 7 as primary architect or technical owner
of infrastructure automation platforms and resilience programs at scale
• Deep production experience designing and operating IaC at scale: Terraform (or
CDK/Pulumi equivalent), with strong opinions on module strategy, state management,
policy-as-code, and guardrail enforcement across many cloud accounts and environments
• Expert command of CI/CD for infrastructure: pipeline design, drift detection, plan/apply
workflows, secrets handling, and self-service patterns that serve engineering teams safely at
scale
• Track record owning Business Continuity and Disaster Recovery strategy end-to-end:
setting RTO/RPO targets, designing multi-region failover, running real DR exercises, and
translating findings into durable architectural change
• Hands-on experience with chaos engineering and resilience testing in production
environments, including failure-injection tooling and game-day operations
• Strong grounding in observability for infrastructure: SLOs, drift detection, state-of-the-fleet
visibility, and instrumenting both control-plane and data-plane signals
• Deep production experience in at least one major cloud (AWS preferred), with credible
breadth across both AWS and Azure, or strong evidence you can become productive across
both
• Cross-functional leadership: comfortable as a peer with senior security, compliance, finance,
and product engineering leaders on business continuity and audit-readiness conversations
• Comfortable with the coordination work of a recently combined company: divergent
automation stacks, in-flight unification, and the political work that comes with consolidation

Preferred Qualifications:
• Experience leading a BCDR program through external audit or regulatory review (SOC 2,
FedRAMP, ISO 22301, financial-services resilience frameworks, or aviation-relevant
equivalents)
• Experience standing up or evolving a self-service infrastructure platform (Backstage, internal
developer portal, or equivalent) with golden-path provisioning patterns
• Hands-on experience with infrastructure orchestration tooling beyond raw Terraform
(Terragrunt, Atlantis, Spacelift, env0, Crossplane, or similar)
• Experience with chaos engineering tooling (AWS FIS, Azure Chaos Studio, Gremlin, Chaos
Mesh, Litmus) in production
• Experience designing and operating cross-region or cross-cloud disaster recovery for
stateful workloads (databases, message queues, object stores)
• Background in SRE or platform reliability with strong instincts for SLO design, error budget
policy, and toil reduction
• Post-M&A experience integrating infrastructure automation platforms across two or more
legacy stacks
• Experience in aviation, regulated industries, or other domains with mission-critical workloads
and strict business continuity requirements
• Background contributing to or evaluating resilience standards and frameworks (ISO 22301,
NIST SP 800-34, or industry equivalents)

Benefits:
• Medical, dental, and vision insurance with employer-paid health premiums
• Open PTO policy
• 401(k) with up to 10% company matching and immediate vesting
• 12 weeks of paid parental leave
• Flight training rewards

Pay is based on candidate experience and qualifications, as well as market and business considerations. Summary pay range: $208,000.00-$244,000.00

* Ladders Estimates
