Airbyte

Senior Site Reliability Engineer - Hiring Sprint

Airbyte: $130K - $180K *
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 7+ years in infrastructure, platform engineering, SRE, or DevOps.
  • Experience with Kubernetes, Helm, and Terraform in production environments.
  • Proficient in observability tools such as Prometheus, Grafana, and Datadog.
  • Hands-on experience with CI/CD pipelines and developer tooling.
  • Ability to analyze backend code to troubleshoot systems effectively.
  • Familiarity with AI tools, specifically LLMs for automation and debugging.
  • Adaptable to a startup environment with a fast-paced, problem-solving mindset.

Responsibilities

  • Own the infrastructure for the Data Replication platform including Kubernetes clusters and CI/CD pipelines.
  • Collaborate with product engineers for seamless integration of features and infrastructure.
  • Enhance observability and alerting systems with a focus on AI automation.
  • Lead the development of AI-augmented release processes including canary deployments and rollback automation.
  • Set and uphold high infrastructure standards, creating self-serve tooling and documentation.

Benefits

  • Flexible PTO encouraging a minimum of 25 days off annually.
  • 16 weeks of fully paid parental leave for all parents.
  • Comprehensive medical, dental, and vision coverage for employees and their dependents.
  • 401(k) retirement plan.
  • Budget for professional development including conference sponsorship.
  • Commuter benefits and monthly internet reimbursement.
  • Complimentary breakfast and lunch in the San Francisco office.

Full Job Description

Engineering Hiring Sprint:

We're growing our engineering team and are accelerating hiring through a focused Engineering Hiring Sprint. Rather than stretching interviews over several weeks, we're bringing exceptional candidates through an expedited process and making hiring decisions quickly.

Interview process:
  1. Apply
  2. Technical Take-Home (Java or Python)
  3. Hiring Manager Interview
  4. In-Person Onsite (the week of July 20)
  5. Hiring decision by the end of the week


We're hiring across multiple engineering teams, including:
  • Platform Engineers
  • Database Engineers
  • Site Reliability Engineers
  • Extensibility API Engineers
  • AI Agents Engineers
  • Engineering Managers


If you enjoy solving complex technical problems, moving quickly, embracing AI, and taking ownership of your work, we'd love to meet you.

The Role:

You'll be the infrastructure and reliability engineer on the Data Replication team - a full-stack product team running over 3 million sync jobs a week powering thousands of data use cases across multiple regions and clouds. You'll build and maintain the infrastructure, set reliability standards, drive down incidents, and make it easier and safer for engineers to ship through tooling. You're equally comfortable in a Terraform file, a Kubernetes cluster, and a postmortem doc.

We expect engineers here to actively use AI as a force multiplier - agentic tools to automate toil, augment incident response, and build smarter internal tooling. If you're not already doing this, you should be excited to start. We care as much about how you work as what you build. Trust, directness, and craftsmanship matter here.

What You'll Do:
  • Own the infrastructure underpinning the Data Replication platform - Kubernetes clusters, CI/CD pipelines, secrets management, networking, and cloud resource configuration across AWS and GCP.
  • Partner with product engineers to reliably integrate product features with infrastructure.
  • Maintain and enhance observability, alerting, and anomaly detection with an eye towards LLM automation.
  • Maintain and enhance AI-augmented release and internal tooling: canary deployments, progressive rollouts, automated release qualification, and rollback automation.
  • Set the infrastructure bar for the team - build self-serve tooling, write runbooks, and coach engineers to own more of their stack.


What You'll Need:
  • 7+ years in infrastructure, platform engineering, SRE, or DevOps.
  • Hands-on ownership of Kubernetes, Helm, and Terraform in production environments.
  • Deep experience with observability stacks (Prometheus, Grafana, Datadog) and on-call operations.
  • Experience with CI/CD pipeline ownership and developer tooling.
  • Ability and willingness to read backend code to understand how systems break and to instrument them correctly.
  • Fluency with AI tools - LLMs and agentic frameworks to automate, debug faster, and reduce toil.
  • A startup-ready mindset: comfortable with ambiguity, moving fast, and owning problems end-to-end.


Nice To Have:
  • Experience with data pipelines, replication systems, or ETL/ELT platforms.
  • Familiarity with control plane / data plane architectures or internal developer platforms.
  • Experience with Airbyte, CDKs, or connector-based architectures.


Location:
  • Onsite 4 days/week in San Francisco, CA


Why You'll Love Working at Airbyte:

At Airbyte, we believe great work happens when people feel supported, trusted, and empowered to grow. Our market-leading Total Rewards package is designed to help you thrive professionally and personally. Our benefits and perks include:
  • Flexible PTO with a culture that encourages at least 25 days off annually
  • 16 weeks fully paid parental leave for all parents
  • Comprehensive medical, dental, and vision coverage for employees and dependents
  • 401(k) retirement plan
  • Professional development budget, conference sponsorship, and book reimbursement
  • Commuter benefits and monthly internet reimbursement
  • Breakfast and lunch in our San Francisco office
  • A collaborative, in-person culture focused on learning, growth, and impact


If you find this role exciting, we encourage you to apply even if you think you don't meet all of the requirements!

We are not accepting agency submissions or recruiting firm support for this role. Unsolicited resumes will not be considered.

About Airbyte

Airbyte is an open-source data integration platform that helps organizations replicate data from applications, APIs, and databases to data warehouses, lakes, and other destinations. The company was founded in 2020 by Michel Tricot and John Lafleur and is headquartered in San Francisco, CA. Airbyte's platform is designed to be easy to use, scalable, and extensible, and it offers a range of features including data transformations, scheduling, monitoring, and more. The company has raised $5.2 million in seed funding to date, and its investors include Accel, Y Combinator, and 8VC.
Size: 20 employees
Founded: 2020