Data Reliability Engineer

Empower
$87K - $123K *
US-Anywhere
Remote in United States
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • Bachelor's in Computer Science, Information Systems, or related field; equivalent practical experience accepted.
  • 5-8+ years in data engineering/analytics roles with 3+ years of Snowflake in production.
  • Proven experience in building contract-first pipelines and automating tests at scale.
  • Strong SQL skills and depth in Snowflake features (warehouses, Streams/Tasks, Snowpipe).
  • Proficient with Terraform, dbt, and DevOps tools (GitHub/GitLab/Azure DevOps).
  • Experience in data quality and observability platforms, such as GX or Soda.

Responsibilities

  • Define and enforce the DataOps lifecycle, including data contracts for data products.
  • Manage CI/CD processes, including Snowflake object automation and environment promotion.
  • Engineer reliable data pipelines using orchestration tools like Airflow or Lambda.
  • Implement comprehensive data quality tests and monitoring solutions.
  • Automate data governance processes including PII classification and tagging.
  • Lead incident management for data-related issues, streamlining incident triage and communications.
  • Analyze data usage and implement FinOps practices to optimize performance and costs.

Benefits

  • Medical, dental, vision, and life insurance coverage.
  • 401(k) plan with up to 6% company matching contributions.
  • Tuition reimbursement up to $5,250 per year.
  • Generous paid time off, including ten company holidays and floating holidays.
  • Paid volunteer time of 16 hours per year for community engagement.
  • Inclusive Business Resource Groups to foster collaborative environments.

Full Job Description

The Data Reliability Engineer will own the reliability, stability, and operational excellence of an AWS-based data platform. This role will operate, troubleshoot, and improve production data systems to ensure data pipelines and analytics platforms are resilient, performant, and meet business-critical SLAs. The Data Reliability Engineer will work closely with data and platform engineering teams to diagnose issues, resolve production incidents, and improve design and operational practices across the data ecosystem.

What you will do:
  • Own the reliability and stability of production data pipelines and data platform services.
  • Define, improve, and enforce data SLAs/SLOs for batch and streaming products, including freshness, latency, and completeness.
  • Diagnose and resolve data pipeline failures, delays, and data quality issues in production environments.
  • Investigate issues across distributed data systems, including Spark/EMR workloads, ingestion pipelines, and warehouse performance.
  • Lead or support incident response, including triage, mitigation, and long-term resolution.
  • Perform root cause analysis and implement durable fixes to prevent recurrence.
  • Design and enhance monitoring, alerting, and observability for data systems.
  • Develop automation and tooling to reduce operational toil and improve system resilience.
  • Contribute to disaster recovery and resiliency planning, including backup validation and recovery workflows.
  • Partner with engineering teams to improve pipeline design, reliability, and operational readiness.
  • Create and maintain runbooks, Standard Operating Procedures, and operational documentation.
  • Participate in occasional off-hours support for production data systems when required.


What you will bring:
  • Bachelor's degree in Computer Science, Information Systems, Data Science, or a related field.
  • 5+ years of experience in data engineering or analytics platform roles, including 3+ years operating in a production cloud data warehouse environment such as Redshift or Snowflake.
  • 3+ years of experience building AWS data pipelines and supporting them through production, including exposure to real-world failures and operational challenges.
  • 3+ years of experience working with production data platforms in AWS environments, with a focus on anomaly detection, reconciliation, and end-to-end validation.
  • 3+ years of experience with Python and SQL in real data systems.
  • Hands-on experience troubleshooting distributed data processing systems such as Spark/EMR, Redshift, and streaming systems.
  • Proven ability to debug and resolve production issues in data pipelines and data platforms.
  • Experience with AWS data services such as EMR, Redshift, DynamoDB, S3, or similar.
  • Proven ability to handle production incidents and perform root cause analysis.
  • Strong problem-solving mindset and ability to work through ambiguous production issues.


What will set you apart:
  • Experience handling real-world data issues such as pipeline delays or failures.
  • Experience with data backfills and reprocessing.
  • Experience with late-arriving data or incomplete datasets.
  • Experience improving observability and alerting specifically for data systems.
  • Experience influencing or guiding data pipeline reliability and operational practices.
  • Exposure to streaming or event-driven systems such as Kafka, Kinesis, and CDC patterns.
  • Experience with disaster recovery, backup validation, and resiliency testing.
  • Strong communication during incidents with both technical and non-technical stakeholders.
  • Prior FinOps or capacity-planning ownership for data platforms.
  • Familiarity with BI semantic layers and contract enforcement at consumption, including Looker, Power BI, or Tableau.


This job operates in a professional office environment.

This job description is not intended to be an exhaustive list of all duties, responsibilities and qualifications of the job. The employer has the right to revise this job description at any time. You will be evaluated in part based on your performance of the responsibilities and/or tasks listed in this job description. You may be required to perform other duties that are not included in this job description. The job description is not a contract for employment, and either you or the employer may terminate employment at any time, for any reason, as per the terms and conditions of your employment contract.

What we offer you

We offer an array of diverse and inclusive benefits regardless of where you are in your career. We believe that providing our employees with the means to lead healthy balanced lives results in the best possible work performance.
  • Medical, dental, vision and life insurance
  • Retirement savings - 401(k) plan with generous company matching contributions (up to 6%), financial advisory services, potential company discretionary contribution, and a broad investment lineup
  • Tuition reimbursement up to $5,250/year
  • Business-casual environment that includes the option to wear jeans
  • Generous paid time off upon hire - including a paid time off program plus ten paid company holidays and three floating holidays each calendar year
  • Paid volunteer time - 16 hours per calendar year
  • Leave of absence programs - including paid parental leave, paid short- and long-term disability, and Family and Medical Leave (FMLA)
  • Business Resource Groups (BRGs) - BRGs facilitate inclusion and collaboration across our business internally and throughout the communities where we live, work and play. BRGs are open to all.


Base Salary Range
$87,400.00 - $123,400.00

The salary range above shows the typical minimum to maximum base salary range for this position in the location listed. Non-sales positions have the opportunity to participate in a bonus program. Sales positions are eligible for sales incentives, and in some instances a bonus plan, whereby total compensation may far exceed base salary depending on individual performance. Actual compensation offered may vary from posted hiring range based upon geographic location, work experience, education, licensure requirements and/or skill level and will be finalized at the time of offer.

About Empower

Empower is a retirement plan recordkeeping financial holding company based in Greenwood Village, Colorado, United States. It is the second-largest retirement plan provider in the United States.