Astronomer

Customer Reliability Engineer - Infrastructure

Astronomer
$125K - $130K *
Technical Services
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 5 years of experience in large, complex cloud infrastructures
  • 3 years of Kubernetes experience
  • Experience managing Production distributed systems with major cloud providers
  • Strong Linux background
  • Knowledge of operating and monitoring distributed systems
  • Proficient in resolving customer issues
  • Effective communication skills

Responsibilities

  • Provide solutions to ensure customer success with Astronomer products
  • Troubleshoot and triage customer environments during incidents
  • Participate in on-call rotation for weekend coverage
  • Supply feedback to product teams on customer pain points
  • Build and enhance monitoring and alerting systems
  • Develop automation for operational efficiency
  • Directly engage with customers to prioritize issues and provide support

Benefits

  • Comprehensive benefits package
  • Equity component as part of compensation
  • Fully distributed remote work environment
  • Exposure to diverse industries and cloud technologies
  • Opportunity to improve customer experience and product reliability
Full Job Description
About this role

The Astronomer Customer Reliability Engineering (CRE) team is responsible for the success of our customers' usage of our managed Airflow service.

The CREs are responsible for operating, monitoring, and maintaining the platform to ensure availability, predictability, and reliable operations.

As an infrastructure specialist within the team, you will focus on the reliability of the underlying cloud infrastructure and Kubernetes clusters. This entails responding to incidents, whether raised by a customer or by our monitoring system, and then taking further steps to ensure problems are permanently resolved or monitored. As owners of the observability platform, CREs have unlimited potential to improve the reliability of the product and deliver the best possible outcome for our customers.

This role is directly customer-facing and gives exposure to very diverse problems and requirements. CREs get the opportunity to interface with customers from a variety of industries and across different cloud providers, all with different expectations. Your contributions will directly impact customers' success with the Astronomer products, and you will be able to help make meaningful improvements to the customer experience.

What you get to do:
  • Provide solutions to customers to make them successful using our products
  • Troubleshoot customer environments and engage in active triaging with customers
  • Participate in on-call rotation for weekend coverage
  • Provide feedback to the product development teams on customer needs and pain points
  • Build out our monitoring and alerting systems
  • Build and maintain automation to ensure daily operational tasks are handled as efficiently as possible
  • Help direct the architecture of the products and contribute where possible
  • Own the customer experience, working directly with customers to prioritize and solve issues, meet SLAs, and provide "white glove" guidance on the path to production
  • Participate remotely within a fully distributed team
  • Enhance and enrich customer documentation
  • Work with the latest technology and multi-cloud implementations


What you bring to the role:
  • 5 years of experience, preferably with large, complex cloud infrastructures operating at scale
  • 3 years of experience with Kubernetes
  • Experience managing a production distributed system with at least one major cloud provider (AWS, GCP, or Azure)
  • Strong Linux experience
  • Knowledge of how to operate and monitor distributed systems
  • Previous experience in handling customer issues (internal or external)
  • Strong communication skills
  • DevOps or CI/CD experience
  • Python scripting
  • Good troubleshooting skills


Bonus points if you have:
  • Experience as a Site Reliability Engineer
  • Worked with Kubernetes Custom Resources
  • Depth of knowledge with Azure
  • Airflow/Big Data Orchestration experience
  • IaC experience


The estimated total compensation for this role ranges from $125,000 to $130,000 based on leveling and geography, along with an equity component and a comprehensive benefits package. This range is merely an estimate; actual compensation may deviate from this range based on skills, experience, and qualifications.

About Astronomer

Astronomer is a software company that provides a platform for data engineering, integration, and management. The company was founded in 2015 by Ry Walker and Tim Brunk, and is headquartered in Detroit, Michigan. Astronomer's platform allows businesses to collect, process, and analyze data from various sources, including cloud applications, databases, and APIs. The company's customers include Fortune 500 companies and startups in industries such as healthcare, finance, and e-commerce. Astronomer has raised over $23 million in funding from investors such as 8VC, Aspect Ventures, and Sierra Ventures.
Size: 50 employees
Founded: 2014