As a Senior Reliability Engineer, you’ll:
- Empower engineers on other teams to take control of their services by writing shared infrastructure-as-code tooling and collaborating on internal best practices for infrastructure.
- Occasionally dive into the main Webflow application in React, Node, and MongoDB to better understand (and sometimes fix) behavior in production.
- Work with peers on Webflow’s Customer Support, Partnerships, and Sales teams to support customers using Webflow’s services in production.
- Collaborate with members of Webflow’s Security & Compliance team to maintain a secure production environment, especially as we work towards SOC 2 Type 2 compliance in 2021.
- Continuously improve on-call and incident response processes as the company grows.
To succeed in the role, we’re looking for someone who has...
- Either a background as an ops engineer with an enthusiasm for code, or a background as a software engineer with an enthusiasm for systems administration.
- 5+ years of experience building, maintaining, and debugging distributed systems in a customer-facing environment that allows for little to no downtime.
- Experience navigating and scaling multi-tier cloud environments on either AWS or GCP.
- Experience with container-centric architectures, built with Docker and tools like Kubernetes (EKS, GKE, AKS, OpenShift, etc.), ECS, Docker Swarm, or Mesos.
- Experience with infrastructure-as-code tools like Terraform, Ansible, Puppet, or Chef.
- Experience or interest in contributing to full-stack applications built using React, Node, and MongoDB.
- Enthusiasm for mentoring and sponsoring less-experienced engineers.
It would be a bonus if you have even one of the following:
- Experience with Kubernetes, OpenResty, Terraform, or Pulumi, specifically.
- Experience improving on-call and incident response processes for Engineering.
- Experience working in high-compliance environments or a special interest in security engineering.