About the Role
We are looking for a Site Reliability Engineer to help design, build, and operate the platforms that power AI Co-Workers. This is a hands-on role for an engineer who enjoys owning reliability end-to-end and working closely with product, AI, and engineering teams.
Responsibilities
- Design, build, and operate reliable production infrastructure supporting AI Co-Workers
- Own Kubernetes-based platforms used to deploy and run AI workloads
- Build and maintain infrastructure as code using Terraform
- Implement and maintain Helm-based deployment workflows
- Define, measure, and improve system reliability using SLIs, SLOs, and SLAs
- Participate in on-call rotation, incident response, root cause analysis, and post-mortems
- Reduce operational toil through automation and engineering improvements
- Build and improve observability across monitoring, logging, and alerting
- Partner closely with engineers to ensure systems are resilient, scalable, and secure
- Work across the build, deploy, and operate phases of the software lifecycle
Minimum Qualifications
- 5+ years of relevant work experience as a Site Reliability Engineer, DevOps Engineer, or in a similar role
- Hands-on Kubernetes experience designing, building, or operating workloads on EKS, AKS, GKE, or self-managed Kubernetes
- Terraform experience for infrastructure provisioning and automation
- Experience with Helm for Kubernetes application deployment
- Hands-on experience working with at least one major cloud provider such as AWS, Azure, or Google Cloud
- Experience with at least two programming or scripting languages such as Python, Go, Java, Bash, Ruby, or PowerShell
- Experience in reliability engineering, on-call rotations, incident response, post-mortems, and toil reduction
Preferred Qualifications
- Experience working within a defined SDLC, including CI/CD, release processes, and end-to-end delivery from design to operations
- Experience with ArgoCD or GitOps-style deployment approaches
- DevOps or DevSecOps experience, including CI/CD ownership, infrastructure automation, and security considerations
- Relevant certifications, such as CKA, CKAD, or cloud certifications, or credentials in DevOps, DevSecOps, or programming
Why Join Us?
- A high-performance culture
- State-of-the-art technology
- Experience world-class leadership
- Scale of impact and purpose
- A competitive salary and a huge growth trajectory
- Work with the best in the industry
- Flexible work environment
- Diversity and creativity
Disclaimer: We do not wish to be contacted by recruitment agencies. Our hiring process is managed in-house, and the best way for candidates to express interest is to apply with their resume through our company website.