Senior Site Reliability & Platform Engineer
We are seeking a Senior Site Reliability & Platform Engineer who views infrastructure as code and security as a baseline requirement. You will be a key architect in defining our shared responsibility model, ensuring that while we provide the platform, the platform provides the guardrails. In this role, you will be a systems thinker who understands that IT is an enabler, focusing on building robust platforms rather than performing arbitrary third-party integrations.
Day in the Life
- Platform Engineering: Design and maintain our core Kubernetes and Cloud Native environments within GCP, AWS, and Azure, ensuring high availability, scalability, security, and seamless deployment patterns.
- Observability & Reliability: Implement a comprehensive observability stack to provide deep insights into system health, performance, and security posture.
- Cross-Cloud Strategy: While GCP is our primary home, you will provide expertise in integrating and bridging legacy or specialized workloads in Azure and AWS.
- Automation & Lifecycle: Build automated, repeatable processes for provisioning and deprovisioning infrastructure, reducing manual toil to near zero.
- The "Rails" Philosophy: Develop self-service tools that empower DevOps and Engineering teams to manage their own tool configurations while remaining compliant with MergeCo security standards.
Who You Are
- A Systems Thinker: You understand that IT is an enabler. You focus on building robust platforms rather than performing arbitrary third-party integrations.
- Kubernetes Expert: You have deep experience managing production-grade clusters (GKE preferred) and understand the intricacies of service meshes, networking, and container security.
- Cloud Polyglot: GCP is your native tongue, but you are fluent enough in Azure and AWS to navigate complex multi-cloud environments.
- Security-First Mindset: You treat security as a core feature of reliability, not an afterthought.
- Collaborative Partner: You prefer "Partnership" over "Gatekeeping," working with business units to define where the platform ends and their application begins.
Must Haves
- 5+ years in SRE, DevOps, or Platform Engineering roles.
- Expert-level Kubernetes orchestration and containerization (Docker/Containerd).
- GCP Professional Cloud Architect or equivalent experience (IAM, VPCs, GKE, Cloud Operations).
- IaC Mastery: Deep proficiency in Terraform, CDK, or Pulumi.
- Observability: Experience with Prometheus, Grafana, ELK, or Datadog to drive SLIs/SLOs.
- Familiarity with Azure/AWS for hybrid-cloud connectivity and migrations.
- Scripting/Coding: Proficiency in Go, Python, or similar for tooling and automation.
Nice to Haves
- Cloud Polyglot: Familiarity with Azure/AWS for hybrid-cloud connectivity and migrations.
- Observability Tooling: Experience with Prometheus, Grafana, ELK, or Datadog to drive SLIs/SLOs.
- Experience navigating complex multi-cloud environments.
A Few of the Perks
- Competitive benefits
- Unlimited PTO
- Remote work available for U.S.-based candidates
- 401(k) with employer match
- Paid parental leave
- In-office benefits for those local to Dallas, TX:
  - Catered lunches
  - Casual office atmosphere in the Design District
  - Fully stocked kitchen
The pay range for this role is:
130,000 - 170,000 USD per year (Remote (United States))