About the Role
We are hiring an AWS Cloud Engineer to design, provision, optimize, and support the AWS infrastructure powering our AMD GPU AI/HPC platform. This is a hands-on execution role - you'll work closely with Rust backend engineers, TypeScript developers, SREs, and platform teams to keep cloud infrastructure reliable, cost-efficient, and scalable. The goal is simple: reduce cloud bottlenecks and give our engineering teams a solid foundation to build on.
What You'll Do
- Own the full lifecycle of AWS infrastructure across dev, staging, production, and customer-facing environments - provisioning, scaling, monitoring, security, cost optimization, and decommissioning
- Build and maintain Infrastructure-as-Code (Terraform, Pulumi, AWS CDK, CloudFormation)
- Implement cloud patterns for high availability, auto-scaling, secure service communication, and customer environment provisioning
- Build and maintain CI/CD workflows for cloud infrastructure and hosted services
- Improve observability through metrics, logging, alerting, dashboards, and runbooks
- Troubleshoot AWS networking, compute, storage, IAM, and deployment issues
- Participate in incident response, post-incident reviews, and root cause analysis
- Document architecture, operational processes, and best practices
Who You Are
Required Qualifications
- 5+ years in cloud infrastructure, DevOps, SRE, or platform operations
- Hands-on AWS experience: VPCs, EC2, S3, IAM, CloudWatch, Route 53, load balancers, security groups, private networking
- Proficiency with IaC tooling (Terraform strongly preferred)
- Strong Linux fundamentals - networking, process management, storage, troubleshooting
- Experience with CI/CD, Git-based workflows, and monitoring/alerting platforms
- Clear communicator who can document infrastructure and collaborate across engineering teams
Preferred Qualifications
- Experience with AI/ML, GPU, or HPC workloads
- Kubernetes on AWS (EKS or self-managed)
- Observability platforms: Prometheus, Grafana, Loki, OpenTelemetry, Datadog
- AWS cost optimization: right-sizing, savings plans, lifecycle policies, tagging
- Background in a startup or high-growth infrastructure environment
What We Offer
- 100% paid Medical, Dental, and Vision insurance for Employees
- Company Health Savings Account Contributions
- 100% paid Short Term and Long Term Disability Insurance for Employees
- Life and Voluntary Supplemental Insurance Options
- Other Insurance Options, such as Pet & Legal Insurance
- Various Supplementary Health Benefits, such as discounted Virtual Healthcare Appointments and Serious Illness Support
- Flexible Spending Account
- Employee Assistance Program