HTC Global Services

Lead Site Reliability Engineer - Cloud Platform (GCP/Kubernetes)

$130K - $160K *
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 7+ years in Site Reliability, Platform, Cloud Engineering, or DevOps
  • Expertise in Kubernetes management
  • Strong experience with Google Cloud Platform (GCP)
  • Proficiency in Infrastructure-as-Code practices using Terraform
  • Familiarity with Helm for Kubernetes deployments
  • Exposure to multi-cloud environments like AWS and Azure
  • Scripting skills in Python or Bash
  • Knowledge of monitoring tools like Prometheus, Grafana, Splunk, or OpenTelemetry

Responsibilities

  • Design and support robust cloud infrastructure in GCP
  • Architect and manage scalable Kubernetes environments
  • Build and maintain Infrastructure-as-Code with Terraform
  • Develop Helm charts and manage Kubernetes deployments
  • Design strategies for failover, disaster recovery, and multi-region operations
  • Enhance platform scalability, reliability, and performance
  • Implement best practices for monitoring, alerting, and observability
  • Mentor engineering teams and provide technical guidance

Benefits

  • Flexible work hours
  • Opportunities for professional development
  • Supportive team environment
  • Access to cutting-edge technology
  • Health and wellness programs

Full Job Description

Job Title: Lead Site Reliability Engineer (GCP & Kubernetes)

Overview / Summary

We are seeking a Lead Site Reliability Engineer to drive reliability, scalability, and operational excellence across a rapidly growing technology ecosystem. This role serves as a technical leader focused on cloud architecture, Kubernetes platforms, infrastructure automation, and highly available distributed systems. The position plays a key role in defining infrastructure strategy, improving platform resiliency, and mentoring engineering teams.

Key Responsibilities
• Design and support highly available cloud infrastructure in GCP
• Architect and manage Kubernetes environments at scale
• Build and maintain Infrastructure-as-Code using Terraform
• Develop and manage Helm charts and Kubernetes deployments
• Design failover, disaster recovery, and multi-region strategies
• Improve platform scalability, reliability, and performance
• Implement monitoring, alerting, and observability best practices
• Partner with engineering teams on platform architecture and cloud adoption
• Mentor engineers and provide technical leadership

Required Qualifications
• 7+ years of experience in Site Reliability Engineering, Platform Engineering, Cloud Engineering, or DevOps
• Expert-level Kubernetes experience
• Strong Google Cloud Platform (GCP) experience
• Expertise with Terraform
• Experience with Helm
• Multi-cloud exposure, including AWS and Azure
• Experience with distributed systems
• Python or Bash scripting experience
• Experience with Prometheus, Grafana, Splunk, or OpenTelemetry


About HTC Global Services

HTC Global Services is a global provider of IT and Business Process Services and Solutions. Founded in 1990 and headquartered in Troy, Michigan, HTC operates delivery centers across North America, Europe, India, and Malaysia. It is an Inc. 500 Hall of Fame company and has been recognized by numerous industry and trade publications as a top provider of services. Its client base includes Global 2000 customers, with a focus on the healthcare, retail, financial services, and automotive verticals. HTC is also committed to corporate social responsibility and has been recognized for its contributions to the community.
Size: 17,575 employees
Founded: 1990
