Gem.com

Senior Site Reliability Engineer

Gem.com
$100K - $140K *
Information Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • 8+ years of experience with cloud services in high-volume environments.
  • Bachelor's degree in Computer Science or related field.
  • Extensive Linux system knowledge with a focus on security and networking.
  • Proficient in Python, Go, or Bash programming languages.
  • Experience with monitoring tools: Prometheus, Grafana, ELK stack.
  • Skilled in configuration management and cloud deployment technologies (Ansible, Terraform, Kubernetes).
  • Problem-solving mindset in solution design and implementation.

Responsibilities

  • Build automated systems for managing cloud infrastructure.
  • Develop frameworks for deploying and upgrading applications.
  • Ensure compliance with security protocols (ISO 27001, SOX, PCI).
  • Enhance system observability and define service level objectives.
  • Collaborate with cross-functional teams to create reliable systems.

Benefits

  • Comprehensive healthcare coverage.
  • Generous paid time off policy.
  • Equity participation in the company.
  • Flexibility of a remote work environment.
  • Structured in-office collaboration days for team building.

Full Job Description

The Opportunity

This is a high-ownership role with direct influence over infrastructure decisions. The team has a clear roadmap focused on improving reliability, security posture, and operational maturity. The Senior Site Reliability Engineer helps build first-class infrastructure to deliver our best-in-class technology to the world. The infrastructure is large and complex, running in the cloud and on Kubernetes, so there's no shortage of interesting problems.

What You'll Do
  • Build software and systems for cloud infrastructure management and automation (Terraform, Ansible, Oracle Cloud, GCP)
  • Participate in developing frameworks for application deployment, customization, and upgrades (Kubernetes, ArgoCD, Vault, Jenkins)
  • Ensure application and infrastructure security complies with ISO 27001 / SOX / PCI
  • Improve observability, implement and measure key metrics, and define and enforce SLOs/SLAs (Prometheus, Grafana, ELK)
  • Collaborate with engineering, quality engineering, and product management to architect and build highly available, reliable, and secure systems


What You'll Bring
  • 8 years of experience working with cloud services at scale in a high-volume, customer-facing environment, plus a Bachelor's degree in Computer Science or equivalent
  • Willingness to participate in an on-call rotation
  • Extensive experience with Linux environments, security, and networking, and with programming in Python, Go, or Bash
  • Strong experience with monitoring and alerting tools such as Prometheus, Grafana, ELK stack, and PagerDuty
  • Experience with cloud deployments and architectures, CI/CD tools, and configuration management and orchestration tools such as Ansible, Terraform, and Kubernetes
  • Proficiency with a wide range of relevant server-side technologies such as Consul, Vault, Kafka, MongoDB, PostgreSQL, and MySQL
  • Pragmatic, problem-solving approach when designing and implementing solutions


Workplace & Compensation

This role is available throughout Canada. Employees within a 100-kilometer radius of our Toronto office are expected to work from the office on three pre-scheduled "core days" each month to encourage cross-team connection and in-person collaboration.

Compensation includes salary, equity, comprehensive healthcare, paid time off, and other benefits. Our recruiting team will provide a specific salary range based on location and years of experience.


About Gem.com

Founded: 2013
