Sr. Site Reliability Engineer

Mike Albert Fleet Solutions

$100K — $130K *
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • Minimum 5 years in SRE, DevOps, or Cloud Engineering with production experience in Azure or AWS.
  • Strong Linux/Unix administration and networking troubleshooting skills.
  • Expertise in Infrastructure as Code (Terraform) and CI/CD pipeline design.
  • Proficiency in scripting or programming (Python, Go, Bash, or PowerShell).
  • Expertise with Docker in production environments; Kubernetes experience is a strong plus.
  • Excellent oral and written communication skills, capable of engaging with diverse technical audiences.

Responsibilities

  • Lead modernization efforts and hosting migrations while maintaining hybrid infrastructure (Azure, AWS, on-prem).
  • Streamline provisioning using Infrastructure as Code (Terraform, Ansible, PowerShell DSC).
  • Enhance CI/CD pipelines (GitHub Actions, Jenkins) for rapid delivery.
  • Introduce and manage comprehensive monitoring platforms (e.g. Prometheus, Grafana, Datadog).
  • Manage containerized workloads in Docker and implement SRE principles including SLOs and error budgets.
  • Lead incident response, conduct root cause analysis, and automate manual processes through scripting and runbooks.
  • Support DevSecOps initiatives and ensure robust backup and disaster recovery strategies.

Benefits

  • Hybrid work schedule with three days a week in office.
  • Opportunity to drive critical infrastructure modernization and hosting migration initiatives.
  • Work within a small, high-impact infrastructure team.
  • Continuous evaluation and piloting of emerging technologies and tools.

Full Job Description
Sr. Site Reliability Engineer

We are seeking an experienced Sr. Site Reliability Engineer to join a small, high-impact infrastructure team. This role blends software engineering and systems automation to scale reliable cloud and hybrid systems. You will own critical projects from design through deployment, specifically driving a pivotal infrastructure modernization and hosting migration initiative in your first year.

Responsibilities:
  • Infrastructure & Migration: Lead modernization efforts and hosting migrations while maintaining hybrid infrastructure (Azure, AWS, on-prem).
  • Automation & IaC: Streamline provisioning using Infrastructure as Code (Terraform, Ansible, PowerShell DSC) and enhance CI/CD pipelines (GitHub Actions, Jenkins) for rapid delivery.
  • Observability: Introduce and manage comprehensive monitoring platforms (e.g. Prometheus, Grafana, Datadog) to establish operational standards.
  • Reliability Engineering: Manage containerized workloads in Docker and implement SRE principles including SLOs and error budgets.
  • Operational Excellence: Lead incident response, conduct root cause analysis, and automate manual processes through scripting and runbooks.
  • Security & Continuity: Support DevSecOps initiatives and ensure robust backup and disaster recovery strategies.
  • Emerging Tech & R&D: Continuously evaluate and pilot emerging technologies, tools, and industry trends to ensure our infrastructure stack remains modern, efficient, and scalable.

Qualifications:
  • Minimum 5 years in SRE, DevOps, or Cloud Engineering with production experience in Azure or AWS.
  • Strong Linux/Unix administration and networking troubleshooting skills.
  • Expertise in Infrastructure as Code (Terraform) and CI/CD pipeline design.
  • Proficiency in scripting or programming (Python, Go, Bash, or PowerShell).
  • Expertise with Docker in production environments; Kubernetes experience is a strong plus.
  • Self-starter with strong communication and documentation skills; able to take ownership in a small-team environment.
  • Excellent oral and written communication skills; able to communicate effectively with a diverse group of individuals with varying levels of technical understanding and varying skill sets.
  • Ability to work collaboratively with application development and data engineering areas to define standards and manage change.
  • Must reside in the Greater Cincinnati Metropolitan Area (hybrid schedule, 3 days a week in office).


Preferred Qualifications:
  • Practical knowledge of observability tools (Prometheus, Grafana, ELK, or similar).
  • Windows Server/Active Directory administration.
  • Experience with legacy Unix (AIX/Solaris).
  • Database (Oracle/MS-SQL) or BI platform experience (Snowflake/Azure Fabric).
  • Relevant industry certifications (Azure, AWS, or CKA).


Hybrid schedule: three days a week in office required.
