Senior Manager Site Reliability Engineering

ePocrates

Watertown, MA

Industry: IT Consulting/Services

Less than 5 years

Posted 295 days ago

Healthcare must be continually available. Infrastructure-as-a-Service ensures the continuous availability of the technologies and systems that are the foundation of athenahealth’s services. We are directly responsible for thousands of servers, petabytes of storage, and thousands of web requests per second, all while sustaining growth at a meteoric rate. We enable an operating system for the medical office that abstracts away administrative complexity, leaving doctors free to practice medicine.
We need talented individuals to bring their technical expertise and creative thinking to help us implement elegant solutions to complex technical problems. Do you have a desire to make a difference?

Position Summary

As a Senior Manager of Site Reliability Engineering, you will lead a team of Site Reliability Engineers laser-focused on availability, scalability, and customer experience. The successful candidate should have demonstrated experience managing multidisciplinary teams and a strong understanding of managing complex systems at scale. Expertise in Linux operating systems, Oracle database systems, large-scale web applications, and system infrastructure is highly desired. Experience operating within an Agile environment and evangelizing and maintaining Infrastructure-as-Code is preferred.

Key Responsibilities

  • Leading and developing a multidisciplinary team of Site Reliability Engineers responsible for ensuring product success through strategic use of technologies such as Linux, Oracle, private/public cloud, monitoring, storage, load balancing, and CI/CD
  • Creating and executing against Agile release plans aligned to a common vision and developed through deep engagement with product stakeholders
  • Demonstrating a tenacious desire to improve customer experience through learning, teaching, and implementing resilient systems and services
  • Developing, evangelizing, and promoting best practices for automating systems at scale
  • Leveraging critical thinking and adaptability to lead teams through complex issues
  • Driving the creation of software-based, automated Infrastructure-as-Code (IaC) solutions for business processes
  • Sharing in the collective team vision and successfully promoting the why and how to all teams
  • Driving the adoption of Site Reliability and Agile principles across the organization


Skills and Environment

  • 3+ years of leading technical teams
  • Computer Science degree or equivalent experience
  • Expertise in supporting professional and technical growth
  • Expertise in leveraging Open Source Software technologies to solve business problems
  • Expertise in monitoring and scaling environments
  • Expertise in public/private cloud IT infrastructure
  • Expertise in building and consuming APIs
  • Experience working in an Agile environment

15046BR