Site Reliability Engineering Manager

Cisco

San Francisco, CA

8 - 10 years

Posted 240 days ago

Meraki is making IT easier, faster, and smarter with technology that simply works. Meraki’s Dashboard drastically simplifies the process of deploying and configuring distributed networks, enabling customers to debug networking issues in real time from anywhere. The Infrastructure & SRE teams build and maintain the platform for Dashboard. Our customers aren’t limited to those who buy Meraki equipment; we measure our success by the experience of everyone who interacts with our product, from other teams across the organization to the end users of our customers’ networks.

As a manager at Meraki, you will provide architectural leadership, review code and designs, and debug existing features. You will be a strong mentor, nurturing a collaborative environment where your team can do their best work. In this highly technical role you will make your own direct contributions to our code base, but your primary focus will be on team execution and development.

Example projects / responsibilities of an SRE Manager:

  • Holding regular one-on-ones with your team, providing ongoing coaching and mentorship, writing performance evaluations, and planning compensation.

  • Helping your team navigate decisions between in-house solutions and open source products; striking a balance between self-learning, formal training, conferences, and commercial support.

  • Working with your senior individual contributors to identify and scope areas for improvement, such as migrating from a vast collection of homegrown configuration management scripts to a modern, scalable configuration management tool (we picked Ansible).

  • Working with other teams, such as product management, to deliver on projects important to the business, such as planning, testing, and deploying IPv6 to our infrastructure.

  • Building a comprehensive, modern monitoring platform which enables developers on any team to receive alerts in a way and at a time that suits them.

  • Providing day-to-day support of Dashboard, including responding to outages and triaging cases escalated by our support team.

You are an ideal candidate if:

  • You are interested in a leadership role because it allows you to work closely with others to help them achieve their best. (Ideally, you have at least a year of experience leading a team of 3+ people and are looking for an opportunity to take on more responsibility and lead a larger team.)

  • You have 7+ years of experience developing systems for large-scale production environments. Our stack is primarily Ansible, ELK, Ruby, and Bash, but your experience might be with comparable tools.

  • You have experience building rapidly growing systems that balance time to market, usability, reliability, and technical debt.

  • You believe that development and operations are best carried out in close co-operation.

  • You are never satisfied by just seeing something work. Your curiosity drives you to peek behind the curtain to gain a deeper understanding of what’s going on.

  • You are experienced with a variety of tools that help you manage, understand, and debug large, complex distributed systems.

Bonus points for experience with any of the following:

  • Supporting products where users expect 100% uptime.

  • Formal project management or IT service management.