Who You Are
The DevOps Manager oversees a team of highly skilled System Engineers, Site Reliability Engineers, and Cloud Administrators who are responsible for direct support, maintenance, troubleshooting, deployment control, and optimal delivery of customer-facing products, services, and content. This includes strong collaboration with personnel working in software development, operational infrastructure (including cloud), and production systems to ensure the stability, security, and efficiency of our products and services, as well as enforcing appropriate workflows, controls, and procedures as defined by the business.
This individual is expected to have extensive technical expertise and will work side by side with their direct reports. The ideal candidate will have working knowledge of, and ideally deep expertise in, development lifecycles and methodologies, cloud technologies, information security, large data sets, and traditional infrastructure, including networking, databases, virtualization, and storage. A strong work ethic, transparent collaboration, and exceptional verbal and written communication skills are required.
- Manage, refine, and grow a team of top talent that successfully handles all Team Responsibilities below, acting as the accountable individual for all team actions.
- Set clear expectations and goals for team members.
- Provide timely feedback on individual and team performance.
- Ensure high visibility of individual and team success.
- Take corrective action when necessary.
Team Responsibilities
- Oversee the day-to-day integrity of production systems so that SLAs are met, problems are addressed, and status is clearly communicated to appropriate personnel in a timely fashion.
- Resolve technical issues across all environments/areas of responsibility, and make recommendations for performance and/or capacity improvements.
- Implement tools and procedures in support of continuous integration, static and dynamic code analysis, deployment and application monitoring, and general automation of build and configuration workflows.
- Participate in all production support activities during incidents and outages.
- Participate in capacity planning, provisioning, performance and stability tuning, and scaling of the application infrastructure.
- Research and implement new and best-practice DevOps processes, including automation, continuous delivery, and service discovery.
- Advise and assist operations in improving uptime, reducing service incidents, and planning capacity and capabilities to support products and services.
- Advise and assist development in automation, deployment controls, operational capability, and accelerating software deployment.
- Maintain multiple, consistent environments (Development, Test, QA, and Production).
Experience We Are Looking For
- BS in Computer Science or a related field, or an equivalent combination of education and experience.
- 10+ years of multi-platform system administration experience. Expertise in both Windows and Linux is preferred.
- 5+ years of experience with cloud and virtualization technologies. Strong expertise with Azure is required; experience with Amazon Web Services and other cloud technologies is a plus.
- 2+ years of management/supervisory experience.
- Strong understanding of firewalls, load balancing, routing & switching, DNS, SMTP, TCP/IP, Active Directory, Group Policy Objects, LDAP, monitoring, web servers, and web applications is required.
- Solid programming, scripting, and automation skills/knowledge are required. Experience with Perl, Python, Ruby, Shell, and/or Java is desirable.
- Knowledge of information security best practices is preferred. Industry certifications such as CISSP or CISM are a plus.
- Experience with Scrum principles, practices, and theory is preferred.
- Experience transitioning teams to an Agile framework is a plus.