Ticketmaster is looking for a Director - Cloud and DevOps to lead our growing team of developers/operations gurus with a service reliability mindset. A successful candidate will have a passion for automation, monitoring, self-service and cloud-based technologies. If you are a technical leader who is passionate about DevOps and can lead teams, this role is for you!
This is a technically focused management position centered on the development, leadership and oversight of Site Reliability Engineers.
What the job is:
- Maintain a 24x7 production environment with a high level of service availability. Perform quality reviews and manage operational issues
- Interface with Dev/QA/OPS teams to perform root cause analysis and re-instrument triggers to prevent future network degradation and outages
- Provide leadership and direction to SRE & DevOps staff who are responsible for break-fix, uptime and reliability for core services, distribution, and customer access network elements and related interfaces
- Partner with development teams in defining and implementing improvements in service architecture
- Implement automation and orchestration for the manual processes required to operate and deploy cloud services. Be at the heart of developing new ideas into internal OPS/SRE tools by working closely with advanced technology and highly skilled IT professionals
- Provide leadership and managerial coaching to the SRE & DevOps management team across all of the company’s locations
- Set clear expectations and create a positive work environment based on accountability, in collaboration with the engineering and operations management teams
- Participate in company-wide initiatives to develop design patterns, and champion them on the relevant R&D teams
What a qualified candidate should possess:
- 10+ years of proven experience supporting or developing globally distributed cloud SaaS services, with demonstrated director-level management experience
- 3+ years of hands-on experience provisioning, managing and troubleshooting infrastructure in both private and public cloud (AWS preferred)
- Experience with automation/configuration management using Puppet, Chef or Ansible
- Expertise in setting and driving the standard processes under which the DevOps and NOC teams operate
- Extensive experience with Windows servers, Remote Desktop setup and PowerShell scripting
- Experience with centralized log management and monitoring solutions such as Splunk, Grafana or Prometheus
- Ability to use a wide variety of open source technologies and cloud services, including microservices
- Deep understanding of the software delivery process with the ability to implement and enforce that process across the organization
- Excellent project management skills. Understands the differences between waterfall, agile, scrum and other project management methodologies, and can strike the right balance among them