Scope, build & manage advanced service monitoring & reporting systems that deliver real-time monitoring and alerting for all production systems in the Twitch corporate environment. Build a metrics-driven approach to application and service monitoring, including capacity planning and escalations to infrastructure teams.
Function as on-call engineer for changes & outages, serving as the first point of contact for customer escalations when corporate services are down.
Drive the evolution of operational excellence through process, documentation, accountability & ownership.
Develop tools to support and promote the management of the production environment through logging, alerting, smoke testing, complex actions and the development of metrics.
A history of resolving complex support issues that may not be well documented, and ap