This position is located in San Diego. This is a fast-paced, high-tech environment and may require extended hours and after-hours follow-up, given that changes occur 24x7.
Key Responsibilities of the team:
- Serve as company owner for the evolution and improvement of the SWARM critical support model
- Foster a growth mindset and learning approach for direct and indirect team members
- Develop top talent and create succession plans that rotate high-impact talent between this group and other functions and organizations across the company, including product development and architecture (product and field solution architects)
- Engage with senior leaders in product development and technology architects to drive future designs and architectures with learnings from critical customer incidents
- Shift company technology and as-a-Service investments and mindset toward “availability as a feature”
- Engage in SWARM activities as required, contributing deep technical and problem-resolution skills to support the Technical Incident Managers in your group, including mentoring
- Enforce regular and systemic process control mechanisms to improve SWARM
- Key success factors and principal metrics include mean time to restore service (MTTRS) and avoidance of recurrences
- Generate lower-level KPIs to improve the principal metrics
- Influence and interface with technology groups, development teams, and stakeholders to improve the process and, ultimately, to influence availability prioritization across key backlogs, driving avoidance of recurrences and improved diagnosability and isolation so that services are restored as quickly as possible across OnPrem and Cloud deployments
Key Functions of the Director, Critical Response Team:
- Own management and resolution of all Severity 1 (A and B) incidents
- Manage critical customer care for all Teradata customers, including critical customer sites
- Identify customer sites that are in critical status (Blue, Yellow, Red)
- Manage technical resolution actions and design processes that enable internal and external customer-facing communications, in order to return sites to normal operating status
- Support the local account team, as required, in communicating status and progress to customers affected by Severity 1 incidents
- Drive the Post Mortem Actions and/or Closed Loop Corrective Actions processes following all Severity 1 incidents
- Oversee alignment with incident communication roles and the customer communication guideline process
- Report weekly to Teradata executives on the status and progress of all critical sites
- Manage all BASE-level service escalations under the automated self-service model (these sites have no Support Service Manager assigned)
Skills & Qualifications:
- Demonstrated strategic and tactical leadership, with quantitative and analytical skills to guide thinking while under pressure
- Ability to coach technical incident managers and problem managers within the group
- Knowledge of Teradata technologies, architecture, and ecosystem
- Knowledge of and exposure to distributed systems across hyperscale, cloud-based environments
- Working knowledge of physical IT infrastructures such as Enterprise Server Platforms and related IT architectures and equipment
- Solid understanding of large-scale networking, including the OSI model, DNS, WINS, TCP/IP, VLANs, DHCP, routing, ACLs, switching protocols, etc.
- Understanding and knowledge of physical data centers and their related infrastructure and resources, such as power, rack space, and CE infrastructure (e.g., UPS, generators, AHU)
- Flexibility and willingness to support a 24x7 global operation via off-hours support, on-call availability, or other arrangements, as the rhythm and needs of the business require
- Working knowledge of ITIL incident, problem, and change management components
- Excellent problem resolution, judgment, negotiation, and decision-making skills
- Practical experience with incident/outage and crisis management
- Ability to balance competing demands for resources and adapt to changing priorities
- Excellent written and oral communication skills, with special focus on customer/client-level interaction
- Operations experience in a 24x7x365 support model (NOC experience beneficial)
- BS in Computer Science, Math, or equivalent education, or experience with database development and applications, data warehousing operations, and analytical software applications or ecosystems