Systems Operations Manager – Data Platforms (Teradata & Hadoop)
This role is in the office three days a week.
No visa sponsorship or visa transfers.
About this role:
Wells Fargo is seeking a Systems Operations Manager to lead the end-to-end support and operations of enterprise Teradata and Hadoop data platforms powering large-scale analytics and business decisioning.
This role is accountable for platform stability, reliability, and operational excellence across a complex, multi-tenant ecosystem supporting 100+ tenants. The manager will lead a 24x7 operations team, apply Site Reliability Engineering (SRE) principles, and drive automation-led transformation to ensure predictable, resilient service delivery at scale.
This is a hands-on leadership role requiring strong execution discipline, ownership, and the ability to operate in a high-risk, regulated environment, ensuring SLA adherence, compliance, and business continuity outcomes.
In this role, you will:
Operational Leadership & Platform Ownership
- Lead end-to-end platform operations for Teradata and Hadoop environments, ensuring availability, performance, and resilience
- Provide clear ownership and accountability for production services, operational outcomes, and service stability
- Drive incident, problem, and change management, including major incident command and recovery leadership
- Lead 24x7 global support operations, including on-call governance and escalation management
Operational Excellence & Service Performance
- Own and drive SLA/OLA adherence, uptime, and service health metrics
- Lead capacity management, performance tuning, and proactive issue prevention initiatives
- Establish and enforce operational standards, runbooks, and service management practices
- Drive root cause analysis (RCA) and long-term remediation of systemic issues
- Drive adoption of automation, observability, and AIOps practices to reduce manual toil and improve MTTR
Governance, Risk & Compliance
- Ensure alignment with enterprise risk, compliance, and change management frameworks
- Drive patching, vulnerability remediation, and platform security posture
- Maintain audit readiness, documentation quality, and control adherence
- Identify, escalate, and mitigate operational and platform risks
Multi-Tenant Platform Operations
- Manage operations across shared, multi-tenant platforms, ensuring workload isolation and stability
- Oversee resource allocation, scheduler configuration, and workload prioritization
- Execute in high-risk production environments where changes impact multiple tenants simultaneously
Site Reliability Engineering (SRE) & Automation
- Apply SRE principles to improve reliability, availability, and scalability of data platforms
- Drive automation-first operations to eliminate manual toil and standardize service delivery
- Implement and enhance observability, monitoring, and self-service capabilities
- Partner with engineering teams to improve platform reliability, operability, and service maturity
Stakeholder Engagement & Execution Alignment
- Partner with Engineering, CIO-aligned teams, Cybersecurity, and LOB stakeholders
- Provide clear, executive-ready communication on platform health, risks, and priorities
- Drive cross-functional accountability and execution discipline across teams
People Leadership & Talent Development
- Lead, coach, and develop a team of Systems Operations engineers and analysts
- Build a culture of ownership, accountability, and operational excellence
- Manage resource allocation, workforce planning, and vendor/partner support
- Develop team capabilities in SRE practices, automation, and platform operations maturity
Resiliency & Business Continuity
- Ensure resiliency posture across Teradata and Hadoop platforms, including:
  - Disaster recovery (DR) readiness and execution
  - RTO/RPO alignment and validation
  - Continuous improvement of recovery capabilities
- Lead BCP execution and failover coordination for critical platforms
Required Qualifications:
5+ years of Systems Engineering and Technology Architecture experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
2+ years of Leadership experience
- Hands-on experience with:
  - Teradata and Hadoop platforms
  - Distributed systems and data platform operations
  - Incident, problem, and change management processes
Desired Qualifications:
- Experience supporting enterprise-scale Teradata and Hadoop platforms
- Demonstrated leadership in 24x7 production support and SRE environments
- Strong experience in:
  - Automation, AIOps, and operational transformation
  - DevSecOps and CI/CD practices
  - Observability, monitoring, and platform telemetry
- Familiarity with Kubernetes, containerization, and cloud-native architectures
- Strong understanding of:
  - Multi-tenant data platforms and workload management
  - Regulatory, audit, and risk-controlled environments
Job Expectations:
- Provide full operational ownership of Teradata and Hadoop platforms
- Drive platform reliability, stability, and performance improvements
- Ensure compliance, patching, and security governance alignment
- Operate within a 24x7 production support model, including on-call leadership
- Partner across CIO organizations to deliver scalable, resilient data platform services
Posting End Date:
13 Jun 2026
*Job posting may come down early due to volume of applicants.