Fleet Operations Manager, Data Center Infrastructure

Job Overview by Ladders

Qualifications

BS, BA, or BEng in a technical field or equivalent experience
5+ years managing technical teams with performance management responsibilities
10+ years of engineering or operational experience, preferably in mature environments
Strong understanding of data center infrastructure, including power, cooling, and network systems
Proficient in using data and metrics for decision-making and problem-solving
Ability to communicate effectively with diverse audiences and work cross-functionally
Willingness to travel up to 30% of the time

Responsibilities

Build and lead a high-performing data center operations team across multiple locations
Manage maintenance and operations of server hardware and supporting infrastructure at scale
Become a technical expert on Meta's infrastructure to drive site and fleet operations
Drive continuous improvement in engineering and operational performance
Use data analytics to identify inefficiencies and enhance problem-solving capabilities
Collaborate with cross-functional teams to maintain fleet health and ensure operational efficiency
Evolve processes to scale globally while fostering a culture of accountability and innovation

Benefits

Comprehensive health insurance options
Retirement savings plan with company match
Flexible work arrangements
Opportunities for professional development and training
Culture of innovation and collaboration

Full Job Description

The Fleet Operations Manager is accountable for managing and leading a geographically dispersed team, delivering on SLAs and KPIs related to production server hardware, resolution of systemic technical issues, and repairs throughout the assigned geographic region of data centers. We are looking for someone who can effectively prioritize and adapt to shifting priorities in a dynamic operational environment. The ideal candidate is an IT professional with strong leadership skills and experience in Server Hardware, Project Management, Quality Management, Data Analytics, Networks, OS repair, Linux and Automation, ideally in a data center environment, and has an extensive understanding of managing servers in a large-scale distributed environment.

Responsibilities

• Build and lead a geographically dispersed, high-performing data center operations team, developing both the technical capabilities and leadership qualities of engineers
• Establish and manage a Data Center Operations Team accountable for the maintenance and operation of server hardware and supporting infrastructure at scale
• Become a technical expert in Meta's infrastructure, including platforms, tools, systems, architecture, workflows, and performance
• Provide strategic direction, guidance, and support for site and fleet-level operations
• Analyze and drive continuous improvement in the engineering and operational performance of our data centers
• Employ data analytics to identify inefficiencies, opportunities, exceptions, and correlations in a complex, highly interconnected, technical environment. Enable rapid and effective problem solving, along with proactive identification and mitigation of risks and issues
• Collaborate with cross-functional partner teams to ensure fleet health and maintain targeted capacity levels, resulting in optimized operations, minimized downtime, and seamless scalability
• Evolve and optimize processes in a globally consistent way to allow Meta to scale and grow effectively
• Support and mentor engineers in their day-to-day work, as well as in finding opportunities to develop and grow based on their areas of strength and interest
• Create and drive a culture of ownership, innovation, collaboration, accountability, continuous improvement, and safety
• Conduct performance management for a technical engineering team, providing clear expectations and goals
• Assume the role of incident manager during large-scale, site-wide, and region-wide production-impacting events, as the primary point of contact for your site. This requires working cross-functionally to scope problems, mitigate risks, effect fixes, and communicate the nature, status, and resolution plan for incidents
• Support and contribute thought leadership to the development and implementation of business practices, processes and automated tooling
• Develop deep knowledge and ownership of a hyper-scale computing fleet, using data analysis to identify trends, systemic issues, and opportunities, and reporting findings globally and sharing them with peers as appropriate

Minimum Qualifications
• BS, BA, or BEng in a technical field or commensurate experience
• Ability to travel up to 30% is required
• Experience participating in or leading technical projects related to areas such as process improvement, technology, and/or automation, including bringing in additional expertise as needed
• 5+ years of experience managing teams of technical resources, including people and performance management responsibilities
• Understanding of data center infrastructure and/or operations, including power, cooling, and/or network systems; structured cabling; and management of projects, incidents, and vendors
• Experience using data and metrics to drive decision-making
• Ability to influence effectively, working on cross-functional teams to advance the needs of the company and adapting teams to meet these needs
• 10+ years of engineering or operations experience, preferably in a mature engineering or operations environment, working with cross-functional teams
• Ability to communicate effectively, in a clear and concise manner, appropriately tailoring messages to the audience

Preferred Qualifications
• Demonstrated ability to integrate AI tools to optimize/redesign workflows and drive measurable impact (e.g., efficiency gains, quality improvements)
• Experience adhering to and implementing responsible, ethical AI practices (e.g., risk assessment, bias mitigation, quality and accuracy reviews)
• Demonstrated ongoing AI skill development (e.g., prompt/context engineering, agent orchestration) and staying current with emerging AI technologies
• Six Sigma knowledge/certification
• Experience leading technical resources using Linux or an equivalent OS to support hardware systems in a complex IT environment
• Experience with large-scale AI implementations and the use of AI to drive automation
• Experience in large-scale data center hardware deployments and building scalable infrastructure
• Knowledge of the interdependencies of data center functions and technologies, including electrical, cooling, structured cabling, security, and network
