Job Description
We are looking for a Technical Operations Shift Supervisor. You will monitor the availability, connectivity and performance of high-performance computer systems, massive storage systems, networks and facilities. You will provide leadership, operational direction, and mentoring to computing operations team members while serving as an escalation resource for complex system and facility issues, including Closed Area (CA) rules and guidelines along with computer security guidelines. You will be in the Livermore Computing (LC) Division in the Computing Directorate.
This position requires full-time on-site presence due to the nature of the work.
You will
- Provide advanced technical support and data center facilities monitoring for the high-performance systems in LC, including large clusters, disk storage systems, Ethernet and InfiniBand networks, and archival storage systems, and perform emergency event management responsibilities.
- Apply UNIX system knowledge and a variety of diagnostic tools to monitor systems, troubleshoot software, hardware, and network problems, and document issues.
- Coordinate complex shift coverage schedules, including vacations, sick leave, weekend rotations, holidays, and short-staffing situations to maintain operational continuity.
- Provide operational feedback, coaching, and mentoring to team members, and support employee development and training plans.
- Coordinate decommission projects and VTR responsibilities in support of ongoing technical operations.
- Review incident tickets for accuracy, closure quality, operator participation, and adherence to established repair and response practices.
- Summarize the successes, needs, and challenges of the team and represent them to other shift leads and management.
- Encourage, promote, and suggest new ideas and procedures that would benefit all shifts, not only your own, as well as opportunities to build team camaraderie within the current shift and across the other shifts.
- Update Confluence to document new discoveries and processes, remove outdated information, clarify vague information, and add new technical information.
- Perform other duties as assigned.
Qualifications
- Ability to secure and maintain a U.S. DOE Q-level security clearance, which requires U.S. citizenship.
- Bachelor's degree in a computer- or engineering-related field, or the equivalent combination of technical training and experience.
- Experience troubleshooting problems in a heterogeneous platform environment.
- Advanced knowledge and training in system administration.
- Experience reviewing daily incident tickets, managing decommission projects, or deploying staff for immediate project needs.
- Experience coordinating daily operational coverage, staffing levels, and shift schedules.
- Experience participating in interview, onboarding, and training processes for new team members.
- Experience mentoring and coaching operators, and developing plans and processes to address deficiencies and improve effectiveness.
- Experience recognizing training opportunities and acting as an escalation resource.
- Experience using leadership skills to set team expectations, oversee the creation and sharing of documentation, address challenges and issues, and provide coaching and operational feedback.
- Ability to work all shifts, including Owl shift (12am - 8am), weekends, and holidays.
Qualifications We Desire
- Experience with data center utility infrastructure, including water distribution and power distribution systems.
- Experience monitoring and responding to alarms and status information from building management systems.
- Knowledge of cooling distribution units and their role in supporting high-performance computing environments.
- Knowledge of mechanical systems that support mission-critical or data center operations.
- Experience working in environments where facility infrastructure and computing operations are closely integrated.
Pay Range
$125,310 - $153,444 Annually
This is the lowest to highest salary we in good faith believe we would pay for this role at the time of this posting; pay will not be below any applicable local minimum wage. An employee's position within the salary range will be based on several factors including, but not limited to, specific competencies, relevant education, qualifications, certifications, experience, skills, seniority, geographic location, performance, and business or organizational needs.
Position Information
This is a Career Indefinite position, open to Lab employees and external candidates.