Job Description
As a Senior Principal Reliability Engineer, you will lead the evolution of how reliability is engineered, measured, and improved across IT systems. You will play a critical role in enabling engineering teams to build systems that are reliable by design, while shaping enterprise practices that scale across the organization. This is a highly visible and impactful role with the potential to significantly improve the reliability, resilience, and operational effectiveness of the IT products that power our company’s mission.
Responsibilities
Build relationships across the broader IT organization to increase adoption and maturity of SRE, Observability, and Resilience practices
Define and evolve the strategic vision for enterprise reliability engineering and ensure alignment across product, platform, and ITSM teams
Establish and enforce standards for Service Level Objectives, observability frameworks, and resilience engineering practices
Collaborate with engineering teams to ensure reliability is embedded into architecture, design, and delivery processes
Drive adoption of Service Level Objectives using Nobl9 as the system of record for reliability governance
Lead evaluation and introduction of new technologies that improve reliability outcomes while integrating with existing platforms
Apply AI capabilities to enhance reliability practices, including incident triage, diagnostics, and automation, in a governed and controlled manner
Collaborate on efforts to standardize observability across logs, metrics, traces, and events to improve system visibility and decision-making
Consult on and promote resilience patterns including fault isolation, failover strategies, and recovery mechanisms
Guide improvements to incident lifecycle effectiveness, including detection, response, root cause analysis, and continuous improvement
Lead and mentor a community of reliability practitioners to grow organizational capability and maturity
Represent reliability engineering practice in architecture reviews, governance forums, and key IT initiatives
Drive continuous improvement of reliability practices through research, innovation, and feedback from engineering teams
Qualifications Required
Bachelor's degree in IT, Engineering, Computer Science, or related field
Minimum 7 years of experience in site reliability engineering
Expertise in capacity management, system integration, software development, release management, network design, configuration management (CM), software development life cycle (SDLC), system administration, change controls, and solution architecture
Proficiency in designing, managing, developing, and maintaining technological products, particularly in the animal health domain
Strong expertise in hardware, mechanics, artificial intelligence, and software development
Experience in program management, including product definition, development, testing, maintenance, and tier 4 support
Ability to conduct technological and product research and drive innovation
Skilled in developing and managing CI/CD pipelines for product development cycles
Knowledge of performance optimization and server software management
Experience with application deployment to both cloud and on-premises production environments
Understanding of product security, company development policies, and open source usage
Strong leadership skills including strategic planning, entrepreneurship, innovation, and business savviness
Proven track record in coaching and development, talent growth, and execution excellence
Strong commitment to inclusion, with the ability to influence and motivate others
Excellent emotional intelligence, decision-making skills, and a strong sense of ownership and accountability
Strong networking and partnership-building skills
Required Skills:
Data Engineering, Data Visualization, Design Applications, Software Configurations, Software Development, Software Development Life Cycle (SDLC), Solution Architecture, System Designs, System Integration, Testing
Preferred Skills:
The salary range for this role is $142,400.00 - $224,100.00.
This is the lowest to highest salary we in good faith believe we would pay for this role at the time of this posting. An employee’s position within the salary range will be based on several factors including, but not limited to relevant education, qualifications, certifications, experience, skills, geographic location, government requirements, and business or organizational needs.
The successful candidate will be eligible for annual bonus and long-term incentive, if applicable.
We offer a comprehensive package of benefits. Available benefits include medical, dental, vision healthcare and other insurance benefits (for employee and family), retirement benefits, including 401(k), paid holidays, vacation, and compassionate and sick days. More information about benefits is available at .
You can apply for this role through (or via the Workday Jobs Hub if you are a current employee). The application deadline for this position is stated on this posting.
Employee Status:
Regular
Relocation:
No relocation
VISA Sponsorship:
Yes
Travel Requirements:
10%
Flexible Work Arrangements:
Hybrid
Shift:
1st - Day
Valid Driving License:
No
Hazardous Material(s):
N/A
Job Posting End Date:
06/04/2026
*A job posting is effective until 11:59:59 PM on the day BEFORE the listed job posting end date. Please ensure you apply to a job posting no later than the day BEFORE the job posting end date.