Vertafore, Inc.

Principal Site Reliability Engineer

Vertafore, Inc. $160K - $180K *
Information Technology
11 - 15 years of experience
Job Overview by Ladders

Qualifications

  • 12-15+ years of experience in Cloud Operations, SRE, or reliability engineering.
  • Proven ability to operate at a Principal/Architect level with global impact.
  • Expert-level software engineering skills in languages and frameworks such as C#, .NET, Java, Python, or React.
  • Deep expertise in scaling SRE principles like SLIs, SLOs, and error budgets.
  • Mastery of AWS, Kubernetes, CI/CD pipelines, and Infrastructure-as-Code techniques.
  • Bachelor's or Master's in Computer Science or a related technical field.
  • Willingness to participate in an executive on-call rotation as needed.

Responsibilities

  • Define enterprise-wide reliability standards and service ownership.
  • Lead architectural initiatives to enhance system design for fault tolerance and compliance.
  • Establish and implement the observability strategy focusing on latency, traffic, errors, and saturation.
  • Manage SLIs and SLOs governance across multiple product lines.
  • Champion error budgets to balance feature velocity and platform stability.
  • Lead incident responses for high-severity events and promote a blameless culture.
  • Engage in cultural transformation towards engineering-centric operations.

Benefits

  • Flexible First work environment supporting remote and collaborative work.
  • Comprehensive medical, vision, and dental plans with multiple options.
  • Life insurance, AD&D coverage, and short/long-term disability plans.
  • 401(k) plan with employer match and pension plan options.
  • Parental leave and employee assistance programs available.
  • Tuition reimbursement and education assistance initiatives.
  • Additional programs including employee referral and internal recognition.
Full Job Description
$160,000 - $180,000 / year + Bonus

We are seeking a Principal Site Reliability Engineer to define the strategic vision and own the enterprise-wide reliability, scalability, and performance of our critical production services. As a foundational pillar of our engineering organization, this role drives architectural standards for the full service lifecycle, from initial design and deployment readiness to proactive production operations. At Vertafore, we view reliability as a core engineering responsibility. You will operate autonomously across AWS, hybrid data centers, and customer-hosted environments, setting the technical direction for how we treat operations as a software engineering challenge. This role is pivotal in transitioning cross-departmental teams toward a highly proactive, engineering-first culture.

Roles and Responsibilities:

Strategic Leadership & Reliability Architecture

  • Enterprise-Wide Ownership: Define the standards for end-to-end service ownership, holding the organization accountable for availability, performance, and overall operational health.


  • Architectural Influence: Lead cross-departmental initiatives to influence system design at the architectural level, driving fault tolerance, strict compliance, and operational sustainability across public and private clouds.


  • Advanced Observability Vision: Dictate the enterprise strategy for observability frameworks, ensuring the Four Golden Signals (Latency, Traffic, Errors, and Saturation) provide actionable, predictive insights across all platforms.


Data-Driven Reliability Governance

  • SLO & Error Budget Authority: Establish the governance models for defining and managing SLIs and SLOs across multiple product lines.


  • Delivery Alignment: Champion Error Budgets as the ultimate technical arbiter at the executive level, balancing feature velocity with the absolute requirement for platform stability.


Incident Management & Cultural Transformation

  • Enterprise Incident Command: Lead incident response for the most critical, high-severity events.


  • Blameless Culture Champion: Foster a "Win Together" environment by championing a Blameless Postmortem culture globally, ensuring root cause analyses focus strictly on systemic and process improvements rather than individual error.


Qualifications & Requirements

  • Experience: 12 to 15+ years of hands-on Cloud Operations, SRE, or reliability-focused engineering experience, with a proven track record of end-to-end enterprise service ownership.


  • Proven Scope: Demonstrated ability to operate at a Principal/Architect scope, driving large-scale reliability outcomes and operational excellence across global organizations.


  • Software Engineering: Expert-level software engineering skills in C#, .NET, Java, Python, or React.


  • Principles: Deep expertise in scaling core SRE principles (SLIs, SLOs, error budgets) across complex, distributed systems.


  • Technical Stack: Mastery of AWS, Kubernetes, CI/CD pipelines, Infrastructure-as-Code, and extensive knowledge of Linux and Windows environments and relational databases.


  • Education: Bachelor's or Master's degree in Computer Science or a related technical field.


  • Commitment: Participation in an executive on-call rotation with flexible hours as required.


Skills & Requirements
Knowledge, Skills, and Abilities:
• A fast learner.
• A problem solver.
• Ability to document procedures.
• Able to meet deadlines.
• Good communication skills. Able to communicate effectively with both technical and non-technical audiences.
• Able to comply with processes and procedures.
• Able to maintain professional composure in any situation.
• Flexible about working extended hours occasionally or as required.
• Exposure to the insurance industry is desired but not mandatory.
• Driven to improve, personally and professionally.
• Operates best in a fast-paced, flexible work environment and works well in a team.

Additional Requirements and Details:
• High-speed internet to accommodate working from home.
• Occasional lifting and/or moving up to 10 pounds.
• Frequent repetitive hand and arm movements required to operate a computer.
• Specific vision abilities required by this job include close vision (working on a computer, etc.).
• Frequent sitting and/or standing.

Vertafore is a Flexible First working environment, which allows team members to work from home as often as they'd like while using our offices as a place for collaboration, community, and team building. At times you may be asked to come into an office and/or travel for meetings with a specific business purpose; how often varies by job responsibilities.

Why Vertafore is the place for you: *Canada Only
  • The opportunity to work in a space where modern technology meets a stable and vital industry
  • Medical, vision & dental plans
  • Life, AD&D
  • Short Term and Long Term Disability
  • Pension Plan & Employer Match
  • Maternity, Paternity and Parental Leave
  • Employee and Family Assistance Program (EFAP)
  • Education Assistance
  • Additional programs - Employee Referral and Internal Recognition


Why Vertafore is the place for you: *US Only
  • The opportunity to work in a space where modern technology meets a stable and vital industry
  • We have a Flexible First work environment! Our North America team members use our offices for collaboration, community and team-building, with members asked to sometimes come into an office and/or travel depending on job responsibilities. Other times, our teams work from home or a similar environment.
  • Medical, vision & dental plans
    • PPO & high-deductible options
  • Health Savings Account & Flexible Spending Accounts Options:
    • Health Care FSA
    • Dental & Vision FSA
    • Dependent Care FSA
    • Commuter FSA
  • Life, AD&D (Basic & Supplemental), and Disability
  • 401(k) Retirement Savings Plan & Employer Match
  • Supplemental Plans - Pet insurance, Hospital Indemnity, and Accident Insurance
  • Parental Leave & Adoption Assistance
  • Employee Assistance Program (EAP)
  • Education & Legal Assistance
  • Additional programs - Tuition Reimbursement, Employee Referral, Internal Recognition, and Wellness
  • Commuter Benefits (Denver)

About Vertafore, Inc.

Vertafore, Inc. is a software company that provides insurance technology solutions. The company was founded in 1969 and is headquartered in Bothell, Washington. Vertafore operates in the technology sector and has a workforce of over 1,700 employees. Its products and services are designed to help insurance agencies and carriers manage their operations more efficiently. Vertafore's solutions include agency management, rating and connectivity, content management, and compliance and licensing. The company has a strong presence in the United States and serves over 20,000 customers.
Size: 1,700 employees
Founded: 1969
