Royal Bank of Canada

Senior Site Reliability Engineer

$100K to $130K *
Finance & Insurance
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 4+ years of SRE or Systems Engineering experience
  • Bachelor's degree in Computer Science, Engineering, or related field
  • Expertise in infrastructure-as-code and configuration management with Ansible
  • Advanced scripting capabilities in languages like Bash, Python, and PowerShell
  • In-depth knowledge of observability tools such as Elasticsearch and Dynatrace
  • Familiarity with AI/ML concepts in operations and observability
  • Experience creating and maintaining reliability metrics like SLIs and SLOs

Responsibilities

  • Contribute to the SRE product base, enhancing monitoring and alerting
  • Implement AI-driven monitoring and observability across applications
  • Design ML-based anomaly detection for predictive alerting
  • Architect solutions for AI-driven self-healing incidents
  • Collaborate to standardize telemetry data and enhance observability
  • Develop automation processes with Ansible and GitHub Actions
  • Participate in incident and problem management, focusing on root cause analysis

Benefits

  • Comprehensive Total Rewards Program including flexible benefits
  • Support for development through coaching and management opportunities
  • Opportunity to make a lasting impact in the community
  • Work in a dynamic, collaborative, and high-performing team
  • Access to a world-class training program in financial services
  • Flexible work/life balance options
  • Challenging work opportunities
Full Job Description

What is the opportunity?

RBC Insurance Technology is hiring a Senior Site Reliability Engineer for its Insurance Technology Platform Support team. This specialized team is dedicated to ensuring the optimal performance, availability, and resilience of IT applications used in the insurance line of business. With a blend of technical expertise and industry-specific knowledge, it plays a critical role in the seamless operation of digital services for both internal and external stakeholders.

As a Senior Site Reliability Engineer, you will bring the engineering mindset of bold ambition, curiosity and outcome focus to ensuring the performance and reliability of our systems. This role calls for a dynamic individual who excels in a collaborative environment, working with cross-functional teams to implement best practices for observability, monitoring, logging, alerting, and automation. As we evolve toward AI-driven autonomous operations, you will play a key role in transitioning from traditional reactive incident response to intelligent, self-healing systems. This role will be responsible for the development, implementation, and support of Site Reliability Engineering (SRE) solutions for applications supported by RBC Insurance Technology. You'll leverage your proficiency in Elasticsearch, Ansible, GitHub Actions, Moogsoft, PagerDuty, Dynatrace, and emerging AIOps platforms to build and maintain robust automation, intelligent observability, and AI-enhanced SRE tooling.

What will you do?
  • Contribute to the SRE product base (intelligent monitoring, alerting, machine learning anomaly detection, Agentic AI self-healing, reliability testing)
  • Implement and enhance AI-driven monitoring and intelligent observability capabilities across supported applications
  • Design and implement ML-based anomaly detection pilots, transitioning from rule-based to predictive alerting
  • Architect and develop Agentic AI self-healing solutions that autonomously remediate common incidents
  • Design human-AI workflows that balance automation efficiency with appropriate human oversight and governance
  • Standardize application telemetry data to increase coverage of signal types, building the foundation for advanced AI/ML capabilities
  • Contribute to centralization of observability and monitoring backends for advanced telemetry correlation
  • Collaborate with cross-functional teams to implement best practices for monitoring, logging, and incident response, driving a proactive stance on system health
  • Implement and manage automation processes with Ansible and GitHub Actions to streamline operational tasks
  • Develop, maintain, and refine custom tooling and automation scripts in languages such as Bash, Python, and PowerShell to enhance operational efficiency and support the infrastructure's scalability and reliability
  • Work closely with development teams to understand code changes and their impact on the production environment, ensuring that new releases meet our reliability standards
  • Actively contribute to the definition and tracking of SLIs, SLOs, and other critical metrics, refining our alerting and monitoring strategies accordingly
  • Evolve runbooks into automated remediation workflows and Agentic AI automation, reducing manual intervention
  • Support deployments by advocating for reliability and performance improvements based on industry trends and company objectives
  • Participate in incident management and problem management for applications in scope, and help complete RCA action items
  • Validate and govern AI outputs to ensure compliance with financial services regulations and maintain human accountability for AI-driven decisions
  • Drive transformation by continuously looking for ways to automate existing processes and adopt intelligent operations
  • Debug production issues across services and levels of the stack and provide primary operational support
  • Provide production support, including off-hours support as part of an on-call rotation


What do you need to succeed?

Must-have
  • 4+ years of SRE or Systems Engineering experience with strong technical expertise
  • Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent experience
  • Expertise in infrastructure-as-code and configuration management, particularly Ansible
  • Advanced scripting capabilities in Bash, Python, PowerShell, or other similar languages
  • In-depth knowledge of tools such as Elasticsearch, Ansible, GitHub, OpenShift, Kubernetes, Dynatrace, and Kafka, and of their role in system reliability
  • Knowledge of creating, maintaining, and alerting on SLIs, SLOs, and other reliability metrics
  • Understanding of AI/ML concepts and their application to observability and operations (AIOps)
  • Experience with or strong interest in intelligent monitoring, anomaly detection, and automation technologies
  • Ability to design and implement human-AI workflows with appropriate governance controls


Nice-to-have
  • Insurance or financial services industry experience
  • Hands-on experience with AIOps platforms and intelligent observability tools
  • Experience with ML anomaly detection, predictive analytics, or self-healing automation
  • Knowledge of prompt engineering and AI model tuning for operational use cases
  • Experience designing Agentic AI or autonomous remediation systems
  • Familiarity with AI governance frameworks and validating AI outputs in regulated environments
  • In-depth hands-on experience with a variety of SRE tools (Azure Automation, Catchpoint, Prometheus, Splunk, Grafana)
  • Familiarity with containerization technologies such as Docker
  • Hands-on experience with DevOps CI/CD tools such as Jenkins, Artifactory, and Vault
  • Experience with telemetry standardization (OpenTelemetry) and observability data correlation


What's in it for you?

We thrive on the challenge to be our best, progressive thinking to keep growing, and working together to deliver trusted advice to help our clients thrive and communities prosper. We care about each other, reaching our potential, making a difference to our communities, and achieving success that is mutual.
  • A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stock where applicable
  • Leaders who support your development through coaching and managing opportunities
  • Ability to make a difference and lasting impact
  • Work in a dynamic, collaborative, progressive, and high-performing team
  • A world-class training program in financial services
  • Flexible work/life balance options
  • Opportunities to do challenging work



Job Skills
Agile Methodology, Application Infrastructure, Group Problem Solving, IT Automation, IT Monitoring, Operations Support, Production Support, Software Development Life Cycle (SDLC), Software Engineering, Software Product Technical Knowledge, System Applications, Systems Software

Additional Job Details

Address: Meadowvale Business Park, 6880 Financial Dr, Mississauga
City: Mississauga
Country: Canada
Work hours/week: 37.5
Employment Type: Full time
Platform: Technology and Operations
Job Type: Regular
Pay Type: Salaried
Posted Date: 2026-06-18
Application Deadline: 2026-07-17
Note: Applications will be accepted until 11:59 PM on the day prior to the application deadline date above

RBC is presently inviting candidates to apply for this existing vacancy. Applying to this posting allows you to express your interest in this current career opportunity at RBC. Qualified applicants may be contacted to review their resume in more detail.

About Royal Bank of Canada

Royal Bank of Canada Careers

Join the dynamic team at Royal Bank of Canada (RBC), a global leader in financial services and a company committed to excellence and innovation. At RBC, we offer a wide range of job opportunities that empower professionals to shape their career paths with leadership, diversity training, and continuous growth.

Work You’ll Do

At Royal Bank of Canada, we are not just hiring; we are building a culture of innovation and leadership. Our team members are at the forefront of the financial industry, driving transformation and delivering targeted solutions that meet the evolving needs of our clients and communities.

Explore Job Opportunities and Employment at RBC

Whether you are starting your career or looking to take it to the next level, RBC offers positions that challenge your skills and fuel your ambition. From entry-level positions to leadership roles, our job opportunities span across various functions and regions. Join us and be part of a team that values professional growth and diversity.

Internship and Professional Development

Kickstart your career with an internship at Royal Bank of Canada. Our internships provide invaluable hands-on experience, networking opportunities, and insights into the financial services industry. Interns at RBC gain the skills necessary to excel and are often considered for full-time positions within the company.

Benefits and Culture

At RBC, we prioritize the well-being and satisfaction of our employees. Our benefits package is designed to support our team members at every stage of their life and career. RBC’s culture is built on a foundation of respect, integrity, and responsibility, fostering an environment where everyone can thrive.

Career Growth and Innovation

We believe in nurturing the potential of our employees through continuous learning and career development programs. At RBC, you will find endless opportunities to grow professionally through on-the-job experiences, formal training programs, and leadership development initiatives. Our commitment to innovation means we are constantly seeking out new ideas and perspectives, making RBC a perfect place for those who aim to lead and innovate.

Diversity and Inclusion

Diversity is our strength. At Royal Bank of Canada, we are committed to building an inclusive workplace where every employee feels valued and respected. Our diversity training programs are designed to educate and inspire, creating a more inclusive and equitable workplace.

Join Our Team

Search open positions that match your skills and interests. We look for passionate, curious, creative, and solution-driven team players. Start your journey with RBC today and be part of a world-class team known for its commitment to client service, community involvement, and innovation.

Size: 86,007 employees
Market Cap: $130.3 billion
Industry
5 Year Trend: +8.7%
NASDAQ
