Intact Financial Corporation

Resiliency Architect

Intact Financial Corporation
$149K - $182K *
Technical Services
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • 10+ years in SRE/Platform/Infrastructure/Systems Architecture with large-scale experience across cloud environments.
  • Proficient in Kubernetes ecosystems and related autoscaling strategies.
  • Experience with observability tools and stacks for performance monitoring.
  • Expertise in data resilience and consensus algorithms.
  • Knowledge of Infrastructure as Code (IaC) tools and CI/CD patterns.
  • Bilingualism required for Quebec-based candidates due to team interactions across Canada.
  • No Canadian work experience necessary, but must be eligible to work in Canada.

Responsibilities

  • Establish a comprehensive resiliency architecture and enforce non-functional requirements across cloud platforms.
  • Conduct design reviews and production readiness assessments to govern quality.
  • Implement blue/green deployments and chaos engineering methodologies at enterprise scale.
  • Integrate AI technologies into reliability practices for enhanced system performance.
  • Oversee disaster recovery and cyber-resilience strategies aligned with regulatory needs.
  • Drive observability initiatives and SRE practices across the organization.
  • Lead the evolution of corporate policies and automations in CI/CD and infrastructure management.

Benefits

  • Annual bonus plan tied to personal and company performance, with potential payouts of up to double the target.
  • Employee Share Purchase Plan (ESPP) with company matching for strategic financial investment.
  • Robust pension offerings, including a defined benefit plan for guaranteed lifelong income.
  • Support for work-life balance with a hybrid working model.
Full Job Description
Salary range (but not limited to): 149,600 - 182,800
Annual bonus target, based on the base salary, with a potential payout of up to double the target (subject to personal and company performance): 15%

As part of our commitment to Win As A Team, we share our success with employees through our annual bonus plan and Employee Share Purchase Plan (ESPP), with Intact matching 50% of your net shares. Our pension offerings provide flexibility and long-term security for our employees beyond their careers. We are one of the few companies offering the opportunity to receive guaranteed income for life via our defined benefit pension plan.

Salary for the candidate will be determined by considering a number of factors, including experience, skills, qualifications, anticipated contribution to the role, internal equity, etc. The salary range presented above is based on a 35-hour workweek and is expected to cover the majority of candidate profiles. However, we encourage candidates who may fall outside of this range to apply as well.

About the role

We are seeking a Resiliency Architect to define and drive our end-to-end resiliency architecture and production reliability posture across Azure, AWS, Google Cloud, and on-prem environments. This person will be responsible for design standards, production readiness, and enforcement mechanisms at enterprise scale. The ideal candidate combines deep SRE expertise with advanced systems architecture and a strong vision for explicit blue/green and chaos engineering practices, alongside AI/GenAI, to make systems reliable, leverage AI as a force multiplier for resiliency, transform team workflows, and deliver resilient, intelligent user solutions.

What you'll do here:

Core objectives:
  • Establish the enterprise resiliency architecture, patterns, and production guardrails for all critical platforms and services.
  • Govern design quality through rigorous architecture reviews and production readiness assessments.
  • Make blue/green deployments and chaos engineering first-class, codified practices across the estate: design, tooling, automation, and continuous validation.
  • Integrate AI/GenAI into reliability engineering: robust AI system architectures, AI-assisted observability, causal detection, and autonomous remediation.
  • Lead the evolution of disaster recovery, ransomware protection, and continuity strategies grounded in hard SLAs/SLOs and measurable business outcomes.
Key responsibilities:
  • Own the resiliency reference architecture for multi-cloud/hybrid (multi-region/zone, active-active/passive, blast-radius reduction) and define/enforce NFRs (availability, latency, durability, RTO/RPO).
  • Establish governance via design reviews, production gates, policy-as-code, scorecards, and automated controls integrated with CI/CD, IaC, and runtime platforms.
  • Standardize blue/green deployment architecture and engineer safe traffic shifting, health gates, progressive cutovers, rollback, and zero-downtime data migrations.
  • Lead an enterprise chaos engineering program (experiments, failure injection, game days) and feed outcomes back into architecture guardrails and SLO improvements.
  • Define production readiness standards (capacity/saturation, graceful degradation, retries/backoff, circuit breakers, rate limiting) and codify runbooks, dependency maps, and failover topologies validated via DR drills and rehearsals.
  • Drive observability and SRE practices: OpenTelemetry adoption, distributed tracing, SLIs/SLOs/SLAs, error budgets, and executive reliability dashboards.
  • Architect DR and cyber-resilience (immutable/air-gapped backups, PITR, ransomware-resistant segmentation, recovery validation) aligned to regulatory and audit needs.
  • Guide platform and data resiliency across Kubernetes/service mesh, replication/consensus, geo-distribution, and event streaming (DLQs, backpressure, reprocessing).
  • Enable reliable AI/GenAI systems and AI-driven operations (monitoring/guardrails, anomaly detection, predictive modeling, human-in-the-loop remediation, ops copilots).
  • Serve as principal resilience authority: mentor teams, lead councils/forums, and communicate tradeoffs clearly to executives and engineers.
What you bring to the table:
  • 10+ years in SRE/Platform/Infrastructure/Systems Architecture with proven large-scale, production-critical experience across Azure, AWS, GCP, and on-prem.
  • Multi-region traffic management, global load balancing, DNS/BGP, TLS/mTLS, CDN/edge patterns.
  • Kubernetes ecosystems (AKS/EKS/GKE), service meshes (Istio/Linkerd), autoscaling strategies, readiness/liveness, topology constraints.
  • Observability stacks: OpenTelemetry, Prometheus/Grafana, Jaeger/Tempo, ELK/OpenSearch, commercial APM; correlation and topology modeling.
  • Data resilience: consensus/replication (Raft/Paxos), partitioning, PITR, snapshots, CDC; caches (Redis), databases (Aurora, Cosmos DB, Spanner).
  • IaC and automation: Terraform/Pulumi, GitOps (Argo CD/Flux), policy-as-code (OPA), CI/CD patterns (blue/green, canary, progressive delivery).
  • Chaos engineering, DR orchestration, and automated failover at enterprise scale.
  • For candidates located in Quebec, bilingualism is required because the role involves regular interaction with English-speaking colleagues across the country.
  • No Canadian work experience is required; however, candidates must be eligible to work in Canada.
AI/GenAI competencies:
  • Architecting reliable AI systems: model serving (Ray/SageMaker/Vertex), vector stores (Pinecone/FAISS/pgvector), retrieval pipelines, guardrails and safety.
  • MLOps: model monitoring (drift, performance, hallucination detection), feature pipelines, lineage/observability, prompt/content governance.
  • Applying AI to operations: causal detection, predictive resiliency, autonomous remediation frameworks.
  • Strong software engineering skills (Go/Python/TypeScript) and systems thinking; excellent communication (written, visual, verbal) and executive presence.
This is a new role within our growing team.

About Intact Financial Corporation

Intact Financial Corporation is a Canadian insurance company that provides property and casualty insurance to individuals and businesses. The company operates in Canada and the United States and offers a range of insurance products, including auto, home, and commercial insurance. Intact Financial Corporation was founded in 1809 and is headquartered in Toronto, Canada.
Size
16,000 employees