Senior DevOps & Site Reliability Engineer - Americas

Appspace

• $100K — $130K *

Toronto, ON M3C 0E3 (In-Person)

Information Technology

5 - 7 years of experience

2 weeks ago


Job Overview by Ladders

Qualifications

6+ years in DevOps or SRE roles with experience in cloud environments.
Expertise in Microsoft Azure and/or Google Cloud Platform.
Proficient in PowerShell and Python with hands-on Bicep or Terraform experience.
Strong knowledge of Windows/Linux Server OS and Kubernetes.
Familiar with middleware and PaaS technologies like CosmosDB and MongoDB.
Excellent troubleshooting skills for complex workflows.

Responsibilities

Identify and automate manual 'toil' tasks in monitoring and administration.
Lead the integration of AI tools for enhanced operational efficiency.
Design and maintain self-service CI/CD pipelines using Infrastructure as Code.
Evaluate platform components for cost-effective automation or migration.
Manage a comprehensive observability stack across cloud platforms.
Collaborate with cross-functional teams to ensure feature reliability and security.
Analyze complex performance defects through root cause analysis.

Benefits

Generous PTO and paid company holidays.
Flexible work schedules with remote opportunities.
5 additional training days off.
Gym membership reimbursement and mental health resources.
Appspace Quiet Fridays for minimal internal meetings.
Fully paid maternity and parental leave program.

Full Job Description

Your Role as a Senior DevOps & Site Reliability Engineer:

Our Cloud Operations team is seeking a Senior DevOps & Site Reliability Engineer who will play a critical role in ensuring the reliability, performance, and scalability of our diverse SaaS applications. You are a problem-solver and an automator at heart. This role is a specialized hybrid, bridging the gap between legacy VM-based architectures and modern cloud-native standards through aggressive automation and development-focused operations.

Unlike a traditional SRE, this role is deeply integrated with the software development lifecycle, focusing on the consolidation and optimization of platform operations. You will be responsible for building the CI/CD frameworks, self-service tools, and AI-driven automation that allow our engineering teams to move faster while maintaining rock-solid stability. Your mission is to maximize the ROI of our existing infrastructure by "automating away" manual toil. On-call coverage is required on a weekly rotation.

A Day in the Life of a Senior DevOps & Site Reliability Engineer:

In this role, you will be the technical anchor for a global platform footprint that includes a mix of Azure IaaS/PaaS, Google Cloud Platform (GCP), Kubernetes, and various data platforms. Your day will consist of:

Intelligent Automation & DevOps: Identifying manual "toil" and replacing it with automated workflows for monitoring, change management, and routine administration of large-scale VM environments to ensure a positive ROI.
AI-Enhanced Operations: Leading the integration of AI tools for automated code reviews, development frameworks, and predictive log analysis to drive departmental velocity and efficiency.
Scalable CI/CD & Provisioning: Designing and maintaining "self-service" deployment frameworks and CI/CD pipelines (GitHub Actions, Bamboo) using Infrastructure as Code (Bicep, Terraform).
Strategic ROI Projects: Evaluating platform components to determine the most cost-effective path: automating the current state or migrating features to modern, shared architectures.
Unified Observability: Designing and maintaining a comprehensive observability stack across Azure and GCP (metrics, logs, traces) to identify performance bottlenecks and proactively address system defects.
Cross-Functional Collaboration: Partnering with engineering, security, and operations teams to ensure new features are "born" with reliability, security, and automated delivery in mind; ensuring adherence to security best practices and compliance standards (SOC2, HIPAA, ISO 27001) while balancing operational excellence with cost efficiency.
Root Cause Analysis & Forensics: Investigating complex performance defects by following log trails across web, application, and database tiers (SQL Server, MongoDB, MySQL).
Governance & Security: Ensuring all platforms meet security standards (SOC2, HIPAA, ISO 27001) through automated policy enforcement across Azure and GCP.

What You'll Need:

Must have a passion for life-long learning.
6+ years in DevOps or SRE roles, with a proven track record of bridging development and operations in complex cloud environments.
Extensive experience with Microsoft Azure (IaaS, PaaS, App Services, Networking) and/or Google Cloud Platform (GCP).
Expert-level PowerShell and Python skills. Hands-on experience with Bicep or Terraform is required.
Strong background in Windows/Linux Server OS, Kubernetes (AKS/GKE), Helm, and container orchestration.
Familiarity with various middleware and PaaS technologies (e.g., Event Hub, Service Bus, CosmosDB, RabbitMQ, MongoDB).
Expert-level troubleshooting and the ability to reason through complex process workflows to identify faults in large-scale platform environments.

Nice to Haves:

Experience with Atlassian suite (Jira, Confluence, Bitbucket).
Experience with AI-driven log analysis or automated incident remediation.
Knowledge of database tuning (SQL Server, MySQL, MongoDB).
Familiarity with compliance standards (SOC2, HIPAA, GDPR).

The Perks of Working for Appspace:

For all our Canada-based team members, we offer a variety of benefits, including competitive salaries; medical, dental, and vision coverage; ongoing training opportunities; gym membership reimbursement; mental health resources; and a fully paid maternity and parental leave program.

Generous PTO
5 additional days off for training
Flexible work schedules
Remote work opportunities
Appspace Quiet Fridays (No non-essential internal meetings scheduled)
Paid company holidays

* Ladders Estimates

