DevOps - Site Reliability Engineer


Toronto, ON

5 - 7 years

Posted 242 days ago

This job is no longer available.

Job Description

Are you passionate about technology and interested in changing the way digital products are built? Our client is developing a new platform where developers and engineers across the company can quickly and securely deploy applications into the public cloud.

Combining aspects of the DevOps, SysAdmin, and Test Engineer roles, the Site Reliability Engineer position lets you bring together technical ability, strategic thinking, and detail-oriented execution in a fast-paced, dynamic environment.

You will join a new team whose purpose is to transform both development and operations. You will have full access to fix and strengthen code, design systems that validate and run code from other teams, and build tools that monitor the state of current systems. Our client has a strong people-first focus and promotes support and training at both the individual and team level through partners like Microsoft and Google, whom you'll be able to work alongside directly in this role!

If you'd like to know more about this opportunity, apply today!

See below for more details:

What you will be able to do in this role:

-Manage a critical platform that is expected to run hundreds of applications
-Improve and maintain site availability, scalability, service and system performance
-Investigate system errors and problems, and analyze bottlenecks in the system at scale
-Set up monitoring systems and application metrics, and supervise them to predict and detect failures
-Design and develop software for automated code testing and code deployment
-Provide solutions for performance management, disaster recovery, monitoring and access management

Skills you already have coming into this role:

-5+ years of relevant working experience, including at least 3 years in a DevOps / Site Reliability Engineer role
-Excellent knowledge and experience in Software Engineering, System Administration, and Operations
-Experience developing in any of the following languages: Java, JavaScript, C#, Python, Go
-Understanding of Unix/Linux systems from kernel to shell and beyond, including internal Unix systems and networking (DNS, TCP/IP, UDP, etc.)
-Experience designing and implementing tasks in Continuous Integration systems (Jenkins, Travis, CircleCI, etc.)
-Strong grasp of security, privacy and monitoring concepts
-Experience supporting containers and container orchestration platforms
-Experience operating applications on public and private cloud solutions
-Experience with running large scale systems and meeting SLA expectations
-Strong sense of project ownership and team responsibility
-Excellent project management skills and the ability to work in a fast-paced and hectic work environment
-Solid verbal and written communication skills

If you have experience with any of the following, that would be great!

-Experience with Azure, AWS, or GCP
-Excellent knowledge of network engineering
-Well versed in database management