This posting is for an existing vacancy.
Google utilizes AI tools to assist in assessing candidates in our hiring processes.
Minimum qualifications:
- Bachelor's degree in Computer Science, a related technical field, or equivalent practical experience.
- 5 years of experience including product demand/supply planning, and production and inventory management.
- 3 years of experience working with Unix/Linux operating systems internals and administration (e.g., filesystems, inodes, system calls) and networking (e.g., TCP/IP, routing, network topologies and hardware, software defined networking).
- Experience programming in at least one of the following languages: C, C++, Java, Python, or Go.
- Experience with computer networking (e.g., DNS, Load Balancing, routing) and Linux/Unix system administration.
Preferred qualifications:
- Master's degree in Computer Science or a related technical field.
About the job
Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services, both our internally critical and our externally visible systems, have reliability, uptime appropriate to customers' needs, and a fast rate of improvement. Additionally, SREs will keep an ever-watchful eye on our systems' capacity and performance.
Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you'll have the opportunity to manage the complex challenges of scale which are unique to Google Cloud, while using your expertise in coding, algorithms, complexity analysis and large-scale system design. SRE's culture of intellectual curiosity, problem solving and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow.
As a Staff Site Reliability Developer, you will serve as a technical anchor for the Protected Data SRE team in Waterloo. You will provide technical leadership across complex systems, cross-functional alignment, and systemic risk management.
Canada: $216,000 - $222,000 (CAD), 20% bonus target, equity, benefits
Responsibilities
- Drive the strategy to reduce complexity ecosystem-wide, focusing on solution and component reuse to prevent new production risks.
- Partner with executive development stakeholders and cross-functional programs to balance product reliability against regulatory deadlines.
- Design company-wide capabilities for change safety, distributed observability, large-scale data repair, and control plane safety.
- Provide technical direction and mentorship to developers in Waterloo, fostering a culture that collaborates across infrastructure stacks (from Spanner to Google Front End (GFE)).