About the Site Reliability Engineering Team
Thumbtack's Site Reliability Engineering team focuses on creating and maintaining a reliable, secure, and scalable platform vital for a seamless user experience. As a key contributor, you will design and support resilient systems, prioritizing high performance, availability, and throughput, with a focus on minimizing service disruptions, downtime, and latency. SRE impacts Thumbtack's ecosystem across the entire stack, from Linux systems to applications that drive the customer experience. Our work is high leverage, impacting how Engineering, Applied Science, and many other teams deliver, run, and observe systems.
The challenge
The Site Reliability team is responsible for a broad set of technologies and systems and is expected to collaborate across the business. We develop and enhance existing capabilities while ensuring the scalability, reliability, and resiliency of infrastructure and software. You'll work with engineering teams across product development, developer experience, and backend infrastructure to collaboratively build Thumbtack's ecosystem of platform services that have the right impact at the right time. Thumbtack values its cross-functional, collaborative culture, and you'd be positioned to contribute to the future direction and success of the engineering platform that serves as the engine of our applications.
What you'll do
- Design, create, and maintain software and systems to improve the availability, scalability, and efficiency of Thumbtack's services
- Set the architectural direction of infrastructure and platform services while supporting the engineering organization
- Design and implement tools and processes used for deployment, change, service, and infrastructure management
- Troubleshoot and debug critical systems throughout the SDLC
- Contribute to the evolution and performance of capabilities we provide to engineering as a platform organization
- Plan capacity and forecast demand, anticipating performance bottlenecks
- Participate in rotating on-call duties
In order to be successful, you must bring
- Extensive fluency in AWS and Linux
- Ability to effectively read, write, and debug code in programming languages such as, but not limited to, Python, Go, PHP, and JavaScript
- Expertise in designing, analyzing, and troubleshooting large-scale distributed systems across web technologies such as DNS, TLS, HTTP/S, and TCP/IP
- Ability to decompose complex problems while understanding the tradeoffs necessary to deliver impact
- Demonstrable knowledge of instrumenting, operating, and observing a distributed system of microservices in a production cloud environment
- Ability to communicate clearly and effectively with cross-functional partners of various technical levels
- Passion for reducing toil and improving developer experience
Expected salary ranges
- For candidates living in San Francisco / Bay Area, San Jose, New York City, or Seattle metros, the expected salary range for the role is currently $177,700.00 - $229,900.00
- For candidates living in Austin, TX or Washington DC metros or in California, Massachusetts, New Jersey, or Washington states, the expected salary range for the role is currently $159,800.00 - $206,800.00
- For candidates living in all other US locations, the expected salary range for this role is currently $151,300.00 - $195,800.00
Actual offered salaries will vary and will be based on various factors, such as calibrated job level, qualifications, skills, competencies, and proficiency for the role.