WHAT WE'RE LOOKING FOR
We're looking for a DevOps engineer to help us grow and improve our automation and site reliability, and to enable our engineering team to use new technologies in a scalable, reliable, and highly available way. Braze operates at massive scale, collecting hundreds of billions of data points each month and sending hundreds of millions of messages to end users daily. We use a diverse technology stack rooted in Ruby on Rails, MongoDB, Redis, and Hadoop, and we are adding new technologies and paradigms such as Kafka and distributed stream processing. We have infrastructure in half a dozen data centers, all managed by Chef and a lot of caffeine.
We are seeking a strong operations owner with a drive to automate anything that can be automated, at Braze scale. This individual will be directly responsible for an entire cluster of application nodes, sending services, and the databases that support them.
WHAT YOU'LL DO
- Become an expert in how our product works from a technical standpoint
- Ensure that our message sending, big data pipeline and ingestion platforms can run and scale effortlessly
- Identify and resolve critical problems throughout the platform, including application performance issues
- Develop build and deploy pipelines for applications in multiple languages using both Docker and virtual instances
- Split on-call duty with other engineers on the team
- Ensure that Braze meets our strict enterprise-grade SLAs with customers
- Implement security changes, protocols, and related measures to ensure Braze meets our strict customer privacy promise
WHO YOU ARE
- 3-5+ years of experience as a DevOps engineer, systems administrator, or site reliability engineer
- Working knowledge of Python or Ruby
- In-depth experience with Redis
- Previous experience using configuration management platforms to build and manage hundreds of servers
- Previous experience with Hadoop or Kafka
- Excellent communication skills
- Excellent ability to manage multiple tasks and expectations at once
- Previous work at startups
- Experience handling rapid user growth or global expansion