SRE, Site Reliability Engineer

Iterable

Omaha, NE

Industry: Technology

5 - 7 years

Posted 39 days ago

Our Culture   

The San Francisco Chronicle voted us one of the top 20 small businesses to work for in 2018!

A few things to know about us:

  • We have egalitarian and transparent values
  • We are open, inclusive and empathetic people
  • We are all very focused on self-improvement

Culture is very important at Iterable, and we value working alongside driven people who share our values of:

  • Humility
  • Trust
  • Balance
  • Growth Mindset

 We're happy to elaborate on what these mean to us and how these values might help foster your growth, both personally and professionally. Regarding "Balance" specifically: we believe in a balanced approach to work and offer unlimited vacation, as well as a quarterly stipend for personal or professional growth.

 We are proud to celebrate diversity at Iterable. We know that the more diverse our teams are, the better we will be as an organization. So far this year, we've celebrated the Lunar New Year with a Chinese banquet, sponsored a team visit to the Museum of the African Diaspora for Black History Month, and hosted a Woman in Leadership Q&A panel for Women's History Month. Upcoming events include an international potluck lunch and attending a San Francisco Giants baseball game on LGBTQ Night.

 We believe in a strong, friendly engineering culture and are excited to hear more about what is important to you in your next position.

 Who is Iterable?

  • Iterable is a growth marketing and user engagement SaaS platform that B2C marketers use to send the right message, to the right device, at the right time. 
  • Our product team built the growth systems that powered Twitter's early success and we're capitalizing on tremendous successes in recent years. We are growing rapidly! 
  • We've raised over $30M from investors like Index Ventures and CRV. Hundreds of companies like AT&T, MLB, Zillow, Box, Foursquare, and ASICS rely on us to captivate their many millions of users.
  • Recently, we launched two important new products -- Iterable Artificial Intelligence Suite and Iterable Insights. 
  • We can't wait to share more with you about what we are working on, why it's important, and where we are headed.

 Can I work remotely?

  • Yes! This is a remote SRE role on a brand new team.
  • You can be based anywhere in the US, as long as you are able to work our core hours and interface regularly with your team. 
  • Prior experience working remotely is strongly preferred.

 The Role

  • One of our explicit goals as a team is to build a uniquely fun and growth-oriented culture. Our team of talented engineers and thinkers is small, lean, empathetic, and balanced.
  • On our 25-person engineering team, you'll ship features on your first day here.
  • We serve large enterprise customers, so keeping our platform reliable, performant, and secure is extremely important.
  • As the Site Reliability Engineer (SRE), you will provide the expertise on how to build, test, monitor, and deploy scalable applications.
  • You will define, develop, and document the standards for production, operations, and our growing, resilient infrastructure.

 Hard Skills / Experience

  • Expert knowledge of Scala and/or Java
  • Advanced knowledge of AngularJS
  • Expert knowledge of JavaScript, HTML, and/or CSS
  • Experience with server-side MVC web frameworks
  • Experience with the AWS stack
  • Experience with Elasticsearch

 Core Responsibilities

  • This is a software engineering role (60% of your time) where you will focus on automation and improving production and operations resilience
  • Design and develop improvements, focused on resilience, to our production systems to achieve and surpass SLOs
  • Design and implement monitoring and logging systems at scale
  • Evolve systems towards improved reliability and scalability
  • Join our team as an Incident Commander (with on-call responsibility) and act to ensure the right people are in the right place at the right time
  • Facilitate blameless Incident Retrospectives to understand root causes, communicate learnings, determine remediation, and make us better and closer as a team
  • Help improve our operational practices to minimize service disruptions
  • Track and communicate operational metrics surrounding incidents and the health of our systems
  • Serve as an important point of contact for our customers to ensure they understand our efforts on their behalf
  • Instrument production systems, collect metrics, and improve observability
  • Troubleshoot application, network, and database performance issues
  • Develop processes, procedures, and reporting for system and team health metrics

 Requirements

  • Five years of experience developing software focused on systems and operational automation in a large-scale distributed environment
  • Experience scaling complex systems for operational resiliency
  • Recent experience with algorithms, complexity analysis, and software design
  • Excellent communication skills
  • Passion for learning and always improving yourself and the team around you
  • Deep familiarity with cloud infrastructure on AWS or similar

 Iterable is proud to be an equal opportunity employer and strives to build a diverse and inclusive team. We do not discriminate on the basis of race, color, national origin, religion, gender, sexual orientation, age, marital status, veteran status, or disability status.

$100K - $160K base, Equity bonus