Stellar Development Foundation

Director of Site Reliability Engineering

$210K - $310K
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 3+ years of experience as a Site Reliability Engineer
  • 3+ years of experience managing an SRE team
  • Strong collaboration with development teams during product development
  • Proven ability to define and drive KPIs for engineering teams
  • Experience in troubleshooting and conducting root cause analysis
  • Expertise in infrastructure for large distributed systems
  • Familiarity with configuration management and IaC tools like Terraform or Ansible
  • Hands-on experience with Kubernetes and maintaining high-availability systems
  • Highly autonomous and effective communicator in remote settings

Responsibilities

  • Establish a vision and mandate for the SRE team
  • Define quarterly OKRs aligned with company goals
  • Facilitate collaboration between SREs and development teams across the software lifecycle
  • Coach and mentor SRE team members on career growth
  • Track metrics and hold engineering teams accountable for KPIs
  • Coordinate priorities and collaboration with other teams
  • Participate in sprint planning and daily tactical decision oversight
  • Design and build user-friendly and reliable systems and infrastructure
  • Monitor and troubleshoot production systems
  • Define and participate in 24/7 on-call rotations with the team
  • Mediate technical discussions and review pull requests
  • Engage with partners in the Stellar ecosystem for successful integrations

Benefits

  • Comprehensive health, dental, and vision insurance with most plans at 100% coverage for employees and dependents
  • Flexible time off plus 15 company holidays and a company-wide holiday break
  • Generous paid parental leave and pregnancy disability leave
  • $80/month gym reimbursement
  • Life insurance and Accidental Death and Dismemberment (up to $50K)
  • Short- and long-term disability coverage
  • 401K with a 4% employer match
  • Health and Dependent Care FSA accounts
  • $250/month commuter benefits contribution
  • Health Savings Account (HSA) with employer contributions
  • Family building benefits through Kindbody
  • Wellbeing benefits including One Medical and Headspace
  • Annual learning and development budget of $1,500
  • Daily lunch and snacks provided in the office
  • Company retreats and team-building events

Full Job Description

You will lead an experienced Site Reliability Engineering team that keeps our services and tooling available, builds the infrastructure behind our team's production and testing environments, and streamlines our systems and processes so that they are robust, efficient, and easy to deploy.

SDF has a robust career path for both individual contributors and managers.

In this role, you will:
  • Establish a clear vision and mandate for the Site Reliability Engineering team
  • Define the SRE team's quarterly OKRs to best align with the company's goals
  • Define processes of collaboration between SREs and development teams throughout the software development lifecycle
  • Define a career growth path for the SRE team, as well as coach and mentor individual contributors on the team
  • Define and track metrics across engineering and help hold engineering teams accountable for their KPIs
  • Coordinate priorities with other teams and areas of the organization
  • Participate in sprint planning and execution, track progress and oversee day-to-day tactical decisions
  • Design and build reliable systems and infrastructure that software engineers find easy to use
  • Monitor and troubleshoot systems in production
  • Define and participate in 24/7 on-call rotations alongside the team
  • Mediate technical discussions and review PRs
  • Jump in as needed with code fixes, troubleshooting and hands-on contributions
  • Collaborate across the Stellar ecosystem, engaging with key partners and advising on their integration to set them up for success


You have:
  • 3+ years of experience working as a Site Reliability Engineer
  • 3+ years of experience managing an SRE team
  • Site Reliability Engineering experience:
    • Strong track record of collaborating with dev teams at all stages of product development (design, development/CI, beta testing, production)
    • Strong track record collaborating on defining, measuring and driving improvements in KPIs
    • Strong track record assisting teams during Root Cause Analysis and post mortems
  • Infrastructure and Operations experience:
    • Designing and building out the infrastructure for large distributed systems
    • Maintaining highly available infrastructure
    • Troubleshooting and understanding complex technical problems
    • Using configuration management or IaC tooling such as Terraform, Ansible, or Puppet
    • Building and maintaining infrastructure using Kubernetes
  • Highly autonomous; able to find clarity in ambiguous circumstances
  • Excellent communicator; comfortable working with remote team members


Bonus Points if:
  • You have 3+ years of experience writing code in a major programming language
  • You have worked on an open source project
  • You have managed a distributed team
  • You build things for fun in your spare time


We offer competitive pay with a base salary range for this position of $210,000 - $310,000 depending on job-related knowledge, skills, experience, and location. In addition, we offer lumen-denominated grants along with the following perks and benefits:

USA Benefits/Perks:
  • Competitive health, dental & vision coverage with most plans covered at 100% for the employee + any dependents
  • Flexible time off + 15 company holidays including a company-wide holiday break
  • Generous paid parental leave for all parents, plus paid pregnancy disability leave for birthing parents
  • Gym reimbursement ($80 per month)
  • Life & AD&D insurance (up to $50K)
  • Short- & long-term disability
  • 401K with 4% match
  • Health & Dependent Care FSA Accounts
  • Commuter benefits with $250/month employer contribution
  • Health Savings Account (HSA) with monthly employer contribution
  • Family building benefits through Kindbody
  • Wellbeing benefits (One Medical, Rightway, Headspace)
  • L&D budget of $1,500/year
  • Daily lunch and snacks in office
  • Company retreats


#LI-Hybrid

About Stellar Development Foundation

Stellar Development Foundation is a non-profit organization that supports the development and growth of the Stellar blockchain network. Stellar is an open-source, decentralized payment protocol that enables fast, low-cost, cross-border transactions. The foundation was established in 2014 by Jed McCaleb, a co-founder of Ripple, and Joyce Kim, a lawyer and entrepreneur. The foundation's mission is to promote financial access and inclusion by making it easier and cheaper to move money around the world. Companies such as IBM and Deloitte have used Stellar's network to facilitate cross-border payments and remittances.
Size: 100 employees
Founded: 2014
