Senior Database Site Reliability Engineer

TherapyNotes.com

$120K — $160K *
US-Anywhere
Remote in United States
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • BS degree in Information Systems, Engineering, or equivalent experience
  • 7-10+ years of Engineering experience, specifically with Database Engineering, Systems Engineering, DevOps, or SRE
  • Experience with cloud solutions, particularly Azure and Kubernetes
  • Proficiency operating PostgreSQL in a Linux environment
  • Expertise with observability/monitoring platforms like Prometheus/Grafana or Datadog
  • Experience in Agile/DevOps environments and familiarity with ITSM practices

Responsibilities

  • Design, implement, and maintain high-availability PostgreSQL systems for a 24x7 SaaS platform
  • Define and enhance database service reliability through monitoring and SLO metrics
  • Drive incident response, root cause analysis, and corrective actions for database issues
  • Collaborate with technical leaders to ensure new systems are supportable
  • Provide escalated technical support and guidance to other technology teams
  • Manage on-call coverage for production support as needed
  • Continuously improve database observability with actionable dashboards and monitoring tools
  • Automate system maintenance tasks with scripting and manage IaC with Ansible

Benefits

  • Employer sponsored health, dental, vision, life, and disability insurance
  • Retirement plan with company contribution
  • Annual company profit sharing
  • Personal development/training budget
  • Open, collaborative work environment
  • Extensive 2-week onboarding plan
  • Comprehensive mentorship program
Full Job Description
About the Position

We are seeking a Database Site Reliability Engineer who demonstrates a strong skill set in managing PostgreSQL. In this role, you will own the reliability and operability of our PostgreSQL services supporting a growing 24x7 SaaS platform, with an emphasis on availability, performance, observability, incident response, and automation. You will partner with cross-functional teams, including developers, operations, and infrastructure, to ensure that our database services run smoothly and efficiently. If you are passionate about operational excellence and continuous improvement, we want to hear from you.

What You'll Do
  • Design, implement, and maintain high-availability, high-throughput, data- and compute-intensive critical PostgreSQL database systems that support a growing 24x7 SaaS platform.
  • Define and improve database service reliability through monitoring/alerting, SLO-oriented metrics, and operational readiness.
  • Participate in and help drive incident response, root cause analysis, and post-incident corrective actions for database-related production events.
  • Partner with other technical leaders to ensure all newly introduced systems are supportable and maintainable by both development and operations.
  • Provide escalated technical guidance and support to other technology teams throughout the organization.
  • Provide on-call coverage for production support and other duties as required.
  • Be accountable for complying with HIPAA security policies within the database platform.
  • Ensure all solutions and operational activities adhere to the security and operating policies established by the organization.
  • Own and continuously improve our Datadog database observability by building actionable dashboards, alerts, and service-level views using an observability stack (e.g., Prometheus, Grafana, New Relic, or equivalent). Familiarity with PGAnalyze or Percona is a plus.
  • Automate system maintenance tasks using Bash, PowerShell, Python, or Ansible. Manage infrastructure as code (IaC) by writing Ansible playbooks. Some exposure to Terraform is a plus.
  • Experience designing and writing ETL pipelines in Python is a plus.
  • Understanding and maintaining PostgreSQL ecosystem components such as PgBouncer, pgBackRest, HAProxy, and repmgr is a plus.
  • Excellent communication and interpersonal skills.

What We're Looking For
  • BS degree in Information Systems, Engineering, or equivalent experience
  • 7-10+ years of Engineering experience with Database Engineering, Systems Engineering, DevOps and/or SRE
  • Experience in cloud-based compute, storage, and containerization solutions (Azure & Kubernetes preferred)
  • Proficiency with operating PostgreSQL in a Linux environment is a plus
  • Expertise with an observability/monitoring platform (e.g., Prometheus/Grafana, New Relic, Datadog, or equivalent); Datadog experience is a plus.
  • Experience working in Agile/DevOps environments and operating production services with ITSM practices where applicable

What We Offer
  • Competitive salary: $120,000-$160,000
  • Employer sponsored health, dental, vision, life, and disability insurance
  • Retirement plan with company contribution
  • Annual company profit sharing
  • Personal development/training budget
  • Open, collaborative work environment
  • Extensive 2-week onboarding plan
  • Comprehensive mentorship program


6/2/2026
