Senior Site Reliability Engineer

QGenda • $110K — $140K *

Atlanta, GA 30349 • In-Person

Information Technology

5 - 7 years of experience

Today

Job Overview by Ladders

Qualifications

B.S. in Computer Science or equivalent industry experience
7+ years in DevOps, SRE or Systems Engineering
Advanced proficiency in at least one scripting or programming language
Hands-on experience with AWS services (EC2, S3, Lambda, etc.)
Strong understanding of networking, DNS, and cloud security best practices
Experience with Docker and container orchestration (ECS, EKS/Kubernetes)
Proficient in CI/CD tools and practices

Responsibilities

Design and manage scalable and reliable systems for high availability
Continuously monitor system health through data and metrics analysis
Develop automation tools to boost efficiency and reduce manual tasks
Build and enhance CI/CD pipelines for streamlined software delivery
Participate in incident management and minimize downtime
Conduct root cause analyses for recurring issues and implement solutions
Advise engineering teams on best practices for infrastructure and operations

Benefits

Fully company-paid medical, dental, and vision insurance
Generous paid time off policy for work/life balance
Paid parental leave for major life events
401(k) plan with company matching
Flexible work from home or hybrid model options
Annual Costco membership and various stipends for cell phone and commuting

Full Job Description

About Your Role

As a Senior Site Reliability Engineer, you will work with our Infrastructure and Product Development Teams to increase the scalability, reliability, and performance of our systems and services. You will build and extend automation for configuring and monitoring our AWS-hosted applications. You will have the opportunity to evaluate new AWS services and tools to determine whether they could be used in our environments. You'll bring a focus on platform health and monitoring so that we can deliver the best possible experience for our customers. This is an excellent opportunity to have a significant impact on the stability of our systems and contribute to the evolution of our technology stack.

NOTE: This role is hybrid, with one required day in our Buckhead (Atlanta) or Uniontown, Ohio office, depending on your current location.

How You'll Make an Impact

Responsibilities:

System Reliability & Performance:

Design, implement, and manage scalable systems that ensure high availability, fault tolerance, and optimal performance.
Continuously monitor and enhance system health and performance through data analysis and metrics.

Automation & Tooling:

Develop and advocate for automation tools to eliminate repetitive manual processes and improve efficiency.
Build and enhance CI/CD pipelines to streamline software delivery and deployments.

Incident Management & Troubleshooting:

Participate in on-call rotation to respond to incidents, troubleshoot problems, and minimize downtime.
Conduct root cause analyses and implement permanent solutions to recurring issues.

Infrastructure Management:

Manage our cloud-based infrastructure environment in AWS.
Optimize costs and resources while maintaining robust and scalable systems.

Collaboration & Culture:

Serve as a technical advisor to engineering teams on infrastructure and operations best practices.
Actively contribute to fostering an SRE culture within the organization by promoting observability, retrospectives, and continuous improvement.

Who You Are

Curiosity-driven mindset with a desire to continuously learn and improve systems
Strong sense of ownership - you see problems through to resolution, not just escalation
Comfortable navigating ambiguity and making pragmatic tradeoffs under pressure
Availability for off-hours deployment and upgrades of production systems during release and maintenance windows
Strong problem-solving skills and ability to work effectively under pressure
Excellent communication skills for cross-functional collaboration and for writing documentation

Experience You Bring

B.S. in Computer Science, Computer Information Systems, or Computer Engineering from a major U.S. university or equivalent industry experience
7+ years of experience as a DevOps, SRE or Systems Engineer
Advanced proficiency with at least one scripting or programming language
Experience with Docker and container orchestration tools such as AWS ECS and EKS/Kubernetes
Hands-on experience building infrastructure and supporting applications in AWS using services such as Lambda, EC2, ECS, S3, SNS, SQS, RDS, Redshift, and ElastiCache
Strong understanding of networking and DNS
Strong experience with Terraform for infrastructure provisioning and module development, along with configuration management and infrastructure as code (IaC) practices
Firm understanding of, and experience with, Agile and Scrum SDLC processes
Experience using a distributed version control system (Git preferred) for code check-ins, branching, merging, pull requests, code reviews, etc.
Knowledge of CI/CD best practices and tools such as AWS CodeBuild, Jenkins and/or TeamCity
Experience using AI-assisted coding tools (e.g., Claude, GitHub Copilot) to accelerate IaC development, scripting, and operational workflows
Familiarity with AI/ML-driven approaches to observability, anomaly detection, log analysis, or incident triage
Experience designing and delivering secure, high-performance, and highly available cloud services
Experience with observability platforms (e.g., Datadog, CloudWatch, PagerDuty) for monitoring, alerting, and incident response
Awareness of cloud security best practices including IAM policies, network segmentation, and secrets management

Applicants for this position must be authorized to work for any employer in the United States (U.S.), including being located in the U.S. We are unable to sponsor, take over sponsorship of, or hire candidates with an employment visa at this time.

What's In It For You

We offer a comprehensive total rewards package to support our full-time employees and their families' day-to-day needs, well-being, and major life events, which includes:

Fully company-paid options for medical (both in-person and virtual), dental and vision insurance
Generous paid time off (PTO) policy to enjoy periods of uninterrupted rest and relaxation for a healthy work/life balance
Paid parental leave for birth, adoption or permanent placement
401(k) with company match
Options to work in a hybrid-working model or remotely from home, depending on the position
Annual Costco membership, cell phone stipend, commuter benefits, in-office perks and more

QGenda delivers technology solutions to improve how healthcare is delivered and increase access - for everyone. We can only succeed by bringing together diverse minds, thoughts, ideas and team members to create better solutions for our customers and make us a better company as a whole. We are committed to creating a culture of embracing diversity, inclusion and equity for all.

About QGenda

QGenda is a software company that provides scheduling and workforce management solutions for healthcare organizations. The company's software helps healthcare providers manage their schedules more efficiently, reducing the time and resources required to manage complex schedules. QGenda is committed to providing high-quality software and service to its customers, and has a team of experienced professionals who are dedicated to helping healthcare organizations improve their operations. The company is headquartered in Atlanta, Georgia.

Size

200 employees

Industry

Information Technology

Founded

2006

* Ladders Estimates
