Site Reliability Engineer

Analytic Partners • $100K — $130K *

Dallas, TX 75217 • Hybrid

Enterprise Technology

Less than 5 years of experience

3 weeks ago

Job Overview by Ladders

Qualifications

Bachelor's degree in Computer Science or equivalent experience.
4+ years in Platform Engineering, Site Reliability Engineering, DevOps, or Systems Engineering.
Strong expertise in Linux and Windows operating systems.
Advanced automation/scripting skills using Python, Bash, or PowerShell.
Hands-on experience with AWS and Azure platforms at scale.
Experience with CI/CD platforms (e.g., Jenkins, GitHub Actions).
Production experience with containers and orchestration platforms (e.g., Docker, Kubernetes).

Responsibilities

Own and optimize the Internal Developer Platform (IDP) for engineering teams.
Define and execute a platform roadmap aligned with business and developer needs.
Design and evolve application delivery paved roads including CI/CD pipelines.
Build self-service capabilities to minimize friction for service deployment and operation.
Create reusable platform abstractions for AWS and Azure focusing on security and reliability.
Drive automation to reduce operational toil and improve platform solutions.
Lead platform-level incident response and post-incident review processes.

Benefits

Opportunity to work in a fast-paced, cloud-centric environment.
Mentorship opportunities and involvement in technical evangelism.
Participation in a global, follow-the-sun operation model.
Focus on continuous improvement and learning culture.
Access to cutting-edge technology in platform engineering.

Full Job Description

What You'll Be Doing

Own the Internal Developer Platform (IDP) as a product, treating engineering teams as customers and optimizing for reliability, usability, and delivery velocity.
Define and execute a platform roadmap aligned with business priorities, developer needs, and long-term scalability.
Design, build, and evolve paved roads for application delivery, including CI/CD pipelines, infrastructure templates, service scaffolding, and standardized deployment patterns.
Build self-service capabilities that enable teams to provision, deploy, observe, and operate services with minimal friction.
Create and maintain reusable platform abstractions across AWS and Azure that standardize security, reliability, networking, and observability.
Reduce developer cognitive load by abstracting unnecessary complexity while enforcing clear guardrails for security, cost, and compliance.
Partner closely with application, product, and security teams to embed reliability, scalability, and security by design.
Establish and evolve platform standards for logging, monitoring, alerting, tracing, and incident response workflows.
Define, measure, and manage SLIs, SLOs, and error budgets for shared platform services.
Drive the reduction of operational toil through automation, standardization, and platform-first solutions.
Ensure shared platform services meet high standards for availability, performance, resilience, and scalability.
Own system-to-system integration and messaging patterns used across the platform.
Lead capacity planning, demand forecasting, and performance tuning for platform services.
Plan and execute zero-downtime upgrades, migrations, and releases of platform components.
Lead platform-level incident response workflows and post-incident reviews, driving systemic improvements rather than one-off fixes.
Evaluate incoming platform requests and translate them into scalable, productized capabilities.
Mentor engineers and drive platform adoption through documentation, enablement, and technical evangelism.
Participate in a 24x7 on-call rotation as an escalation point for platform reliability and availability issues.
Operate effectively in ambiguous problem spaces, making sound architectural and product decisions with limited guidance.

What We Look For In You:

Bachelor's degree in Computer Science or equivalent practical experience.
4+ years of experience in Platform Engineering, Site Reliability Engineering, DevOps, or Systems Engineering roles.
Strong expertise in Linux and Windows operating systems.
Advanced automation and scripting skills using Python, Bash, and/or PowerShell.
Deep, hands-on experience designing and operating AWS and Azure platforms at scale.
Strong experience building and operating CI/CD platforms (Jenkins, GitHub Actions, or equivalent).
Strong experience with Infrastructure as Code and configuration management (Terraform, CloudFormation, ARM, or similar).
Production experience with container and orchestration platforms such as Docker and Kubernetes.
In-depth experience with the HashiCorp ecosystem (Nomad, Consul, Vault).
Strong understanding of distributed systems, cloud-native architectures, and reliability patterns.
Experience designing and operating observability platforms (e.g., Splunk, Sumo Logic, or similar).
Familiarity with security and compliance practices, including vulnerability scanning and enterprise security tooling.
Strong understanding of the software delivery lifecycle, release engineering, and platform lifecycle management.
Experience working in Agile / DevOps environments with a strong product mindset.
Demonstrated ability to influence without authority, set standards, and drive adoption across teams.
Excellent communication skills, able to translate platform capabilities into clear developer value.
Strong problem-solving skills with a bias toward durable, scalable solutions over short-term fixes.
A mindset of continuous improvement, curiosity, and learning.
Comfortable supporting a global, follow-the-sun operation when needed.

How We Measure Success:

Strong developer adoption and satisfaction with the platform (DX).
Reduced deployment friction, lead time, and operational toil.
Platform reliability and performance meeting or exceeding defined SLOs.
Consistent, high-quality service delivery across engineering teams.
Reduced incident frequency and severity driven by systemic platform improvements.
Increased standardization, automation, and self-service adoption across the organization.

About Analytic Partners

Analytic Partners is a marketing analytics company that provides data-driven insights to help businesses make better decisions. The company was founded in 2000 and has since grown to become a global leader in marketing analytics. Analytic Partners' services include marketing mix modeling, digital analytics, and customer analytics. The company works with clients in a variety of industries, including consumer goods, financial services, and healthcare. Analytic Partners is headquartered in New York City and has offices in Europe and Asia.

Size: 500 employees
Industry: Media
Net Income: $5 million
Founded: 2000
5 Year Trend: +20%
Revenue: $50 million

* Ladders Estimates
