Site Reliability Engineer

Optimum

• $100K — $130K *

Bethpage, NY 11714 (In-Person)

Information Technology

Less than 5 years of experience

Today

Job Overview by Ladders

Qualifications

Bachelor's degree in Telecommunications, Computer Engineering, or related field
2-4 years in mobile network operations or systems engineering roles
Deep proficiency in Linux (RHEL/Ubuntu) and Unix (Solaris/AIX)
Hands-on experience with Google Cloud Platform (IAM, VPC, Compute Engine)
Proven experience using Terraform and Ansible for managing environments
Proficiency in storage protocols such as Fibre Channel and iSCSI
Strong scripting ability in Python or Go

Responsibilities

Audit, harden, and standardize Unix and Linux environments across GCP and on-premises servers.
Architect and manage enterprise-grade SAN/NAS environments and optimize for low latency.
Serve as the engineering lead for Eastern U.S. data centers, ensuring hardware health and enforcing security standards.
Design and maintain automation pipelines to eliminate configuration drift between environments.
Establish a sustainable, automated patching cadence to enhance fleet security.
Implement and scale a monitoring stack to provide real-time health metrics across the hybrid estate.
Participate in on-call rotation and lead blameless post-mortems after incidents.

Benefits

Opportunity to work on hybrid cloud infrastructure
Focus on automation and modern SRE practices
Engage in continuous learning through blameless post-mortems
Exposure to cutting-edge storage and security technologies
Collaborative team environment fostering innovation

Full Job Description

Job Summary

As a Site Reliability Engineer II, you will be a primary driver in the long-term management and stabilization of our Hybrid Cloud infrastructure. We maintain a permanent dual-hosting strategy, operating both Google Cloud Platform (GCP) and a mission-critical On-Premises Unix/Linux footprint. You will bridge the gap between physical hardware and modern cloud-native operations, applying software engineering principles to ensure our systems are scalable, secure, and predictable across all platforms.

The Mission: Hybrid Reliability & Stabilization

Your mission is to unify our GCP and On-Premises environments into a single, reliable platform. Your first 12 months will focus on Stabilization and Observability. You will lead the transition away from "toil" (manual, repetitive operations) toward high-leverage automation, aggressively addressing on-prem technical debt while implementing modern SRE practices across our global data centers and cloud projects.

Responsibilities

Hybrid Platform Standardization: Audit, harden, and standardize Unix (Solaris/AIX) and Linux (RHEL/Ubuntu) environments across both GCP Compute Engine and physical bare-metal servers.
Storage Engineering (Specialization): Architect and manage enterprise-grade SAN/NAS environments alongside GCP Cloud Storage/Persistent Disk. Optimize for low latency and high IOPS while ensuring all data-at-rest complies with our Annual Encryption Strategy.
Infrastructure Stewardship (DC Support): Serve as the engineering lead for our Eastern U.S. data centers; ensure hardware health, power redundancy, and physical security standards are enforced through code and automated checks.
Automation of Toil: Design and maintain robust automation pipelines (Ansible, Terraform, Python) to ensure configuration parity and eliminate drift between cloud and on-premises environments.
Vulnerability Management: Transition the fleet from a "vulnerable" state to a "reliable" one by establishing a sustainable, automated monthly patching cadence.
Unified Observability: Implement and scale a "single pane of glass" monitoring stack (Prometheus, Grafana, Loki) to provide real-time health metrics for the entire hybrid estate.
Incident Response & Post-Mortems: Participate in a sustainable on-call rotation. Lead Blameless Post-Mortems for incidents involving cross-platform dependencies to ensure we "fix the system, not the person."

Qualifications

Bachelor's degree in Telecommunications, Computer Engineering, or related technical field
2-4 years of experience in mobile network operations or systems engineering roles
OS Internals: Deep proficiency in Linux (RHEL/Ubuntu) and Unix (Solaris/AIX) administration and kernel tuning
Cloud Proficiency: Hands-on experience with GCP (IAM, VPC, Compute Engine) or equivalent public cloud providers
Infrastructure as Code: Proven ability to manage complex environments using Terraform and Ansible
Storage Protocols: Proficiency in Fibre Channel, iSCSI, and NFS. Experience with enterprise arrays (NetApp, Dell/EMC, or Pure Storage) is highly preferred
Software Engineering: Strong scripting ability in Python or Go to build internal tools and automation
Security: Strong understanding of CVE lifecycles and cryptographic standards (AES-256)

All job descriptions and required skills, qualifications and responsibilities for a particular position are subject to modification by the Company from time to time, in the Company's discretion based on business necessity.

* Ladders Estimates
