OpenAI

Staff Security Reliability Engineer

$130K - $180K *
Information Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • 10+ years of hands-on experience in operating and architecting mission-critical infrastructure
  • Senior technical owner for complex on-prem and hybrid systems
  • Proficient in Site Reliability Engineering principles
  • Experience in environments requiring high reliability and recoverability
  • Skilled in influencing cross-functional teams through clear technical communication
  • Familiar with fleet and virtual desktop platforms
  • Experience partnering with security engineering teams on policy-enforced infrastructure

Responsibilities

  • Design, build, and operate reliable infrastructure across various environments
  • Establish standardized infrastructure patterns to enhance security and repeatability
  • Own the lifecycle of critical infrastructure platforms including upgrades and recovery
  • Implement infrastructure-as-code with tools like Terraform and Chef
  • Mature identity and policy-enforced infrastructure management
  • Create observability and incident response mechanisms for improved operational confidence
  • Automate workflows with safe rollback paths and rollout patterns
  • Translate operational learnings into durable fixes and reusable patterns

Benefits

  • Work in a high-leverage technical environment
  • Opportunity to influence the evolution of reliability standards
  • Hands-on role with complex system ownership
  • Engagement in cutting-edge security practices
  • In-office presence at the San Francisco HQ for collaboration and team dynamics

Full Job Description

About the Team

The Infrastructure Engineering function sits within IT and is responsible for reliably building, deploying, and operating critical on-prem and hybrid environments that power internal services and critical R&D environments.

This is an early, high-leverage technical role focused on applying strong Site Reliability Engineering discipline to environments where uptime, safety, recoverability, and security are non-negotiable. This person helps replace bespoke, one-off infrastructure with standardized infrastructure-as-code building blocks that compound reliability and operational leverage as OpenAI scales.

About the Role

We are looking for an experienced Security Reliability Engineer to design, build, and operate reliable, secure, and scalable infrastructure that underpins identity, access, endpoint, and shared platform services across the company.

In this role, you will be a senior technical owner for infrastructure and identity systems end to end, from architecture and implementation through policy enforcement, upgrades, recovery, and day-two operations. You will build durable, production-grade platforms that remove operational friction, enforce security by default, and enable teams to move faster with confidence.

This role is well suited for a hands-on senior engineer who thrives in ambiguity, enjoys owning complex systems end to end, and raises the reliability and security bar by replacing fragile implementations with standardized, repeatable infrastructure.

This role is based in our San Francisco HQ and requires in-office presence.

In this role, you will:
  • Design, build, and operate reliable infrastructure across on-prem, hybrid, shared, and product-adjacent environments.
  • Establish standardized infrastructure patterns that replace bespoke implementations with repeatable, auditable, secure-by-default systems.
  • Own the lifecycle of critical infrastructure platforms, including provisioning, deployment, upgrades, patching, recovery, and long-term reliability.
  • Build infrastructure-as-code and configuration management using tools such as Terraform, Chef, and Ansible.
  • Mature identity-adjacent and policy-enforced infrastructure, including Microsoft Entra and Azure management patterns.
  • Build observability, alerting, and incident response mechanisms that improve availability, recoverability, and operational confidence.
  • Automate high-toil and high-risk workflows with guardrails, progressive rollout patterns, and safe rollback paths.
  • Translate incidents, design reviews, and operational learnings into durable fixes, reusable patterns, and stronger technical standards.


You might thrive in this role if you:
  • Have 10+ years of hands-on experience operating and architecting mission-critical infrastructure in high-reliability environments.
  • Have been the senior technical owner for the design and maturation of complex on-prem, hybrid, or cloud-integrated systems, setting durable architectural patterns used by multiple teams.
  • Apply Site Reliability Engineering principles at scale, using observability, automation, and incident learnings to materially reduce risk and operational toil.
  • Operate comfortably in ambiguity, making sound architectural decisions under pressure while staying close to technical detail.
  • Influence cross-functional partners across security, identity, network, and platform teams through architecture, implementation, operational data, and clear technical writing.
  • Have experience operating infrastructure for R&D or specialized labs, manufacturing, or other safety-critical environments where uptime and recoverability are essential.
  • Have experience with fleet, endpoint, or virtual desktop platforms such as FleetDM, Chef, or Azure Virtual Desktop.
  • Have experience partnering closely with identity or security engineering teams on hardened, policy-enforced infrastructure at scale.


About OpenAI

OpenAI is an artificial intelligence research laboratory consisting of the for-profit corporation OpenAI LP and its parent company, the non-profit OpenAI Inc. The company was founded in 2015 by a group of technology leaders, including Elon Musk, Sam Altman, Greg Brockman, Ilya Sutskever, and John Schulman. OpenAI's mission is to develop and promote friendly AI for the betterment of humanity. The company has developed a number of cutting-edge AI technologies, including GPT-3, a language processing system that can generate human-like text. OpenAI has received funding from a number of high-profile investors, including LinkedIn co-founder Reid Hoffman and venture capitalist Peter Thiel.
Size: 100 employees
Founded: 2015
