AI Systems Engineer, Codex Agents

OpenAI • $130K — $180K *

San Francisco, CA 94112In-Person

Information Technology

Less than 5 years of experience

1 month ago

By clicking Apply, I agree with Ladders' Terms of Use and Privacy Policy

Job Overview by Ladders

Qualifications

5-7 years of experience in production systems and distributed infrastructure
Strong background in building and operating ML systems
Hands-on proficiency in debugging across various technical layers
Experience with LLM applications and agent coding
Deep understanding of performance, reliability, and safety in systems
Ability to work closely with research and production environments
Proven leadership in scoped or multi-team systems projects

Responsibilities

Design and build the core agent harness for Codex agents' functionality
Create sandboxing and orchestration infrastructure for agent operation
Develop systems for effective evaluation and debugging of model behavior
Conduct ablation studies to enhance agent performance metrics
Improve diagnostics and observability throughout the agent stack
Collaborate with research to enhance trainability and utility of the harness
Build shared resources to facilitate faster, safer, and more reliable Codex operations

Benefits

Collaborative work environment alongside leading AI researchers
Opportunity to influence AI systems at a large scale
Engagement in cutting-edge technologies and projects
Potential to contribute to open-source developments
Focus on impactful AI applications for global challenges

Full Job Description

AI Systems Engineer - Codex Core Agents

About The Team
The Codex Core Agents team builds the agent harness that turns model capability into real-world action. We own the systems around the model: prompting and interpreting model outputs, executing actions safely in real environments, and feeding production experience back into better models and better agent behavior.
This team sits close to research and works across the stack: harness, model interaction, inference, sandboxed execution, orchestration, evals, production reliability, and the performance envelope around tokens, latency, cost, capacity, and quality. The harness is open source and increasingly part of how models are trained and evaluated, making this one of the highest-leverage layers in Codex.

About The Role
We're looking for engineers to build the AI systems that make Codex agents dependable in production. The ideal candidate is an agent-systems builder: hands-on across low-level systems and ML workflows, able to debug Codex behavior end to end across the harness, model behavior, inference/runtime stack, GPU fleet, and product surface.

You'll work with research, infrastructure, and product to design agent harness capabilities, run experiments and ablations across the model + system prompt + harness stack, build frameworks for assessing production agent performance, and turn messy failures into durable improvements.

What You'll Do

Design and build the core agent harness and execution loop that lets Codex agents interpret model outputs, use tools, execute code, and complete long-horizon tasks safely.
Build sandboxing, isolation, orchestration, state, and workflow infrastructure for agents operating in real development environments.
Develop evaluation, experimentation, and debugging systems that distinguish harness issues, model behavior, inference/runtime issues, and product failures.
Run ablations across prompts, model-facing interfaces, context construction, tool-use strategies, and harness behavior to improve solve rate, reliability, latency, and cost.
Improve observability, profiling, and diagnostics across the agent stack, from backend systems to inference, GPUs, and fleet capacity.
Work closely with research to make the harness trainable, measurable, and useful for improving frontier agentic models.
Build shared primitives that make Codex faster, safer, more reliable, and easier for other teams and open-source users to build on.
You Might Be A Good Fit If You
Have built or operated production systems in distributed systems, infrastructure, developer tooling, sandboxing, virtualization, cloud platforms, or ML systems.
Enjoy working across layers: Rust systems code, Python configuration layers, APIs, agent orchestration, evals, logs/traces, inference behavior, runtime constraints, and user outcomes.
Have hands-on experience with LLM applications, coding agents, evals, model deployment, inference, compiler/runtime performance, or developer platforms.
Care deeply about reliability, safety, performance, debuggability, and clean abstractions.
Can debug from evidence and move quickly from ambiguous production failures to practical, durable fixes.
Want to work close to research while still shipping changes to production
Still write meaningful code, show strong ownership, and can lead scoped or multi-team AI systems work.

Bonus Points

Deep Rust, systems, sandboxing, isolation, or low-level platform experience.
Experience with coding agents, agent harnesses, tool-using LLM systems, model evals, or post-training feedback loops.
Background in compilers, kernels, runtimes, inference optimization, GPU systems, benchmarking, profiling, or performance engineering.
Experience building production infrastructure used by many engineers or users under demanding reliability and security constraints.
Open-source infrastructure or developer-platform work with strong taste for APIs and usability.

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.

About OpenAI

OpenAI is an artificial intelligence research laboratory consisting of the for-profit corporation OpenAI LP and its parent company, the non-profit OpenAI Inc. The company was founded in 2015 by a group of technology leaders, including Elon Musk, Sam Altman, Greg Brockman, Ilya Sutskever, and John Schulman. OpenAI's mission is to develop and promote friendly AI for the betterment of humanity. The company has developed a number of cutting-edge AI technologies, including GPT-3, a language processing system that can generate human-like text. OpenAI has received funding from a number of high-profile investors, including LinkedIn co-founder Reid Hoffman and venture capitalist Peter Thiel.

Learn more about OpenAI

Size

100 employees

Industry

Information Technology

Founded

2015

* Ladders Estimates

Similar Jobs

Software Engineer, Machine Learning RecSys
$130K — $180K *
Meta
Sunnyvale, CA 94087 (Santa Clara County)
Today
Forward Deployed Engineer, AI Enablement
$120K — $150K *
STORD
Remote
Reposted Today
Senior Machine Learning Engineer, Agentic Systems - Moveworks
$130K — $180K *
ServiceNow
Mountain View, CA 94040 (Santa Clara County)
Reposted Today
Lead AI Engineer
$120K — $160K *
EPAM Systems
Remote
Reposted Today
AI Engineer, Developer Experience
$154K — $315K *
Gem
San Francisco, CA 94112 (San Francisco County)
Today
Senior AI Engineer
$120K — $160K *
NovoEd, Inc.
Remote
Reposted Today

Get Ready For Your
Next Interview

More Jobs at OpenAI

Sales Manager, Enterprise Digital Natives
$150K — $200K *
San Francisco, CA 94112 (San Francisco County)
Today
Enterprise Technology
In-Person
Executive Business Partner, B2B Ads Sales
$120K — $150K *
New York, NY 10025 (New York County)
Yesterday
Business Services
In-Person
Founding Full Stack Software Engineer, Legal
$130K — $180K *
San Francisco, CA 94112 (San Francisco County)
Yesterday
Legal & Accounting
In-Person
Product Manager, ChatGPT Sites
$130K — $180K *
San Francisco, CA 94112 (San Francisco County)
Yesterday
Consumer Technology
In-Person
Device Safety & Risk Operations Specialist, User Safety & Risk Operations
$120K — $150K *
San Francisco, CA 94112 (San Francisco County)
3 days ago
Consumer Technology
In-Person

More Information Technology Jobs

SDET (Software Development Engineer In Test)
Confidential Company
Washington, DC 20001 (District Of Columbia County)
2 weeks ago
Senior ServiceNow Engineer
$110K — $140K *
Comcast
Reston, VA 20191 (Fairfax County)
Today
Senior Data Engineer
$135K — $205K *
Ardent Eagle Solutions
Arlington, VA 22204 (Arlington County)
Today
Senior Manager - Salesforce
$136K — $230K *
MiniMed
Johns Creek, GA 30022 (Fulton County)
Today
Senior Manager, Cybersecurity
$147K — $170K *
Leprino Foods
Denver, CO 80219 (Denver County)
Today

Find similar AI Systems Engineer, Codex Agents jobs:

Nationwide San Francisco, CA

AI Systems Engineer, Codex Agents

Job Overview by Ladders

Full Job Description

Get Ready For Your Next Interview

Find similar AI Systems Engineer, Codex Agents jobs:

Get Ready For Your
Next Interview