OpenAI

Performance Modeling Lead

$130K to $180K *
Enterprise Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Experience building performance modeling frameworks for system design decisions.
  • Deep understanding of AI/ML workloads, focusing on training and inference.
  • Knowledge of system tradeoffs in large-scale distributed systems.
  • Ability to connect workload behavior with hardware implementation.
  • Experience using modeling techniques to support architectural decisions.
  • Comfortable navigating ambiguous problem spaces and structuring analysis.
  • Strong communication skills for influencing teams and partners.

Responsibilities

  • Build and manage a performance modeling framework/toolchain for AI systems.
  • Analyze architectural tradeoffs in compute, memory, networking, and storage.
  • Develop models for key design decisions on architecture scaling and network design.
  • Translate modeling results into actionable recommendations for teams and vendors.
  • Influence designs and vendor roadmaps with data-driven insights.
  • Collaborate with machine learning and hardware teams to understand workload needs.
  • Lead and develop a small team while maintaining high modeling standards.
  • Improve modeling accuracy through validation against actual system performance.

Benefits

  • Hybrid work model with three days in the office per week.
  • Relocation assistance available for new hires.
Full Job Description
About the Role

We are seeking a Performance Modeling Lead to build and lead a small, high-impact team responsible for answering forward-looking architectural questions across AI infrastructure systems.

You will develop modeling frameworks and methodologies to evaluate system-level tradeoffs and guide key design decisions. Your work will directly influence reference architectures, vendor designs, and long-term infrastructure strategy.

This role sits at the intersection of AI workloads, system architecture, and quantitative modeling, and requires strong technical judgment, ownership, and the ability to translate complex analysis into clear, actionable guidance.

This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance.

Key Responsibilities
  • Build and own a performance modeling framework/toolchain to evaluate AI systems across multiple levels of abstraction.
  • Analyze and quantify architectural tradeoffs across compute, memory, networking, storage, and system topology.
  • Develop performance models to guide decisions on:
    • scale-up vs. scale-out architectures
    • interconnect and network design
    • memory hierarchy and system balance
  • Translate modeling outputs into clear recommendations for internal teams and external hardware vendors.
  • Influence reference designs and vendor roadmaps through data-driven insights.
  • Partner closely with machine learning, systems, and hardware teams to understand workload characteristics and requirements.
  • Lead and grow a small team (2-3 engineers), setting technical direction and maintaining high standards for modeling rigor.
  • Continuously improve modeling fidelity by validating against real system behavior and measurements.


Qualifications
  • Have experience owning or building performance modeling frameworks used to drive real system design decisions.
  • Have deep knowledge of AI/ML workloads, including training and/or inference at scale.
  • Understand system-level tradeoffs across compute, memory, and networking in large-scale distributed systems.
  • Are comfortable working across abstraction layers, from workload behavior to hardware implementation.
  • Have experience using modeling (analytical or simulation) to inform architectural decisions.
  • Can operate in ambiguous problem spaces and turn open-ended questions into structured analysis.
  • Communicate clearly and influence both internal teams and external partners.


Preferred Skills
  • Experience working with hardware vendors (ODM/JDM, silicon, networking).
  • Background in data center infrastructure or hyperscale systems.
  • Familiarity with accelerators (GPUs/ASICs) and interconnects (e.g., NVLink, InfiniBand, Ethernet).
  • Experience influencing hardware roadmaps or reference architectures.
  • Prior experience leading or mentoring engineers.


About OpenAI

OpenAI is an artificial intelligence research laboratory consisting of the for-profit corporation OpenAI LP and its parent company, the non-profit OpenAI Inc. The company was founded in 2015 by a group of technology leaders, including Elon Musk, Sam Altman, Greg Brockman, Ilya Sutskever, and John Schulman. OpenAI's mission is to develop and promote friendly AI for the betterment of humanity. The company has developed a number of cutting-edge AI technologies, including GPT-3, a language processing system that can generate human-like text. OpenAI has received funding from a number of high-profile investors, including LinkedIn co-founder Reid Hoffman and venture capitalist Peter Thiel.
Size: 100 employees
Founded: 2015
