OpenAI

Software Engineer, Inference - Performance Optimization

OpenAI$130K — $180K *
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 5-7 years of experience in distributed systems or performance engineering
  • Strong understanding of model inference and hardware efficiency
  • Expertise in performance profiling and benchmarking
  • Familiarity with data analysis and cost modeling
  • Ability to collaborate with cross-functional teams

Responsibilities

  • Build and refine performance models based on microbenchmark data
  • Analyze end-to-end inference workloads across system layers
  • Enhance tools to pinpoint latency and throughput bottlenecks
  • Collaborate with teams to convert performance insights into actionable improvements
  • Project future performance impacts based on system changes

Benefits

  • Opportunity to work with cutting-edge technology in inference systems
  • Collaborative team environment focused on performance optimization
  • Engagement with cross-functional teams in engineering and research
  • Focus on real-world applications and production system improvements
Full Job Description
About the Team
Our team analyzes inference stack performance across the application, model, and fleet layers to identify bottlenecks and drive faster, cheaper inference. We combine systems profiling, benchmarking, and analysis to understand where time and cost are spent, then turn that understanding into performance optimizations and models that project performance and capacity needs for future launches.

About the Role
In this role, you will model inference performance across application, model, and fleet layers with higher fidelity. You will build cost-to-serve estimates from microbenchmarks and create tools that help cross-functional teams reason about latency, capacity, utilization, and cost tradeoffs.

In this role, you will:
  • Build and refine performance models that translate microbenchmark results into cost-to-serve estimates.
  • Analyze inference workloads end to end across applications, models, and fleet infrastructure.
  • Enhance tooling to identify bottlenecks across layers for latency and throughput.
  • Partner with other teams to turn performance insights into concrete improvements and project how future changes affect inference.

You might thrive in this role if you:
  • Enjoy reasoning from first principles about distributed systems, model inference, and hardware efficiency.
  • Are comfortable working across abstraction layers, from application behavior to kernels, accelerators, networking, and fleet scheduling.
  • Have deep expertise with performance profiling, benchmarking, analysis, and optimization.
  • Enjoy collaborating with engineering and research teams to improve real production systems.


About OpenAI

OpenAI is an artificial intelligence research laboratory consisting of the for-profit corporation OpenAI LP and its parent company, the non-profit OpenAI Inc. The company was founded in 2015 by a group of technology leaders, including Elon Musk, Sam Altman, Greg Brockman, Ilya Sutskever, and John Schulman. OpenAI's mission is to develop and promote friendly AI for the betterment of humanity. The company has developed a number of cutting-edge AI technologies, including GPT-3, a language processing system that can generate human-like text. OpenAI has received funding from a number of high-profile investors, including LinkedIn co-founder Reid Hoffman and venture capitalist Peter Thiel.
Learn more about OpenAI
Size
100 employees
Industry
Founded
2015

Similar Jobs

More Jobs at OpenAI

More Information Technology Jobs

Find similar Software Engineer, Inference - Performance Optimization jobs: