AI Performance Engineer

Graphcore

• $120K — $160K *

Milpitas, CA 95035

In-Person

Technical Services

Less than 5 years of experience

Reposted Today

Job Overview by Ladders

Qualifications

BS/MS in Computer Science, Electrical Engineering, or related field
Experience with distributed systems and communication libraries (MPI, NCCL, UCX, libfabric)
Strong programming skills in C++ and Python
Experience profiling and optimizing HPC or AI/ML workloads
Familiarity with ML benchmarks such as MLPerf

Responsibilities

Analyze ML models' compute and memory requirements using roofline analysis and simulations
Collaborate across hardware and software teams to optimize large-scale AI workloads
Benchmark, monitor, and troubleshoot system performance across distributed systems
Optimize communication stacks including MPI, NCCL, UCX, RDMA, and networking fabrics
Profile and optimize AI workloads, focusing on performance bottlenecks
Develop high-quality, ARM-compatible code and documentation

Benefits

Opportunity to work at the forefront of AI and hardware innovation
Collaborative and diverse team environment
Continuous learning culture
Engagement with cutting-edge technology and transformative projects
Part of a larger organization with robust resources and capabilities

Full Job Description

Job Summary

Graphcore's AI/ML training and inference infrastructure is rapidly scaling to meet the growing demands of AI workloads across mobile, edge, and datacenter environments. This role focuses on optimizing performance across ARM-based architectures and large-scale distributed systems, ensuring efficiency, scalability, and reliability across the full hardware-software stack.

The Team

The System Engineering Performance team architects and optimizes high-performance infrastructure for large-scale datacenter deployments. The team works across hardware, software, networking, and system architecture to deliver cutting-edge AI solutions and ensure optimal system performance at scale.

Responsibilities and Duties

Analyze ML models' compute and memory requirements using roofline analysis and simulations
Collaborate across hardware and software teams to optimize large-scale AI workloads
Benchmark, monitor, and troubleshoot system performance across distributed systems
Optimize communication stacks including MPI, NCCL, UCX, RDMA, and networking fabrics
Profile and optimize AI workloads, focusing on performance bottlenecks
Develop high-quality, ARM-compatible code and documentation

Candidate Profile

Essential:

BS/MS in Computer Science, Electrical Engineering, or related field
Experience with distributed systems and communication libraries (MPI, NCCL, UCX, libfabric)
Strong programming skills in C++ and Python
Experience profiling and optimizing HPC or AI/ML workloads
Familiarity with ML benchmarks such as MLPerf

Desirable:

Experience with GPUs or accelerated computing architectures
Knowledge of HPC networking and interconnect technologies (InfiniBand, RoCE)
Familiarity with ML frameworks such as PyTorch or TensorFlow
Understanding of ARM architectures and toolchains
Strong debugging, profiling, and performance optimization skills

In addition to a competitive salary, Graphcore offers flexible working and a comprehensive benefits package designed to support your health, wellbeing and financial future. Our benefits include medical, dental and vision coverage, Flexible Spending Accounts (FSAs), Health Savings Accounts (HSAs), disability and life insurance, a 401(k) retirement plan, commuter benefits, wellness services and an Employee Assistance Programme (EAP).

* Ladders Estimates
