AI Accelerator Software Principal Engineer - Runtime Library

Ampere Computing • $182K — $273K *

Santa Clara, CA 95051 • In-Person

Enterprise Technology

Less than 5 years of experience

Today

Job Overview by Ladders

Qualifications

BS in Computer Science, Computer Engineering, Electrical Engineering, or Software Engineering with 8 years of experience; or MS with 6 years; or PhD with 3 years.
Experience in developing user-mode drivers and/or runtime libraries for GPUs or deep learning accelerators in Linux or RTOS.
Strong proficiency in C/C++ and systems-level programming.
Background in AI framework enablement, particularly with PyTorch, llama.cpp, and ONNX.
Experience with performance engineering, profiling, diagnostics, and execution pipeline optimization.

Responsibilities

Lead the design and evolution of an AI Runtime Library for Ampere accelerators.
Own end-to-end acceleration paths across the full software/hardware stack.
Drive hardware/software co-design and optimization for throughput, latency, and memory efficiency.
Contribute to AI co-processor software enablement by collaborating with hardware and systems teams.
Integrate runtime components into Ampere platform stacks for robust deployments and consistent performance.

Benefits

Premium medical, dental, and vision insurance along with income protection and a 401K.
Unlimited Flextime and over 10 paid holidays for a healthy work-life balance.
Access to healthy snacks, energizing espresso, and refreshing drinks throughout the day.

Full Job Description

Description

About the Role

As an AI Accelerator Principal Software Engineer - Runtime Library, you will lead the design, development, and optimization of AI runtime software that enables multiple state-of-the-art deep learning models to run efficiently on Ampere's deep learning accelerators. You will work at the intersection of systems software, performance engineering, and AI enablement, helping deliver high-throughput, low-latency inference and a strong foundation for future model and framework support.

What You'll Achieve:

Build and evolve an AI Runtime Library for Ampere accelerators that supports execution, scheduling, and lifecycle management of deep learning workloads across multiple model types and popular frameworks.
Own end-to-end acceleration paths, going deep into the full SW/HW stack, including:

- Inference serving and integration layers
- Compiler/runtime interfaces and graph/IR execution flows
- Runtime library architecture (APIs, memory management, operators, execution engines)
- Communication mechanisms and device/host orchestration

Drive HW/SW co-design and optimization to improve:

- Throughput (tokens/requests per second)
- Latency (kernel execution and scheduling efficiency)
- Memory efficiency (buffering, paging, reuse, caching)
- Overall compute utilization and scaling behavior

Contribute to AI co-processor/accelerator software enablement, partnering closely with hardware and systems teams to ensure runtime and kernel strategies match accelerator capabilities and constraints.
Collaborate cross-functionally to integrate runtime components into Ampere platform stacks, ensuring robust deployment on target environments and consistent performance in production-like workloads.

About You:

BS in Computer Science, Computer Engineering, Electrical Engineering, Software Engineering, or a related technical field with 8 years of related experience; or an MS degree with 6 years; or a PhD with 3 years.
Proven experience developing user-mode drivers and/or runtime libraries for GPUs or deep learning accelerators in Linux or RTOS environments.
Strong expertise in C/C++ and systems-level programming (memory, threading, synchronization, performance profiling).
Demonstrated background in AI framework enablement, with hands-on experience in one or more of:

- PyTorch (operator/runtime integration, graph execution, correctness/performance work)
- llama.cpp (inference/runtime execution patterns)
- ONNX (graph handling, interoperability, execution engines)

Strong performance engineering skills, including profiling/diagnostics and optimization of execution pipelines, data movement, and compute kernels.
Ability to operate effectively in a collaborative environment, owning complex components while partnering with compiler, hardware, and platform teams.

What We'll Offer:

At Ampere we believe in taking care of our employees and providing a competitive total rewards package that includes base pay, cash long-term incentive, and comprehensive benefits. The full base pay range for this role is between $182,000 and $273,000, except in the San Francisco Bay Area where the range is between $195,000 and $292,000.

Our benefits include health, wellness, and financial programs that support employees through every stage of life.

Benefit highlights include:

Premium medical insurance, dental insurance, vision insurance, as well as income protection and a 401K retirement plan, so that you can feel secure in your health and financial future.
Unlimited Flextime and 10+ paid holidays so that you can embrace a healthy work-life balance.
A variety of healthy snacks, energizing espresso, and refreshing drinks to keep you fueled and focused throughout the day.

And there is much more than compensation and benefits. At Ampere, we foster an inclusive culture that empowers our employees to do more and grow more. We are excited to share more about our career opportunities with you through the interview process.

#LI-Hybrid

About Ampere Computing

Ampere Computing is a semiconductor company that designs and manufactures high-performance processors for cloud and edge computing. The company's processors are based on the Arm architecture and are optimized for power efficiency and performance. Ampere Computing was founded in 2017 by former Intel president Renee James and is headquartered in Santa Clara, California.

Size: 200 employees

Industry: Information Technology

Founded: 2017

* Ladders Estimates
