Senior Software Engineer, Inference

Pika

$130K - $180K *

Palo Alto, CA 94303

In-Person

Information Technology

5 - 7 years of experience

Today

Job Overview by Ladders

Qualifications

5+ years of engineering experience specializing in inference acceleration and model deployment.
Proven expertise in inference optimization techniques like quantization and attention acceleration.
Deep knowledge of GPU programming including CUDA and NCCL.
Familiarity with video generation models and large language models (LLMs).
Strong communication skills for cross-discipline collaboration.
Proactive and solutions-driven with an ownership mindset.

Responsibilities

Lead and implement advanced inference acceleration techniques for efficient model serving.
Engineer and optimize GPU strategies for maximal efficiency in parallel processing.
Develop high-performance computing kernels and distributed workloads.
Collaborate with teams to deploy state-of-the-art AI models in production.
As a bonus, contribute to improvements in training speed and resource utilization.
Drive rigorous code reviews and mentor engineers on inference best practices.

Benefits

Equity in a fast-growing startup shaping the future of AI.
Comprehensive health benefits and monthly stipends.
Company retreats for team bonding and culture building.
Supportive, collaborative office culture focused on collective success.

Full Job Description

About the Role

We are seeking a Senior Inference Engineer to accelerate the performance of Pika's AI-driven products. In this highly technical role, you will operate at the intersection of cutting-edge inference acceleration, GPU parallelism, advanced model deployment, and video generation technologies. Your expertise will drive significant improvements to model speed and efficiency, ensuring our creative AI systems deliver industry-leading user experiences at scale.

You will design and optimize inference pipelines, implement state-of-the-art acceleration techniques, and work closely with researchers and engineers across the team to push the boundaries of what's possible in real-time AI deployment. Your efforts will play a foundational role in powering the next generation of Pika's video and language models.

What You'll Do

Accelerate Inference: Lead and implement advanced inference acceleration techniques, including attention optimization and quantization for efficient model serving.
Maximize GPU Parallelism: Engineer and optimize GPU strategies across tensor, sequence, and pipeline parallelism (TP, SP, PP) for maximal efficiency and scalability.
Programming for Performance: Develop and optimize high-performance computing kernels and distributed workloads using CUDA and NCCL.
Advance AI Deployment: Collaborate with research and engineering teams to bring state-of-the-art video generation (videogen) models and large language models (LLMs) into production.
Improve Training Efficiency: (Bonus) Contribute to improvements in model training speed, stability, and resource utilization as part of our deployment lifecycle.
Technical Excellence: Drive rigorous code reviews, participate in technical discussions, and mentor fellow engineers on best practices in inference and GPU programming.

What We're Looking For

Experience: 5+ years of engineering experience, with a strong track record in inference acceleration and model deployment at scale.
Inference Mastery: Proven expertise in inference optimization, including quantization, attention acceleration, and deep learning compiler stacks.
GPU & Parallelism: Deep knowledge of GPU programming (CUDA, NCCL) and experience with SP, TP, PP, and other forms of parallelism for distributed inference.
AI Domain Knowledge: Familiarity with video generation (videogen) models and large language models (LLMs).
Collaboration: Strong cross-discipline communication skills; able to drive shared goals across research and engineering functions.
Ownership Mindset: Self-driven, solutions-oriented, and capable of managing ambiguity in a fast-paced startup environment.
Bonus: Experience in enhancing training efficiency, stability, or resource optimization for large models.

Nice to Have

Experience with high-throughput video or real-time streaming model deployment
Familiarity with distributed training and optimization toolkits
Contributions to open source projects in AI infrastructure or deep learning compilers
Startup or rapid prototyping experience

What We Offer

Competitive salary in the AI industry
Equity in a fast-growing startup shaping the future of AI
Comprehensive health benefits, monthly stipends, and company retreats
A supportive and collaborative office culture, where we're all building and launching together

We work from our Palo Alto office 3-5 days a week and welcome applicants who are eager to contribute onsite.

* Ladders Estimates
