Research Scientist/Engineer (Science of Scheming)

Apollo Research

• $150K — $270K *

San Francisco, CA 94112 (In-Person)

Consumer Technology

Less than 5 years of experience

5 days ago

Job Overview by Ladders

Qualifications

5-7 years of experience in empirical research or AI scheming
Strong software engineering skills, especially in Python
Hands-on experience in reinforcement learning training for LLMs
Deep understanding of AI cognition and related literature
Proven analytical skills in quantitative fields
Interest in keeping up with the latest AI advancements
Ability to translate complex thoughts into experimental proposals

Responsibilities

Collaborate with top AI developers across various labs
Study reinforcement learning dynamics and their implications
Build empirical foundations to predict how scheming risks evolve as models scale
Develop innovative evaluation techniques for evaluation-aware models
Investigate unique patterns in the reasoning of advanced AI systems

Benefits

Flexible work hours and schedule
Unlimited vacation and sick leave
Up to 6 months of paid parental leave
Comprehensive health, dental, and vision insurance
Retirement savings with competitive employer matching
Lunch, dinner, and snacks provided
Yearly $1,000 professional development budget
Relocation support and visa fees covered if applicable

Full Job Description

Application deadline: We are conducting interviews actively and aim to fill this role as soon as we find someone suitable.

YOU WILL HAVE THE OPPORTUNITY TO

- Collaborate with leading AI developers. We partner with multiple labs, giving you access to a breadth of models that no single AI lab could offer. Through long-term research collaborations, your work directly impacts how the most capable AI systems are built and deployed.

- Deeply study the RL dynamics that lead to the emergence of reward-seeking, evaluation awareness or misaligned preferences. Design and train model organisms, and scale your insights to frontier systems.

- Work towards "Scaling laws of scheming". Build the empirical foundations to predict how scheming risks evolve as models scale in capability.

- Develop novel and ambitious evaluation techniques that have a chance of scaling to highly evaluation-aware models.

- Deep dive into AI cognition. Discover patterns in the reasoning processes of frontier AI systems that no one else has ever observed before.

Note: We are not hiring for interpretability roles.

KEY REQUIREMENTS

A diverse range of skill sets will be required to drive our research agenda forward and we don't expect any single candidate to fulfill all the characteristics below. That being said, a successful candidate likely displays excellence at one or several of the following:

- Fast-paced empirical research: You can design and execute experiments. You always strive to speed up iteration cycles and relentlessly drive progress towards the next empirical milestone.

- Conceptual insights about scheming: You have thought deeply about the problem of AI scheming and are familiar with the relevant literature. You are able to turn vague and undefined concepts into concrete and insightful experiment proposals.

- Software engineering skills: Strong software engineering skills correlate highly with effective execution, even in an era of AI agents. Our entire stack uses Python.

- Intense interest in AI progress: You always stay up to date on the latest model releases, and continuously tinker with new and creative AI workflows to speed up your work. You are fascinated by AI cognition and actively spend time trying to understand how AI systems think.

- Experience RL-training LLMs: You have hands-on experience in training LLMs via reinforcement learning. You have encountered and resolved countless painful issues from GPU failures to debugging learning instabilities.

- Strong analytical skills: You bring rigorous quantitative chops from working in fields such as scaling laws in LLMs, statistical physics, dynamical systems, or applied statistics. You're comfortable building mathematical models of empirical phenomena and know how to extract signal from noisy data.

We want to emphasize that people who feel they don't fulfill all of these characteristics but think they would nonetheless be a good fit for the position are strongly encouraged to apply. We believe that excellent candidates can come from a variety of backgrounds and are excited to give you opportunities to shine. We don't require a formal background or industry experience and welcome self-taught candidates.

BENEFITS

This role offers a market-competitive salary, equity, and competitive benefits.
Salary: 100k - 200k GBP (~150k - 270k USD)
Flexible work hours and schedule
Unlimited vacation
Unlimited sick leave
Up to 6 months of paid parental leave
Comprehensive health, dental and vision insurance
Retirement savings with competitive employer matching (e.g. 401(k) for US employees)
Lunch, dinner, and snacks are provided for all employees on workdays
Paid work trips, including staff retreats, business trips, and relevant conferences
A yearly $1,000 (USD) professional development budget
Relocation support and visa fees (if applicable)

LOGISTICS

Time Allocation: Full-time
Location: This is an in-person role working out of our London or San Francisco office. We offer flexible working hours and work-from-home arrangements.
Visa sponsorship: We sponsor visas in both the UK and US. Sponsorship isn't guaranteed for every role or candidate, but if we make you an offer, we'll work with you to find the right visa route.

ABOUT THE TEAM

The current evals team consists of Jérémy Scheurer, Alex Meinke, Bronson Schoen, Felix Hofstatter, Axel Højmark, Teun van der Weij, Alex Lloyd, and Mia Hopman. Alex Meinke coordinates the research agenda with guidance from Marius Hobbhahn, though team members lead individual projects. You will mostly work with the evals team as well as our team of software engineers, but you will likely sometimes interact with the governance team to translate technical knowledge into concrete recommendations. You can find our full team here.

How to apply: Please complete the application form with your CV. A cover letter is optional. Please also feel free to share links to relevant work samples.

About the interview process: Our multi-stage process includes a screening interview, a take-home test (approx. 2.5 hours), 3 technical interviews, and a final interview with Marius (CEO). The technical interviews will be closely related to tasks the candidate would do on the job. There are no LeetCode-style general coding interviews. If you want to prepare for the interviews, we suggest working on hands-on LLM evals projects (e.g. as suggested in our starter guide), such as building LM agent evaluations in Inspect.

* Ladders Estimates
