Full-stack Software Engineer (Research team)

Apollo Research

• $135K — $270K *

San Francisco, CA 94112 (In-Person)

Information Technology

5 - 7 years of experience

Today


Job Overview by Ladders

Qualifications

5+ years of professional software engineering experience
Production-quality Python and React coding skills
Experience leading successful software projects over extended periods
Familiarity with building tech stacks in startups
Engagement with popular open-source tools or libraries
Demonstrated experience in competitive programming or relevant competitions

Responsibilities

Develop tools for analyzing model evaluation results
Balance rapid development with creating robust software
Lead major feature development from idea to implementation
Support the full user journey from evaluation to reporting
Ensure software is configurable and adaptable for user needs
Define the software roadmap collaboratively with team members
Work closely with researchers to address their challenges

Benefits

Flexible work hours and work-from-home arrangements
Unlimited vacation and sick leave
Up to 6 months of paid parental leave
Comprehensive health, dental, and vision insurance
Retirement savings with competitive employer matching
Daily meals provided on workdays
$1,000 yearly budget for professional development
Opportunities for paid work trips including conferences and retreats

Full Job Description

Application deadline: We are conducting interviews actively and aim to fill this role as soon as we find someone suitable.

We're looking for Full-stack Software Engineers who are excited to build tools for frontier AGI safety research, e.g. building and maintaining evals libraries and tools for monitoring and controlling our own LLM traffic.

REPRESENTATIVE PROJECTS

Your main objective is to develop tooling for analyzing model evaluation results. Here is a list of features that you might build and ship in your first 6 months:

- LLM-powered search that finds interesting fragments in evaluation transcripts

- Comparison views that show how conversations and scores differ between two evaluation runs

- Ability to view and analyse conversations with coding agents (Cursor, Claude Code, etc.) in addition to evaluation transcripts

- Results streaming for evaluations that are currently being run

- Collaborative editing of evaluation logs that automatically updates metrics and other derived data.

Think of this as developing an "IDE for evaluations".

Besides this, here are example auxiliary projects which you might do:

- Automated evaluation pipelines to minimize the time from getting access to a new model for pre-deployment testing to analyzing the most important results and sharing them.

- LLM agents and MCP tools to automate internal software engineering and research tasks, with sandboxes to prevent major failures

- Telemetry API and instrumentation of our existing tools, allowing us to monitor usage and improve reliability

- Upstream improvements to the Inspect framework and ecosystem, e.g. support for evaluating modern agentic scaffolds.

KEY RESPONSIBILITIES

Balance moving quickly with creating robust and performant software
Lead the development of major features from ideation to implementation
Support the entire user journey: running the evaluation, finding interesting results, analysing them, and producing reports and papers
Make the software configurable and extensible, so that users can adapt it for their needs
Collaboratively define and shape the software roadmap and priorities
Establish and advocate for good software design practices, codebase health, and coding agent practices
Work closely with researchers to understand what challenges they face
Work closely with the product team to create solutions that satisfy both our researchers and external customers

KEY REQUIREMENTS

You must have experience writing production-quality Python and React code
We value candidates from diverse backgrounds and recognise that candidates may demonstrate their skills in different ways.

For example, we might be impressed if you have:

Led the development of a successful software tool or product over an extended period (e.g. 1 year or more)
Started and built the tech stack for a company, e.g. in a start-up
Worked your way up in a large organisation, repeatedly gaining more responsibility and influencing a large part of the codebase
Authored and/or maintained a popular open-source tool or library
Placed in a prestigious programming competition (IOI, ICPC, etc.)
5+ years of professional software engineering experience

The following would be a bonus:

Experience designing rich and intuitive UIs, especially for power users
Direct work with researchers or customers
Experience working with LLM agents or LLM evaluations
Interest in AI Safety

We want to emphasize that people who feel they don't fulfill all of these characteristics but think they would be a good fit for the position nonetheless are strongly encouraged to apply. We believe that excellent candidates can come from a variety of backgrounds and are excited to give you opportunities to shine.

LOGISTICS

Time Allocation: Full-time
Location: This is an in-person role working out of our London or San Francisco office. We offer flexible working hours and wfh arrangements.
Visa sponsorship: We sponsor visas in both the UK and US. Sponsorship isn't guaranteed for every role or candidate, but if we make you an offer, we'll work with you to find the right visa route.

BENEFITS

This role offers a market-competitive salary, equity, and competitive benefits.
Salary: 100k - 200k GBP (~135k - 270k USD)
Flexible work hours and schedule
Unlimited vacation
Unlimited sick leave
Up to 6 months of paid parental leave
Comprehensive health, dental and vision insurance
Retirement savings with competitive employer matching (e.g. 401(k) for US employees)
Lunch, dinner, and snacks are provided for all employees on workdays
Paid work trips, including staff retreats, business trips, and relevant conferences
A yearly $1,000 (USD) professional development budget

ABOUT THE TEAM

The SWE team currently consists of Rusheb Shah, Andrei Matveiakin, Alex Kedrik, and Glen Rodgers. Beyond the SWE team, you will closely interact with the research scientists and engineers as the primary user group of your tools. You can find our full team here.

INTERVIEW PROCESS

Please complete the application form with your CV. A cover letter is optional. Please also feel free to share links to relevant work samples.

About the interview process: Our multi-stage process includes a screening interview, a take-home test (approx. 2 hours), 3 technical interviews, and a final interview with Marius (CEO). The technical interviews will be closely related to tasks the candidate would do on the job. There are no leetcode-style general coding interviews. If you want to prepare for the interviews, we suggest working on hands-on LLM evals projects (e.g. as suggested in our starter guide), such as building LM agent evaluations in Inspect.

* Ladders Estimates



