About the role
We're looking for talented researchers currently enrolled in MS / PhD programs to collaborate on a research project focused on frontier benchmarks and environments for long-horizon AI agents. This will require 1) identifying failure modes in frontier models, 2) developing rigorous benchmarks that evaluate how well frontier agents perform on complex, realistic tasks requiring long-horizon reasoning and tool use in dynamic environments, and 3) training autonomous agents that can reason, plan, and act over extended time horizons.
We can accommodate full-time or part-time engagements. The residency is intended to culminate in a publication and, if there is a mutual fit, a transition into a full-time role. If you're interested in joining Polymath but are not currently a student, please apply to the Member of Technical Staff role.
You'll be a good fit if you:
- Are currently pursuing an MS or PhD program in Computer Science or a related field
- Have experience with reinforcement learning, benchmarking frontier models, or model post-training
- Have experience with systems engineering and can write production-quality code
- Have a strong track record of publications
- Have high agency, move quickly, and enjoy working on open-ended research problems