Research Scientist, Foundational Data Science

Prior Labs

$120K — $150K *
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Extensive experience solving diverse data-science problems with a focus on broad task performance.
  • Proficient in a variety of machine learning techniques, including gradient-boosted trees and deep learning.
  • In-depth knowledge of dataset defects and their implications for training and benchmarking.
  • Passionate about foundational data work, valuing datasets and benchmarks as crucial assets.
  • Ability to excel as a senior individual contributor in a flexible, early-stage environment.

Responsibilities

  • Invent and develop frontier tools to enhance TabPFN's capabilities and adaptability.
  • Set the research direction by identifying valuable model capabilities and benchmarks.
  • Incorporate external research and customer insights to guide model and tooling innovations.
  • Establish reliable benchmarks based on real-world structured data to ensure practical performance optimization.
  • Implement baseline models to provide standards for comparison and identify areas for improvement.
  • Create scalable automated pipelines integrating human oversight for rigorous data handling.

Benefits

  • Work with a small, ambitious team focused on solving complex AI challenges.
  • Collaborate closely with top-tier researchers dedicated to craft quality and impact.
  • Enjoy a fast-paced environment that values rigorous thinking and meticulous execution.
  • Regular opportunities for team offsites to foster collaboration and celebrate achievements.
  • Roles are primarily based in the Berlin, Freiburg, or New York offices; remote work is possible in exceptional cases.
Full Job Description
What you'll do

This role is foundational data science: building the foundations of tabular foundation models (TFMs) so a single model can solve data-science problems across the board. Roughly half the work is inventing new frontier tools for TFMs, and the other half is building the dataset and benchmark bedrock they stand on.
  • Invent and build the frontier tools that extend TabPFN, including its thinking, scaling, and agentic capabilities, and the new methods that let one model generalize across the full landscape of data-science problems. This is the most open-ended part of the work and grows over time.
  • Set the research direction by deciding which model capabilities and benchmarks are worth pursuing, choosing what is worth solving rather than optimizing a score someone else set.
  • Bring in external research and real customer needs to shape new model and tooling directions, and publish frontier results that move the field forward.
  • Build trustworthy benchmarks from the structured data behind real, high-impact problems, so the team optimizes for real-world performance rather than one leaderboard.
  • Faithfully implement the baselines and competitor models that set the gold standard of applied data science, giving the team a read on where TabPFN leads and where there is room to improve.
  • Build an automated, agentic pipeline with a human in the loop so this data and benchmark foundation scales to far larger volumes without losing rigor. The pipeline is itself a genuinely new tool.


What we're looking for
  • You have solved data-science problems across many domains and datasets to a high standard, optimizing for strong performance across a whole suite of tasks rather than the single best score on one.
  • You work undogmatically across the ML toolbox, including getting strong results with gradient-boosted trees (such as XGBoost) and not only with deep learning.
  • You understand the common categories of dataset defects (leakage, label noise, distribution shift, duplication, mislabeled targets, and similar) and why each corrupts a training or benchmark signal.
  • You are energized by foundational work, valuing the dataset and benchmark bedrock as much as the frontier tooling, and you have taken on hard problems others passed over.
  • You thrive as a senior individual contributor in an ambiguous, early-stage, low-process environment. You are opinionated on best practice in data science and can make good judgement calls on approaches to complex problems.


Nice to have
  • Experience building or extending evaluation harnesses, benchmark suites, or experiment frameworks that others rely on.
  • Experience building LLM- or agent-assisted pipelines with a human in the loop to scale a previously manual workflow.
  • Experience acting as the link between external research or customer needs and an internal model or product roadmap.
  • Prior work on tabular, structured-data, or foundation-model problems, or helping shape an emerging research subfield through community work.


Life at Prior Labs
We're a small, ambitious team solving one of the hardest problems in AI, and we're just getting started. You'll work closely with world-class researchers and builders who care deeply about the quality of their craft, the impact of their work, and the people they work with.

We move fast, we think rigorously, and we take the time to do things right. If you're excited by hard problems, motivated by real-world impact, and want to be part of building something that matters, we'd love to hear from you.

We're building our teams in Berlin, Freiburg, and New York, and we believe that when you're working on something as hard and exciting as TabPFN, being in the same room matters. Most of our roles are based in one of our offices, but great people come from everywhere, and in exceptional cases we're open to remote. Remote work usually involves frequent travel to one of our offices, and the whole company comes together regularly for offsites to think, build, and celebrate together.
