Scientific Evals

Edison Scientific

• $130K — $150K *

San Francisco, CA 94112

In-Person

Pharmaceuticals & Biotech

Less than 5 years of experience

More than 3 months ago


Job Overview by Ladders

Qualifications

Graduate-level training in biology, biochemistry, computational biology, or related field with research experience.
Knowledge of machine learning concepts, especially deep learning and large language models.
Proficiency in Python for data processing and analysis.
Ability to discern expert scientific reasoning from basic pattern recognition.
Detail-oriented with willingness to handle meticulous tasks when necessary.
Creative problem-solver comfortable with ambiguity and open-ended challenges.
Strong organizational and communication skills for managing multiple projects.

Responsibilities

Design benchmarks reflecting the complexities of biological research.
Curate and validate biological datasets for scientific rigor.
Analyze model outputs to identify issues and enhance datasets and evaluation methods.
Work with AI/ML researchers to translate scientific concepts into effective training signals.
Manage workflows and coordinate with domain experts while tracking project progress.

Benefits

Collaborative, on-site work environment that fosters creativity and team energy.
Exposure to cutting-edge AI applications in scientific research.
Opportunities for professional growth in a pioneering technology sector.

Full Job Description

About

Edison Scientific builds and commercializes AI agents for science. It shares FutureHouse's mission to build an AI Scientist: scaling autonomous research, productizing it, and applying it to critical challenges such as drug development.

Role

We are seeking an ambitious, scientifically grounded person to join our team focused on developing rigorous benchmarks and training datasets that advance AI capabilities in biology. This role sits at the intersection of biology, data curation, and machine learning, and is ideal for someone with deep scientific training who is excited to shape how frontier AI systems learn to do science.

Responsibilities

Design benchmarks that capture the complexity of real biological research, drawing on your domain expertise to identify what makes scientific reasoning hard. This will include creating open-ended scientific benchmarks and building on prior work such as LAB-Bench and BixBench.
Curate and vet biological datasets to ensure scientific rigor.
Analyze model outputs, identify failure modes, and contribute to iterative improvements in both datasets and evaluation criteria.
Collaborate with AI/ML researchers to translate scientific intuition into training signal, helping AI systems learn not just facts but how scientists think.
Coordinate operations and manage workflows, including working with domain experts, tracking task progress, and maintaining documentation.

Qualifications

Have graduate-level training in biology, biochemistry, computational biology, or a related field, with hands-on research experience.
Have working knowledge of machine learning concepts, particularly deep learning and large language models.
Are comfortable with Python and can build workflows for data processing, analysis, and experimentation.
Possess strong scientific taste and can identify what distinguishes expert-level reasoning from surface-level pattern matching.
Are detail-oriented and willing to take on high-value but occasionally tedious work.
Are energized by ambiguous, open-ended problems that require creativity, collaboration, and first-principles thinking to solve.
Are organized and communicative, able to manage multiple workstreams and coordinate across teams.

Bonus points for:

Prior experience creating evaluation datasets, annotation guidelines, or working on human-in-the-loop data pipelines.
Experience with bioinformatics pipelines, biological databases, or sequence analysis tools.
Hands-on experience fine-tuning or evaluating large language models, or familiarity with RLHF and preference-based training.
Publications or research experience in areas relevant to AI for science.

Location + Compensation

Collaboration is at the heart of discovery. We work on-site to stay close to the science, move faster as a team, and share the kind of energy that only happens when smart, curious people build together, in a space that we love to be in!
- Location: San Francisco (Dogpatch)
At Edison Scientific, we know that titles can cover a range of experience levels. Actual base pay will depend on factors such as skills, experience, and scope of responsibility. Compensation ranges may evolve as we continue to grow. In addition to base pay, team members may be eligible for equity, benefits, and other perks.
- Compensation: $130,000+ (pending experience)

* Ladders Estimates
