About the Role
At LatchBio, we build the benchmarks that frontier AI labs use to evaluate and train models on biological reasoning. SurveillanceBench tests whether AI agents can execute real pathogen surveillance and epidemiological workflows (wastewater sequencing, air sampling, and metagenomic analysis) with the speed and precision required for outbreak detection and public health response.
We're looking for scientists with deep hands-on experience in running computational workflows for infectious disease surveillance, molecular epidemiology, wastewater genomics, environmental metagenomics, outbreak response, and pathogen detection. Your role is to design evaluation tasks grounded in real surveillance science: tasks that test whether AI agents can replicate the workflows epidemiologists and surveillance bioinformaticians execute in the field.
You will review real-world surveillance workflows, datasets, and research papers to establish ground truth for AI evaluations. You'll translate tacit knowledge about what constitutes good threat detection into structured evaluation tasks. Your job is to define success criteria, justify grading decisions, and help build a corpus of surveillance evaluations that frontier labs can use to train models on real epidemiological reasoning.
What You'll Do
- Review surveillance workflows and real case studies to define what good threat detection looks like
- Source examples from epidemiology literature, wastewater programs, and outbreak response
- Convert those into structured evaluation tasks with clear success criteria
- Document your reasoning so evaluations are defensible and consistent
- Test AI agents against your tasks: do they interpret samples correctly, chain tools appropriately, reach sound conclusions?
- Build a corpus of surveillance evaluations grounded in real workflows
What We're Looking For
- Hands-on computational surveillance experience. You've run wastewater, air, or clinical metagenomic pipelines. You know taxonomic classification, assembly, variant calling, and phylogenetics. You've worked with complex environmental samples and understand how sample type and tool choice affect results.
- Threat detection judgment. You can distinguish real threats from false positives. You understand sensitivity/specificity tradeoffs and can define success criteria for detection work.
- Task construction. You can translate domain knowledge into structured evaluation tasks with clear rubrics.
- Sample type literacy. You understand wastewater vs. air vs. clinical vs. field sequencing. You know how tool choice and parameters differ across contexts.
- Pace. You work fast. Government background is fine if you were a doer during crises (pandemic, outbreak response, surveillance scale-up).
Nice to Have
- Network in the field (can refer 5-10 experts)
- Agentic workflow intuition (understand how AI agents chain tools and where they might fail)
- Kill chain thinking (how individual capabilities compose into broader insights)
- Field sequencing experience (MinION, mobile labs, real-time analysis)
- Government biodefense background (BioWatch, SOCOM, NMRC BDRD)
- Familiarity with frontier AI eval frameworks
- Publications in surveillance genomics or wastewater epidemiology
Ideal Backgrounds
- Government surveillance bioinformatician (CDC, state biodefense labs, military)
- Field sequencing specialist (deployed sequencers, mobile labs, real-time analysis)
- Environmental genomics researcher (wastewater/air/clinical metagenomic programs)
- Outbreak response epidemiologist (CDC, state health departments, field experience)
Compensation & Logistics
- Salary: $120k-$180k (base + performance pay that scales with output) (CA)
- Produce 50 evals/week at baseline; exceed that and performance pay scales linearly
- Equity
- 100% premium-covered Blue Shield Platinum health plan ($0/$0)
- Unlimited PTO
- Visa Sponsorship
Hiring Process
- Intro call with Saul (Technical Recruiter)
- Technical interview with Harmon (Pod Lead)
- Cultural interview with Jordan (Chief of Staff)
- Offer
Location: San Francisco, CA. In-person.