Senior Data Scientist

Probably Genetic

• $130K — $180K *

San Francisco, CA 94112 (In-Person)

Healthcare

5 - 7 years of experience

1 month ago

Job Overview by Ladders

Qualifications

7+ years of experience in data science or machine learning engineering
Strong proficiency in Python and core data science tools like pandas, NumPy, scikit-learn, PySpark, and SQL
Demonstrated experience in end-to-end machine learning processes from problem definition to monitoring
Familiarity with NLP techniques and practical application of language models
Comfortable with prompt engineering and external AI API performance evaluation
Ability to operate with high ownership in fast-paced, lean environments
Strong analytical communication skills for translating complex data insights to diverse audiences

Responsibilities

Own the complete development and deployment of PG's predictive diagnostic AI models
Run prospective testing experiments to continually enhance model performance
Build and maintain a synthetic patient data pipeline for research and model development
Optimize patient intake experience using NLP and data analysis
Manage API usage and cost optimization across PG's AI ecosystem
Conduct strategic analyses to inform product and program insights
Establish MLOps infrastructure for model monitoring and operational processes
Engage in blue sky initiatives focused on extracting value from data

Benefits

An engaging and supportive team focused on improving lives
Fair compensation with competitive early-stage equity grants
Generous flexible time off policy that is actively utilized
12 weeks of parental leave for all eligible employees
Hybrid, flexible work environment promoting autonomy
A pet-friendly office located in Downtown SF near transit
Work from anywhere policy allowing up to 4 weeks per year
Regular team retreats to exciting destinations
Comprehensive health benefits including medical, dental, vision, therapy, FSA, and 401k

Full Job Description

About the role

We are looking for a Senior Data Scientist who will own some of the most consequential diagnostic AI in rare disease: building, validating, and operationalizing the models that help us find and diagnose patients who have never had a name for their disease, powering the analytical rigor behind our testing programs, and shaping how we use data to make smarter product decisions.

What you will do

Own the end-to-end development, validation, and operationalization of PG's predictive diagnostic AI models - from feature engineering through production deployment - that power program eligibility decisions and clinical decisions for patients
Run prospective testing experiments: apply diagnostic models to undiagnosed patients, coordinate testing, and track outcomes to continuously improve model performance
Build and maintain PG's synthetic patient data pipeline, a critical deliverable for our research programs and a key input to our own model development lifecycle
Optimize our patient intake experience using NLP and multimodal data analysis to determine which questions to ask, in what order, to maximize data quality and conversion
Own API usage and cost optimization across PG's AI stack, including prompt engineering, model evaluation, and ongoing performance monitoring
Conduct ad hoc strategic analyses that inform product prioritization, support causality assessment, and generate customer-facing program insights
Establish MLOps infrastructure: model monitoring, drift detection, API observability, and lightweight but durable operational processes
Have the freedom to conduct blue sky research initiatives aimed at creating value from our data
Work with Data Engineering to build a robust, scalable data foundation that supports all of the above

Who you are

We are looking for a few specific things that will help you succeed in this role:

7+ years of experience in data science, machine learning engineering, or a closely related field
Strong Python proficiency and fluency across the core data science stack: pandas, NumPy, scikit-learn, PySpark, and SQL
Demonstrated end-to-end ML experience: you have taken models from problem definition through feature engineering, validation, deployment, and monitoring in a production environment
Experience with NLP techniques and applying language models to real-world problems
Comfort with prompt engineering and evaluating external AI API performance (e.g., OpenAI)
A track record of operating with high ownership in lean, fast-moving environments where you have had to build structure as much as execute within it
Strong analytical communication skills - you can translate complex model outputs and data findings into clear, actionable narratives for technical and non-technical audiences alike

Some things that are not required, but that you will learn on the job:

Experience with Databricks or similar lakehouse/ML platform environments
Familiarity with synthetic data generation techniques
Domain knowledge in healthcare, rare disease, genomics, or clinical research
Experience with MLOps tooling and building observability infrastructure from scratch
Exposure to biopharma or insurance analytics use cases

What we offer at Probably Genetic:

An engaging and supportive team all on a mission to improve lives
Fair and equitable compensation with competitive early-stage equity grants
Generous flexible time off policy that we actually use
Parental Leave Benefits (12 weeks for both birthing and non-birthing)
Hybrid, flexible work with high trust and autonomy
A bright, inviting, pet-friendly office in Downtown SF near transit
A "work from anywhere" policy, up to 4 weeks a year
Regular team retreats in exciting destinations
Health Benefits including medical, dental, vision, therapy, FSA, and 401k
And so much more!

* Ladders Estimates
