Senior Biological Data Architect

Rancho BioSciences

• $145K — $187K *

US-Anywhere, Remote in United States

Pharmaceuticals & Biotech

Less than 5 years of experience

Today

Job Overview by Ladders

Qualifications

PhD in Life Sciences or equivalent experience in biomedical data.
Expertise in conceptual, logical, and canonical data modeling for complex biomedical domains.
Experience with schema modeling frameworks like LinkML.
Familiarity with YAML-based schema authoring.
Understanding of FAIR principles and persistent identifiers.
Knowledge of biomedical ontologies and controlled vocabularies.
Experience with semantic web technologies such as RDF and SPARQL.
Proficiency in Python, R, or SQL for data validation and conformance testing.

Responsibilities

Collaborate with stakeholders to develop comprehensive biomedical data models.
Lead harmonization efforts including vocabulary alignment and provenance capture.
Define schemas and controlled vocabularies in partnership with engineering teams.
Design models to support data pipelines and analytical workflows.
Establish data quality validation rules and checks.
Manage schema lifecycle including version control and documentation.
Facilitate schema review and identify modeling risks early.

Benefits

Fully remote work environment with flexible location options.
Opportunities to collaborate with leading pharmaceutical organizations.
Engagement in impactful projects that contribute to human health.
Work within a diverse team of experts across various scientific fields.

Full Job Description

About the role

We are seeking a full-time contractor to serve as Senior Biological Data Architect, designing, harmonizing, and governing complex biomedical data models on behalf of our pharmaceutical, academic, and institutional clients. The successful candidate will be an expert problem solver with deep expertise in conceptual, logical, and canonical data modeling for biomedical and scientific domains, including disease biology, genetics, translational research, and drug development. You will play a central role in client initiatives that deliver FAIR-aligned data products enabling rapid query and decision-making by R&D scientists.
We are a Data Curation company collaborating with some of the most renowned pharmaceutical organizations in the world. Our team of scientists, curators, computational biologists, data scientists, knowledge engineers, and solution developers is distributed across the country; we support talented people living where they choose, working collaboratively on projects that have real impact on human health.
While the role is fully remote, candidates will be expected to spend the majority of their working time overlapping with East Coast US or UK working hours.

What you'll do

Partner with scientific and technical stakeholders to elicit requirements and propose canonical data models that represent the full breadth of biomedical concepts relevant to target discovery, disease understanding, and translational research, along with the evidence and provenance that support them.
Design and lead source-to-canonical harmonization activities, covering vocabulary alignment, persistent identifier assignment, and lineage and provenance capture.
Define schemas, controlled vocabularies, identifier strategies, and ontology bindings in collaboration with knowledge engineering, curation, data engineering, and platform teams.
Design models that power data pipelines, APIs, knowledge graphs, analytical workflows, and downstream R&D query use cases.
Establish validation rules and data quality checks covering ontology term validation, range and cardinality checks, required-field enforcement, ID and label consistency, cross-field consistency, and provenance completeness.
Manage the full schema lifecycle: repository management (e.g., GitHub-based), semantic versioning, changelogs, tagged releases, data dictionaries, metadata catalogs, and downstream impact assessments.
Drive schema review, approval, and publication processes; identify modeling risks early, such as metadata gaps, ontology conflicts, source data quality issues, lineage gaps, and compatibility risks.
Lead modeling strategy spanning harmonization, pipeline validation, knowledge graphs, and FAIR data product delivery.
Translate ambiguous scientific requirements into clear, durable canonical models and make defensible, documented decisions on ontology reuse, extension, and mapping.
Design modular, reusable, future-proof models aligned with FAIR and enterprise standards, with consistent persistent identifier and provenance conventions across data assets.
Communicate strategies, trade-offs, and progress clearly to clients and internal teams.

Qualifications

Required:

PhD in Life Sciences (or equivalent demonstrated expertise) with first-hand experience working with biomedical or research data.
Strong conceptual, logical, and canonical data modeling experience for complex scientific or biomedical domains.
Hands-on experience with LinkML or equivalent schema modeling frameworks, including comfort defining classes, slots, ranges, identifiers, required fields, constraints, cardinality, descriptions, and ontology bindings.
Working knowledge of YAML-based schema authoring.
Solid grasp of FAIR principles (findability, accessibility, interoperability, reusability), including persistent identifiers, metadata standards, provenance, and schema versioning.
Experience with biomedical ontologies and controlled vocabularies, including familiarity with public ontology resources covering genes, diseases, phenotypes, anatomy, cell types, assays, units, and evidence.
Familiarity with semantic web technologies such as RDF, OWL, JSON-LD, SHACL, ShEx, and SPARQL, and with knowledge graph modeling.
Proven experience designing Entity Relationship Diagrams and Conceptual and Logical Data Models.
Experience with schema and model registries, data catalogs, metadata registries, and data dictionary management.
Proficiency in Python, R, or SQL for model conformance testing, ontology mapping, or data quality validation (notebook-based workflows a plus).
Experience with SDLC methodologies, unit and integration testing, and documentation practices.
AI awareness: comfortable evaluating how AI-driven curation and mapping tools can accelerate modeling, harmonization, and validation workflows.

Nice to Have:

Experience working with modern cloud data platforms and data lake environments such as Snowflake or Databricks.
Hands-on use of AI-powered coding assistants and established collaboration workflows that incorporate them into day-to-day modeling, documentation, or validation work.

The pay range for this role is:

70 - 90 USD per hour (United States)

* Ladders Estimates
