Eli Lilly

Advisor - Scientific Data Engineer

Eli Lilly, $166K - $266K *
Pharmaceuticals & Biotech
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • Bachelor's degree in Computer Science, Data Engineering, Bioinformatics, or a related field plus 8 years of data engineering experience OR a Master's degree with 5 years of experience.
  • Proven track record of building data pipelines and AI/ML-optimized ETL/ELT workflows.

Responsibilities

  • Design and construct the architecture for harmonizing raw and processed omics data into AI-compatible layers.
  • Develop and optimize automated ETL/ELT pipelines for easy AI consumption.
  • Implement monitoring and validation checks to ensure data quality across multiple layers.
  • Collaborate with teams to expand data harmonization patterns among various data modalities.
  • Create and maintain a semantic layer over multi-omics databases to empower AI systems.

Benefits

  • Participation in a company-sponsored 401(k) plan.
  • Vacation benefits and flexible time-off options.
  • Access to comprehensive medical, dental, and vision plans.
  • Life insurance and death benefits.
  • Well-being programs including fitness benefits and employee assistance activities.

Full Job Description

The Opportunity

We are building something unprecedented — an AI foundation that will push the frontier on what is possible today across drug discovery research, from target identification and disease biology through translational science.

The Applied Intelligence for Discovery (AI4D) team is a newly formed group within Lilly Research Laboratories that operates at the intersection of scientific delivery and core platform development. AI4D’s mission is to connect scientists to petabyte-scale data through natural language interfaces, automated analysis workflows, and intelligent search, and to convert early deployments into repeatable system standards and evaluation practices that scale across therapeutic areas.

As a Scientific Data Engineer, you will close the gap between how data is stored and how AI systems need to consume it. You will build the semantic layer, data harmonization infrastructure, AI-ready data products, and lakehouse architecture that bridge the two, working at the intersection of the data infrastructure team and the generative AI engineers who build the systems scientists interact with.


Responsibilities

Data Harmonization and Lakehouse Architecture
  • Design and build the data architecture that transforms raw and processed omics data into harmonized, AI-consumable layers
  • Build and optimize ETL/ELT pipelines that produce denormalized views, pre-computed aggregations, embedding-ready text representations, and feature stores optimized for AI system consumption
  • Implement data quality monitoring, automated profiling, and validation checks across harmonization layers
  • Create versioned, reproducible data snapshots that support model training, evaluation, and audit requirements in a regulated environment
  • Partner with other teams to extend harmonization patterns as data modalities expand beyond genomics and proteomics into spatial transcriptomics, perturbational data (Perturb-Seq), single-cell, and digital pathology

Semantic Layer and Schema Engineering
  • Design and maintain a semantic layer over Lilly’s multi-omics databases that empowers AI systems
  • Create comprehensive schema documentation: table descriptions, column-level annotations, relationship mappings, business logic rules, and domain-specific constraints (e.g., statistical thresholds, unit conventions, experimental design metadata)
  • Develop gold-standard question/SQL pairs for each major database, in collaboration with computational biologists and Generative AI Engineers, to serve as training data, few-shot examples, and evaluation benchmarks
  • Build and maintain a data dictionary and ontology mapping layer that translates how scientists think and speak about data (gene names, pathway terms, assay types) into how the data is physically stored

AI-Ready Data Products
  • Build and manage vector embedding pipelines for scientific documents, study metadata, and structured data descriptions to power RAG-based retrieval
  • Build integration pipelines that connect heterogeneous data sources — omics databases, internal publications, electronic lab notebooks, assay results, and clinical annotations — into a unified, queryable layer
  • Develop and enforce metadata standards that ensure new data sources are AI-accessible from the point of ingestion, not retroactively
  • Design data products that serve multiple consumption patterns: direct SQL access for computational biologists, structured feeds for ML training pipelines, and semantic interfaces for LLM-powered tools

Qualifications
  • Bachelor's degree in Computer Science, Data Engineering, Bioinformatics, or a related field plus 8 years of data engineering experience, OR a Master's degree plus 5 years of data engineering experience
  • Demonstrated expertise in building data pipelines, ETL/ELT workflows, and data products that serve downstream AI/ML systems

Additional Skills/Preferences
  • PhD in a data-related field
  • Strong SQL skills and experience with complex relational database schemas (hundreds of tables, multi-level joins, domain-specific conventions)
  • Experience with modern data platform technologies, including at least one of: Databricks, Snowflake, or equivalent lakehouse platforms
  • Experience with modern data engineering tools: dbt, Spark, Airflow, or similar orchestration and transformation frameworks
  • Proficiency in Python for data processing, scripting, and pipeline development
  • Experience with cloud data platforms (AWS preferred: Redshift, Athena, Glue, S3, or similar)
  • Familiarity with at least one of: vector databases, embedding pipelines, or semantic layer tooling
  • Strong communication skills — you can work effectively with both engineers who think in schemas and scientists who think in biology
  • Experience with biomedical or scientific data: omics datasets (RNA-seq, proteomics, GWAS), clinical data, or laboratory information management systems
  • Experience in pharmaceutical, biotech, or life sciences environments
  • Familiarity with biomedical ontologies and controlled vocabularies (Gene Ontology, MeSH, ChEBI, HGNC) and their application to data integration
  • Experience building data products that serve AI/ML systems — feature stores, training datasets, evaluation benchmarks, or semantic annotations for text-to-SQL
  • Knowledge of data governance practices in regulated industries: data lineage, access controls, versioning, and auditability
  • Experience with knowledge graph technologies (Neo4j, Amazon Neptune, RDF/SPARQL) or graph-based data modeling
  • Deep experience with Databricks ecosystem: Unity Catalog for data governance, Delta Lake for ACID transactions, MLflow integration, and Databricks SQL for analytics workloads
  • Experience designing data architectures that bridge traditional bioinformatics workflows (Nextflow, R/Bioconductor) with modern lakehouse consumption patterns

Actual compensation will depend on a candidate’s education, experience, skills, and geographic location. The anticipated wage for this position is $166,500 - $266,200.

Full-time equivalent employees will also be eligible for a company bonus (depending, in part, on company and individual performance). In addition, Lilly offers a comprehensive benefit program to eligible employees, including eligibility to participate in a company-sponsored 401(k); pension; vacation benefits; eligibility for medical, dental, vision and prescription drug benefits; flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts); life insurance and death benefits; certain time off and leave of absence benefits; and well-being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities). Lilly reserves the right to amend, modify, or terminate its compensation and benefit programs in its sole discretion, and Lilly’s compensation practices and guidelines will apply regarding the details of any promotion or transfer of Lilly employees.

#WeAreLilly

About Eli Lilly

ICOS Corporation is a biotechnology company that engages in the discovery, development, and commercialization of therapeutic products. It is engaged in the commercialization of treatments for unmet medical conditions, such as benign prostatic hyperplasia, hypertension, pulmonary arterial hypertension, cancer, and inflammatory diseases. It is the developer of a treatment known as Cialis (tadalafil), a product for the treatment of erectile dysfunction through its joint venture with Eli Lilly and Company in North America and Europe. It is also engaged in contract manufacturing services for third parties. It is in a strategic alliance with Solvay Pharmaceuticals, Inc. ICOS Corporation was established in 1989, based in Bothell, Washington. It is currently operated by Eli Lilly and Company.

Eli Lilly Careers

Joining Eli Lilly offers an unparalleled opportunity to become part of a leading global team dedicated to creating a healthier future. As a company revered for its commitment to innovation and leadership in the pharmaceutical industry, Eli Lilly is where your professional journey can flourish.

Work You’ll Do

At Eli Lilly, we are passionate about transforming patient care and advancing medical innovation. Our team at Eli Lilly is at the forefront of developing groundbreaking solutions in healthcare. By joining us, you will collaborate with some of the brightest minds in the industry, using cutting-edge technology to make real-world impacts.

Lead with Innovation and Leadership

Eli Lilly stands out in the marketplace by integrating deep industry expertise with robust research and development efforts. We are looking for professionals who are eager to drive change and lead the way in developing therapeutic breakthroughs.

Explore Job Opportunities and Growth

Eli Lilly offers a variety of career paths, including full-time positions and internships, across multiple functions such as research, marketing, IT, and sales. Whether you are a seasoned professional or a recent graduate, Eli Lilly provides an environment that promotes career growth and learning opportunities. Our commitment to diversity and leadership training ensures that every employee can achieve their potential.

Be Part of Our Team

Our team at Eli Lilly is committed to excellence and driven by a mission to improve lives. Employees enjoy a supportive culture that values collaboration, creativity, and diversity. We believe that a diverse workforce fosters innovation and helps us better connect with the communities we serve.

Benefits and Culture

Eli Lilly is dedicated to supporting our employees, offering competitive benefits, wellness programs, and comprehensive health care. Our culture is built on a foundation of respect, integrity, and quality, making Eli Lilly not just a great place to work, but a community to grow with.

Networking and Professional Development

Eli Lilly encourages continuous professional development and networking. With access to various training programs and mentorship opportunities, employees can enhance their skills and advance their careers. Our leadership is committed to nurturing talent through effective training and development strategies.

Join Our Team

Discover the exciting job opportunities at Eli Lilly by exploring open positions that match your skills and interests. We are continuously hiring and looking for individuals who are passionate, innovative, and ready to contribute to our mission of making life better for people around the globe.

Stay Connected

Keep up to date with the latest at Eli Lilly by following our careers blog. Gain insights from industry leaders and get tips on everything from crafting the perfect resume to preparing for your interview. Eli Lilly is not just a company; it's a place where you can make a difference. Explore the positions available and find out how your talents can help change the world. Stay ahead in your career with Eli Lilly, where innovation, leadership, and a commitment to diversity and growth lead the way to future advancements.

Learn more about Eli Lilly
Size: 35,000 employees
Market Cap: $344.2 billion
Industry
Net Income: $6.1 billion
Founded: 1876
5 Year Trend: +5.9%
Revenue: $24.5 billion
NASDAQ
