Eli Lilly

Data Architect, Data Foundry

Eli Lilly
$132K - $193K *
Pharmaceuticals & Biotech
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • B.S. or M.S. in Computer Science, Data Science, or related STEM field; Ph.D. preferred for ontology and knowledge graph roles.
  • 7+ years (B.S.) or 5+ years (M.S.) of experience in data architecture, data engineering, or scientific informatics.
  • Proficiency in SQL and various database paradigms (relational, graph, document, etc.).
  • Must be authorized to work in the U.S. without sponsorship.
  • Experience with semantic web technologies and OLTP/OLAP principles.

Responsibilities

  • Design and implement data models, schemas, and ontologies for scientific data.
  • Define metadata standards and FAIR-compliant frameworks in partnership with data governance teams.
  • Create interoperable scientific data using semantic standards like RDF and OWL.
  • Architect lakehouse data solutions on platforms like Databricks or Snowflake.
  • Optimize ETL/ELT processes using tools such as Spark and dbt.
  • Develop knowledge graphs that capture relationships in discovery data.
  • Collaborate with stakeholders to ensure data architectures are performant and compliant.

Benefits

  • Eligibility to participate in a company-sponsored 401(k) and pension plan.
  • Comprehensive medical, dental, vision, and prescription drug benefits.
  • Vacation benefits and flexible spending accounts for healthcare and dependent care.
  • Access to well-being benefits like fitness programs and employee assistance.
  • Opportunities for individual and company performance-based bonuses.

Full Job Description

Position: Data Architect, Data Foundry

Location: San Diego, CA; San Francisco, CA; Boston, MA; Louisville, CO; Indianapolis, IN

Overview

Lilly Small Molecule Discovery is purpose-built to create molecules that make life better for people. Discovery Technology and Platforms (DTP) accelerates molecule discovery by building optimized foundational platforms, streamlining lab operations through advanced technologies and data connectivity, and investing in novel capabilities.

Data Foundry is a multidisciplinary team within DTP that enables AI-native drug discovery through four integrated pillars: Architecture4Insight (data infrastructure and scientific software), Methods4Insight (analytical and computational methods), Automation & Scale4Insight (lab automation and agentic workflows), and Preparedness4Insight (data governance and readiness). These pillars empower every Lilly scientist to make optimal decisions by providing seamless access to data, insights, and AI-driven capabilities, serving both human scientists and autonomous AI agents.

Position Summary

We are seeking Data Architects at multiple levels to design and build the data infrastructure that makes AI-native drug discovery possible. You will create the schemas, ontologies, data models, knowledge graphs, and platform architectures that transform raw scientific data into machine-actionable, FAIR-compliant, insight-ready assets that serve both discovery scientists and autonomous AI agents.

This role is the foundation of Architecture4Insight. Everything the software engineering team builds (pipelines, APIs, prototypes) depends on the data models and platform architecture this team designs. You will work with deep knowledge of scientific data (chemical, biological, HTE, automation-generated) to create custom-fit solutions, then partner with Tech@Lilly to scale and maintain them. The role spans three focus areas depending on expertise: data modeling & ontologies, data platform & lakehouse architecture, and knowledge graph & specialized data systems.

Responsibilities

Data Modeling & Ontologies
  • Design and implement data models, schemas, and ontologies for chemical, biological, and automation-generated data that serve discovery workflows across the portfolio.
  • Define and maintain controlled vocabularies, metadata standards, and FAIR-compliant data frameworks in partnership with Preparedness4Insight.
  • Implement semantic data standards (RDF, OWL, SPARQL) and ontology engineering practices to create interoperable, machine-readable scientific data.

Data Platform & Lakehouse Architecture
  • Design and implement data lakehouse architecture using modern platforms (Databricks, Snowflake, or equivalent), including data storage patterns, partitioning strategies, and query optimization.
  • Build and optimize ETL/ELT pipelines using Spark, dbt, or similar tools to transform raw scientific data into analytical and ML-ready formats.
  • Implement real-time and streaming data integration (Kafka, Kinesis, event-driven patterns) connecting LIMS, instruments, and lab automation systems to the data infrastructure.

Knowledge Graph & Specialized Data Systems
  • Design and implement knowledge graphs (Neo4j, Amazon Neptune, TigerGraph) that capture molecular, target, pathway, and experimental relationships across the discovery landscape.
  • Architect specialized data solutions: array databases (TileDB) for genomics/imaging, document stores (MongoDB) for experimental records, and vector databases for embedding-based retrieval supporting ML and RAG workflows.
  • Build query and traversal patterns that enable scientists and AI agents to ask relational questions across the entire data landscape.

Cross-Functional Partnership
  • Partner with scientific software engineers to ensure data architectures are implementable, performant, and well-documented.
  • Collaborate with Methods4Insight to design data structures that support analytical model training, deployment, and evaluation.
  • Work with Tech@Lilly to define scaling strategies, ensure enterprise compliance, and transition data architectures to production-grade management.
  • Contribute to build-versus-buy-versus-adopt decisions by evaluating commercial and open-source data platforms against Data Foundry requirements.

Basic Requirements
  • B.S. or M.S. in Computer Science, Data Science, Bioinformatics, Computational Biology, Information Science, or related STEM field; Ph.D. valued for ontology and knowledge graph roles.
  • B.S. with 7+ years or M.S. with 5+ years of experience in data architecture, data engineering, or scientific informatics.
  • SQL skills and experience in multiple database paradigms (relational, graph, document, columnar, key-value).
  • Qualified applicants must be authorized to work in the United States on a full-time basis. Lilly will not provide support for or sponsor work authorization or visas for this role, including but not limited to F-1 CPT, F-1 OPT, F-1 STEM OPT, J-1, H-1B, TN, O-1, E-3, H-1B1, or L-1.

Preferred Qualifications
  • Expertise in at least one of: data modeling/ontologies, data platform engineering (Databricks, Snowflake, Spark), or graph/specialized databases (Neo4j, Neptune, MongoDB).
  • Familiarity with cloud platforms (AWS, Azure, or GCP) and modern data integration patterns.
  • Understanding of scientific data types and experimental workflows in life sciences or pharma (chemical, biological, HTE data).
  • Strong communication skills with ability to translate data architecture concepts for both technical and scientific audiences.
  • Pharmaceutical or biotech research industry experience, particularly in discovery data management or research informatics.
  • Experience with semantic web technologies: RDF, OWL, SPARQL, Protégé, or equivalent ontology engineering tools.
  • Hands-on experience with graph databases (Neo4j, Neptune, TigerGraph) and knowledge graph design patterns for scientific data.
  • Data lakehouse architecture experience: Databricks (Delta Lake, Unity Catalog), Snowflake, or equivalent; ETL/ELT with Spark, dbt.
  • Experience with streaming/real-time data platforms (Kafka, Kinesis, Flink) and event-driven architectures.
  • Familiarity with LIMS, ELN systems (e.g., Benchling), and laboratory instrument data integration.
  • Experience with vector databases (Pinecone, Weaviate, pgvector) and embedding-based retrieval for ML/RAG applications.
  • Array database experience (TileDB, Zarr) for genomics, imaging, or high-dimensional scientific data.
  • Experience with bioinformatics data formats (FASTA, BAM/CRAM, VCF) and biological sequence databases; familiarity with NGS data pipelines and proteomics data management.
  • FAIR data principles implementation experience and Data Readiness Level frameworks.
  • Scientific data standards and controlled vocabularies in chemistry (InChI, SMILES) or biology (Gene Ontology, UniProt, pathway databases such as Reactome or KEGG).

Actual compensation will depend on a candidate's education, experience, skills, and geographic location. The anticipated wage for this position is $132,000 - $193,600.

Full-time equivalent employees also will be eligible for a company bonus (depending, in part, on company and individual performance). In addition, Lilly offers a comprehensive benefit program to eligible employees, including eligibility to participate in a company-sponsored 401(k); pension; vacation benefits; eligibility for medical, dental, vision and prescription drug benefits; flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts); life insurance and death benefits; certain time off and leave of absence benefits; and well-being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities). Lilly reserves the right to amend, modify, or terminate its compensation and benefit programs in its sole discretion and Lilly's compensation practices and guidelines will apply regarding the details of any promotion or transfer of Lilly employees.

#WeAreLilly

About Eli Lilly

ICOS Corporation is a biotechnology company that engages in the discovery, development, and commercialization of therapeutic products. It is engaged in the commercialization of treatments for unmet medical conditions, such as benign prostatic hyperplasia, hypertension, pulmonary arterial hypertension, cancer, and inflammatory diseases. It is the developer of a treatment known as Cialis (tadalafil), a product for the treatment of erectile dysfunction through its joint venture with Eli Lilly and Company in North America and Europe. It is also engaged in contract manufacturing services for third parties. It is in a strategic alliance with Solvay Pharmaceuticals, Inc. ICOS Corporation was established in 1989, based in Bothell, Washington. It is currently operated by Eli Lilly and Company.

Eli Lilly Careers

Joining Eli Lilly offers an unparalleled opportunity to become part of a leading global team dedicated to creating a healthier future. As a company revered for its commitment to innovation and leadership in the pharmaceutical industry, Eli Lilly is where your professional journey can flourish.

Work You’ll Do

At Eli Lilly, we are passionate about transforming patient care and advancing medical innovation. Our team at Eli Lilly is at the forefront of developing groundbreaking solutions in healthcare. By joining us, you will collaborate with some of the brightest minds in the industry, using cutting-edge technology to make real-world impacts.

Lead with Innovation and Leadership

Eli Lilly stands out in the marketplace by integrating deep industry expertise with robust research and development efforts. We are looking for professionals who are eager to drive change and lead the way in developing therapeutic breakthroughs.

Explore Job Opportunities and Growth

Eli Lilly offers a variety of career paths, including full-time positions and internships, across multiple functions such as research, marketing, IT, and sales. Whether you are a seasoned professional or a recent graduate, Eli Lilly provides an environment that promotes career growth and learning opportunities. Our commitment to diversity and leadership training ensures that every employee can achieve their potential.

Be Part of Our Team

Our team at Eli Lilly is committed to excellence and driven by a mission to improve lives. Employees enjoy a supportive culture that values collaboration, creativity, and diversity. We believe that a diverse workforce fosters innovation and helps us better connect with the communities we serve.

Benefits and Culture

Eli Lilly is dedicated to supporting our employees, offering competitive benefits, wellness programs, and comprehensive health care. Our culture is built on a foundation of respect, integrity, and quality, making Eli Lilly not just a great place to work, but a community to grow with.

Networking and Professional Development

Eli Lilly encourages continuous professional development and networking. With access to various training programs and mentorship opportunities, employees can enhance their skills and advance their careers. Our leadership is committed to nurturing talent through effective training and development strategies.

Join Our Team

Discover the exciting job opportunities at Eli Lilly by exploring open positions that match your skills and interests. We are continuously hiring and looking for individuals who are passionate, innovative, and ready to contribute to our mission of making life better for people around the globe.

Stay Connected

Keep up to date with the latest at Eli Lilly by following our careers blog. Gain insights from industry leaders and get tips on everything from crafting the perfect resume to preparing for your interview. Eli Lilly is not just a company; it's a place where you can make a difference. Explore the positions available and find out how your talents can help change the world.

Stay ahead in your career with Eli Lilly, where innovation, leadership, and a commitment to diversity and growth lead the way to future advancements.

Learn more about Eli Lilly
Size: 35,000 employees
Market Cap: $344.2 billion
Industry
Net Income: $6.1 billion
Founded: 1876
5 Year Trend: +5.9%
Revenue: $24.5 billion
NASDAQ
