IntegriChain

AI Data Engineer

IntegriChain, $100K - $130K *
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 6+ years in data engineering or related fields, with production experience
  • Strong proficiency in Snowflake including SQL development and performance tuning
  • Hands-on experience with dbt or similar ELT tools
  • Proven track record in building data pipelines from raw to business-ready datasets
  • Expertise in creating semantic layers and analytical views
  • Strong understanding of ER and dimensional modeling techniques
  • Ability to translate data dictionaries into physical and semantic models

Responsibilities

  • Design and optimize Snowflake pipelines and dbt models for diverse data layers
  • Develop robust testing and documentation for dbt models and transformations
  • Implement data quality checks and incremental processing for pipelines
  • Create semantic models to enhance the accuracy of AI tools
  • Collaborate across teams to integrate various data sources into cohesive models
  • Support the alignment of datasets with enterprise definitions and consolidate data effectively

Benefits

  • Mission-driven work aimed at improving patient lives
  • Comprehensive medical benefits and unique perks such as Student Loan Reimbursement
  • Flexible Paid Time Off and Paid Parental Leave
  • 401(k) plan with company matching
  • Extensive learning resources with 700+ development courses available to all employees
Full Job Description


This role offers flexibility, but candidates must reside in Pennsylvania, New Jersey, or New York and be within a reasonable travel distance of our Philadelphia office, as regular in-person collaboration is required.

Job Description

Mission

Join the Data Science team as an AI Data Engineer responsible for building the data foundations that make enterprise AI products accurate, explainable, and scalable. In this role, you will design and implement Snowflake and dbt pipelines from raw source data to curated gold-layer datasets, create semantic models that LLM tools can use reliably, and partner with data science, product, and engineering teams to convert data dictionaries and business definitions into AI-ready data products. The ideal candidate is a strong data engineer with deep Snowflake/dbt experience and a practical understanding of how semantic layers, ER relationships, denormalized models, and metadata quality influence LLM and agent performance.

Position Overview
  • Snowflake and dbt engineering: Design, build, optimize, and operate Snowflake pipelines and dbt models across raw, curated, and gold-layer datasets.
  • AI-ready semantic modeling: Create semantic models, relationships, metrics, dimensions, and curated views that allow LLM tools and agents to answer questions accurately.
  • Data dictionary-driven delivery: Translate team-defined data dictionaries, business definitions, and source mappings into tested, governed, and reusable data products.
  • Agent consumption focus: Design datasets for AI agents, natural-language analytics, Snowflake Cortex Analyst, and other LLM-powered tools.
  • Enterprise data modeling: Balance normalized source models, ER relationships, dimensional models, denormalized consumption layers, and semantic-layer needs.

Key Responsibilities

Snowflake, dbt, and Data Pipeline Development
  • Build reliable data pipelines from raw source data through curated silver layers and business-ready gold layers using Snowflake and dbt.
  • Develop modular dbt models, tests, documentation, exposures, and lineage-friendly transformation patterns.
  • Implement incremental processing, snapshots, audit columns, reconciliation, data quality checks, and restartable pipeline patterns.
  • Optimize Snowflake SQL and dbt workloads for performance, scalability, cost, and maintainability.
  • Work with orchestration and DevOps/SRE teams to support CI/CD, environment promotion, pipeline monitoring, and operational runbooks.

Semantic Models and AI-Ready Data Products
  • Create Snowflake semantic models and curated views that support accurate natural-language querying through Snowflake Cortex Analyst and related LLM tools.
  • Translate approved data dictionaries into semantic model dimensions, facts, metrics, synonyms, descriptions, relationships, and business rules.
  • Design ER relationships and join paths that are explicit, accurate, and easy for semantic-layer tools and AI agents to use.
  • Create denormalized or consumption-optimized models where appropriate to reduce ambiguity and improve LLM answer quality.
  • Partner with AI developers to understand tool schema needs, agent workflows, and how data model design affects LLM tool performance.

Data Modeling, Integration, and Consolidation
  • Design logical and physical models that support enterprise data consolidation, analytical reporting, AI workflows, and business operations.
  • Work across source systems, files, APIs, cloud storage, operational systems, and analytical platforms to integrate data into Snowflake.
  • Create reusable patterns for source-to-target mapping, schema evolution, master/reference data alignment, and data product publishing.
  • Collaborate with business and technical stakeholders to validate data definitions, grain, relationships, hierarchies, and measures.
  • Support data consolidation across IntegriChain by rationalizing overlapping datasets and aligning enterprise definitions.

Snowflake Cortex and AI Platform Enablement
  • Understand Snowflake Cortex capabilities, including Cortex Analyst, Cortex Complete, semantic views/models, and metadata-driven AI workflows.
  • Prepare data models and semantic layers for accurate LLM usage, including clear naming, descriptions, relationships, metrics, and governance metadata.
  • Support AI Explorer and similar applications by ensuring curated datasets are reliable, performant, explainable, and governed.
  • Partner with AI and application teams to troubleshoot semantic model issues, poor AI answers, ambiguous joins, missing metadata, or incorrect measures.
  • Contribute to standards for AI-ready data design, semantic model review, data dictionary alignment, and LLM-friendly data modeling.


Qualifications
  • 6+ years of experience in data engineering, analytics engineering, database engineering, or data platform development in production environments.
  • Strong hands-on experience with Snowflake, including SQL development, performance tuning, security-aware design, cost optimization, and large-volume processing.
  • Strong hands-on experience with dbt or comparable ELT tooling, including models, tests, documentation, lineage, and environment promotion.
  • Experience building raw-to-curated-to-gold data pipelines and business-ready datasets.
  • Strong SQL and Snowflake development skills, including complex transformations, views, stored procedures/Snowflake Scripting, and query optimization.
  • Experience creating semantic layers, semantic models, metrics, dimensions, relationships, and curated analytical views.
  • Good understanding of ER modeling, dimensional modeling, denormalized consumption models, and data grain management.
  • Experience translating data dictionaries and business definitions into physical models, dbt models, and semantic-layer definitions.
  • Understanding of Snowflake Cortex capabilities such as Cortex Analyst, Cortex Complete, and semantic-model-driven natural-language querying.
  • Ability to partner with data science, product, engineering, and business teams to deliver AI-ready data products.

Preferred Experience
  • Experience in life sciences, healthcare, pharma commercialization, MDM, patient data, channel data, or commercial data platforms.
  • Experience with Snowflake semantic views, Cortex Analyst, Cortex Search, or other AI/LLM data platform capabilities.
  • Experience with data quality frameworks, metadata management, data observability, and lineage tooling.
  • Experience with orchestration tools such as dbt Cloud jobs, Airflow, Dagster, cloud-native schedulers, or similar platforms.
  • Experience with Python for data automation, metadata processing, testing, or API integrations.
  • Experience designing governed data products for BI, AI/ML, natural-language analytics, or agentic applications.
  • Snowflake SnowPro, dbt certification, or equivalent data engineering credentials.


Additional Information

What does IntegriChain have to offer?
  • Mission driven: Work with the purpose of helping to improve patients' lives!
  • Excellent and affordable medical benefits, plus non-medical perks including Student Loan Reimbursement, Flexible Paid Time Off, and Paid Parental Leave
  • 401(k) Plan with a Company Match to prepare for your future
  • Robust Learning & Development opportunities, including more than 700 development courses free to all employees


About IntegriChain

IntegriChain is a healthcare technology company that provides data analytics and insights to life sciences manufacturers. Its platform helps manufacturers optimize patient access to life-saving and life-changing medicines, providing real-time data and insights that inform decisions about supply chain, commercialization, and patient access strategies. The company's customers include many of the world's leading pharmaceutical manufacturers. IntegriChain was founded in 2006 and is headquartered in Malvern, Pennsylvania.
Size: 300 employees
Net Income: -$5 million
Founded: 2006
5 Year Trend: +20%
Revenue: $50 million
