Position Profile
The Senior Data & AI Engineer needs deep hands-on experience in Snowflake, Microsoft Fabric (incl. OneLake), and healthcare data ecosystems. The ideal candidate understands data modeling, data integration, and data transformation across structured and unstructured sources, and can build machine learning pipelines that operate on claims and clinical data. You'll design secure, scalable data platforms; map and normalize data across payers, providers, CMS datasets, EHR systems, and HIEs; and operationalize AI tools to drive measurable outcomes in cost, quality, and member/patient experience.
Responsibilities
Data Platform Engineering
- Architect, implement, and optimize data solutions in Snowflake and Microsoft Fabric (incl. OneLake, Lakehouses, Warehouses, and Data Engineering pipelines).
- Build robust ingestion frameworks for batch and streaming data (e.g., ADLS, Event Hub, APIs, SFTP) with lineage and governance.
- Manage data security, privacy, and compliance (HIPAA/PHI; role-based access, masking, tokenization, de-identification).
Data Modeling & Integration
- Design conceptual/logical/physical models (normalized, dimensional/star, data vault where appropriate).
- Implement data mapping and transformations for structured (claims, eligibility, provider, enrollment) and unstructured (clinical notes, PDFs) data.
- Harmonize healthcare data using FHIR/HL7/C-CDA, X12/EDI 837/835, NCPDP, and CMS standards; reconcile and link records across EHR and HIE sources.
Analytics & Machine Learning
- Build ML pipelines for risk stratification, cost/utilization forecasting, fraud/waste/abuse detection, quality measure computation (e.g., HEDIS), and care gap identification.
- Operationalize models with MLOps (experiment tracking, reproducibility, CI/CD, monitoring, drift detection).
- Leverage LLMs/AI tools for data quality, entity resolution, summarization, and clinical insights, ensuring safety, bias checks, and auditability.
Governance & Observability
- Implement data cataloging, lineage, and metadata (e.g., Microsoft Purview or equivalent).
- Establish quality SLAs, validation rules, profiling, and automated anomaly detection.
- Instrument pipelines for cost, performance, and reliability (e.g., Snowflake resource monitors, Fabric capacities).
Collaboration & Delivery
- Work with product owners, clinicians, actuaries, and analytics teams to translate requirements into scalable solutions.
- Produce clear documentation, data dictionaries, and mapping specs; mentor engineers and analysts.
- Contribute to architectural roadmaps, reference patterns, and best practices across the enterprise.
Maintains and enhances professional skills.
Adheres to high standards of personal and professional conduct.
Minimum Qualifications
- 8+ years in data engineering/analytics; 5+ years hands-on with Snowflake (compute, storage, virtual warehouses, tasks, streams, Snowpipe, Time Travel, RBAC, row/column masking, data sharing, Dynamic Tables).
- 2+ years with Microsoft Fabric (including OneLake, Lakehouses, Warehouses, Dataflows Gen2, Notebooks, Pipelines; capacity management).
- Strong data modeling expertise (dimensional/star, 3NF, data vault; surrogate keys, SCD types, conformed dimensions).
- Data integration & transformation proficiency: SQL (advanced), dbt or Fabric Dataflows/Power Query M, ADF/Synapse/Fabric Pipelines, Python for ETL/ELT.
- Experience mapping data from CMS sources (e.g., Medicare datasets, claims/encounters), X12/EDI, FHIR/HL7, and provider and eligibility files.
- Experience with structured (tables, CSV, Parquet) and unstructured (clinical notes, PDFs, blobs) data; NLP pipelines (optional but valued).
- Machine learning: feature engineering, model training/evaluation, and deployment (e.g., scikit-learn, PyTorch/TensorFlow, Fabric ML/Notebook, Azure ML); production monitoring.
- Security & compliance: HIPAA, PHI handling, auditing, data residency, BAAs; practical access control in Snowflake/Fabric.
- Strong communication; ability to author mapping specs and lineage docs, and to present trade-offs to technical and non-technical stakeholders.
Preferred Qualifications
- Interoperability: FHIR R4, HL7 v2, X12/EDI (837/835), NCPDP; experience with HIEs and EHR integrations (Epic, Cerner, etc.).
- CMS & payer/provider data: Medicare fee-for-service, MA, Medicaid, CCW, APCD, and quality programs; risk adjustment (HCC), HEDIS measures.
- MLOps & DevOps: MLflow, DVC, GitHub Actions/Azure DevOps, containerization (Docker), orchestration (Airflow, Fabric Pipelines, or ADF).
- Governance: Microsoft Purview (catalog, lineage, classifications), data quality tools.
- Visualization: Power BI and Fabric Direct Lake; semantic modeling and row-level security.
- Cloud: Azure (ADLS, Event Hub, Functions, Key Vault, Databricks), optional AWS/GCP exposure.
- Certifications: Snowflake SnowPro Core/Advanced, Microsoft Certified (Azure Data Engineer Associate, Fabric Analytics Engineer).