xScion is seeking a Data Lead: Architecture, Ingestion & Platform to join an xScion project team working with one of our clients. The Data Lead owns the end-to-end Data Hub solution architecture and ingestion strategy, translating OEB's goals (resilience, analytics, vendor independence) into a scalable, performant, and testable cloud data platform across AWS, Amazon Redshift, and Databricks.
As a Data Lead, You Will:
- Translate RFQ scope and OEB objectives into a solution blueprint spanning ingestion, storage, access, and analytics for DB/DC/Health benefits data
- Design the secure, cloud-based Data Hub architecture (AWS, Databricks, Redshift, Glue, cataloging) optimized for performance, scalability, data quality, and resilience
- Produce logical and physical data models for structured/semi-structured sources from third-party recordkeeper and internal systems, ensuring IV&V testability
- Define standards for file formats, schemas, metadata, data versioning, and reconciliation patterns supporting analytics, IV&V, and audit readiness
- Design and prototype ingestion pipelines, transformation jobs, and storage layouts aligned to architecture and IV&V requirements
- Advise on query optimization, clustering/partitioning, workload management, cost controls, and monitoring for DB/DC/Health datasets in Redshift and Databricks
- Support troubleshooting of data-quality or performance issues identified during IV&V and testing
- Partner with Governance, Security, and IV&V leads to ensure delivery within the June-November term and support the November 2026 go-live goal
To Be Successful, You Need:
- Permanent Residency or US Citizenship
- Bachelor's degree in Computer Science, Information Systems, Business IT Management or equivalent practical experience
- 8+ years in data architecture or data engineering with at least 3 years in cloud data platform roles
- Hands-on AWS data services experience: S3, Glue, Glue Data Catalog, Redshift, Athena, or equivalent
- Databricks experience required; Unity Catalog, Delta Lake, and ingestion pipeline design strongly preferred
- Proficiency designing RBAC and ABAC security models for sensitive or regulated data environments
- Strong data modeling skills (logical/physical), schema design, and metadata management
- Experience ingesting structured and semi-structured data from third-party or external recordkeeper systems
- Ability to produce architecture documentation and standards that can be reviewed and validated by an independent IV&V team
Nice to Have:
- Prior experience with HIPAA-regulated health data or defined benefit/defined contribution plan data
- Familiarity with Federal Reserve, OCC, FDIC, or equivalent financial institution security and data standards
- Experience with data versioning, lineage tooling, and record storage management at scale
- AWS Certified Data Engineer, Solutions Architect, or Databricks certifications
Great Benefits: Medical, dental, 401(k) match, flexible spending, and more. We also offer unique perks such as up to 27 days off a year (including your birthday!), remote work opportunities, parental leave, wellness benefits, and many other things that inspire balance and flexibility.