5+ years of experience building and supporting data pipelines in production environments
Strong experience with SQL and data modeling concepts
Hands-on experience with ETL/ELT frameworks and orchestration tools
Experience working with cloud platforms (Azure, AWS, or GCP)
Proficiency with data processing tools like Azure Data Factory, Databricks, Spark, or Airflow
Experience integrating data from diverse sources including APIs and relational databases
Strong understanding of data quality, lineage, and pipeline reliability
Responsibilities
Design, develop, and maintain end-to-end data pipelines for structured and unstructured data
Build robust ETL / ELT workflows to ingest data from multiple sources
Implement data transformations, validations, and quality checks
Optimize pipeline performance, scalability, and cost efficiency
Work closely with data analysts and product teams to support analytics and AI use cases
Ensure data pipelines comply with security and privacy requirements
Monitor pipelines, troubleshoot failures, and implement recovery mechanisms
Benefits
Flexible work location (remote from the US, India, or elsewhere, based on project needs)
Contract or full-time employment options
Exposure to healthcare, life sciences, and AI projects
Opportunity to work with advanced cloud data technologies
Cross-functional collaboration with data analysts and product teams
Opportunity for growth in a dynamic, regulated data environment
Full Job Description
Data Pipeline Engineer
Location: Remote (US / India / Global - based on project needs)
Employment Type: Contract / Full-Time
Industry: Healthcare, Life Sciences, Data & AI
Experience: 5-10+ years (flexible based on strength)
Role Overview
We are seeking a Data Pipeline Engineer to design, build, and maintain scalable, reliable, and secure data pipelines supporting analytics, reporting, and AI/ML initiatives. This role focuses on ingesting, transforming, and delivering high-quality data across cloud platforms, with a strong emphasis on healthcare and regulated data environments.
Key Responsibilities
Design, develop, and maintain end-to-end data pipelines (batch and streaming) for structured and unstructured data
Build robust ETL / ELT workflows to ingest data from multiple sources including APIs, databases, files, and third-party systems
Implement data transformations, validations, and quality checks to ensure accuracy and reliability
Optimize pipeline performance, scalability, and cost efficiency
Work closely with data analysts, BI engineers, data scientists, and product teams to support downstream analytics and AI use cases
Ensure data pipelines comply with security, privacy, and HIPAA requirements where applicable
Monitor pipelines, troubleshoot failures, and implement alerting and recovery mechanisms
Contribute to data architecture decisions, documentation, and best practices
Required Qualifications
5+ years of experience building and supporting data pipelines in production environments
Strong experience with SQL and data modeling concepts
Hands-on experience with ETL/ELT frameworks and orchestration tools
Experience working with cloud platforms (Azure, AWS, or GCP)
Proficiency with data processing tools such as Azure Data Factory, Databricks, Spark, Airflow, or similar
Experience integrating data from APIs, flat files, relational databases, and cloud storage
Strong understanding of data quality, lineage, and pipeline reliability