Data Engineer (AI/ML)

Blue Cross and Blue Shield Association • $100K — $138K *

Chicago, IL 60629 • In-Person

Healthcare

5 - 7 years of experience

Today


Job Overview by Ladders

Qualifications

Bachelor’s or Master’s degree in Computer Science, Engineering, or related field.
5+ years of experience in data engineering, particularly in cloud-based environments.
Expertise in machine learning and generative AI data pipelines.
Proficiency in Python, SQL, and distributed data frameworks (PySpark, Databricks, AWS Glue, EMR).
Hands-on experience with AWS AI/ML services, workflow orchestration (Airflow), and containerization (Kubernetes).

Responsibilities

Design, build, and maintain high-performance data pipelines for healthcare data.
Collaborate with Data Architects and Data Scientists to develop scalable engineering solutions.
Implement data validation frameworks for pipeline accuracy and completeness.
Participate in code reviews and promote continuous improvement within the team.
Monitor and optimize cloud-based data systems for performance and cost efficiency.

Benefits

Paid time off and 11 holidays.
Medical, dental, and vision insurance.
Generous 401(k) matching.
Lifestyle spending account with various benefits.
Comprehensive package of benefits for eligible employees.

Full Job Description

Job Description Summary

The Data Engineer will design, build, and optimize scalable, secure data pipelines that power analytics and product platforms. For this role specifically, the focus will be on Machine Learning (ML) and Generative Artificial Intelligence (GenAI) workloads, while contributing to innovation and ensuring compliance with healthcare industry standards. This role is expected to provide strong hands-on technical expertise, collaborate across teams, and contribute to architecture decisions that align engineering practices with organizational goals.

Job Description

Design, build, and maintain reliable, high-performance data pipelines for large-scale structured and unstructured healthcare data.

Use PySpark and modern cloud-based tools (Databricks, AWS Glue, EMR, Snowflake) to transform and process data efficiently.

Support ingestion, transformation, and validation processes that ensure data consistency, integrity, and availability.

Partner with Data Architects, Data Scientists, and Analysts to translate business needs into scalable engineering solutions.

Collaborate with platform and DevOps teams to deploy, scale, and monitor data pipelines using Airflow and Kubernetes.

Participate in code reviews, documentation, and continuous improvement efforts across the engineering team.

Implement and maintain data validation frameworks to ensure pipeline accuracy and completeness.

Contribute to best practices in version control, metadata management, and reproducibility.

Stay current with emerging technologies in data engineering and cloud computing, recommending improvements to existing infrastructure.

Participate in performance tuning, cost optimization, and scaling strategies for cloud-based data systems.

Identify automation opportunities to streamline ETL/ELT processes and reduce operational overhead.

Share knowledge and mentor junior team members on tools, techniques, and best practices.

Promote a culture of collaboration, innovation, and continuous learning within the engineering organization.

Support compliance with SOC 2, HIPAA, and GDPR by adhering to established data privacy and security practices.

The posting range for this position is:

$100,800.00 - $138,600.00

Required Education, Certifications and Experience:

Education:

Bachelor’s or Master’s degree in Computer Science, Engineering, or related field.

Experience:

5+ years of experience in data engineering, including building and managing pipelines in cloud-based environments.

Knowledge, Skills and Abilities:

Experience building and operationalizing the data foundations that support machine learning and generative AI use cases, including feature pipelines, training/inference data preparation, and retrieval-ready datasets (e.g., embeddings and vector stores).
Familiarity with GenAI skills and adjacent tooling (foundation models, prompt engineering, RAG, embeddings/vector databases, and GenAI orchestration frameworks).
Hands-on experience with AWS AI/ML and data services, including Amazon Bedrock, Bedrock AgentCore, SageMaker, Glue, and EMR.
Experience designing and optimizing data architectures, including data foundations that support ML and GenAI workloads.

Hands-on experience with workflow orchestration (Airflow) and containerization (Kubernetes).

Hands-on technical expertise, cross-team collaboration, and contributions to architecture decisions.

Proficiency in Python, SQL, and distributed data frameworks (PySpark, Databricks, AWS Glue, EMR).
Working knowledge of cloud platforms (AWS or Azure) and data warehouses (Snowflake).

Familiarity with NoSQL and relational databases, as well as data modeling best practices.

Strong analytical, problem-solving, and communication skills.

Understanding of compliance frameworks (SOC 2, HIPAA) and secure data management principles.
Experience working with healthcare datasets or knowledge of healthcare standards (HIPAA, HL7, FHIR) preferred.

#LI_HYBRID

The posted salary range is the lowest to highest salary we, in good faith, believe we would pay for this role at the time of this posting. We may ultimately pay more or less than the hiring range, and this hiring range may also be modified in the future. A candidate’s position within the hiring range may be based on several factors including, but not limited to, specific competencies, relevant education, qualifications, certifications, relevant experience, skills, seniority, performance, shift, travel requirements, and business or organizational needs. This job is also eligible for annual bonus incentive pay.

We offer a comprehensive package of benefits including paid time off, 11 holidays, medical/dental/vision insurance, generous 401(k) matching, a lifestyle spending account, and many other benefits to eligible employees.

About Blue Cross and Blue Shield Association

The Blue Cross Blue Shield Association (BCBSA) is a federation of 36 separate United States health insurance companies that together provide health insurance to more than 106 million people. It was formed in 1982 from the merger of its two namesake organizations: Blue Cross was founded in 1929 and became the Blue Cross Association in 1960, while Blue Shield emerged in 1939 and the Blue Shield Association was created in 1948. The association is headquartered in Chicago and has offices in Washington, D.C.


Size

1,000 employees

Industry

Healthcare

Founded

1929

* Ladders Estimates
