AI Data Engineer

Giesecke+Devrient

$95K — $115K *
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 3+ years of hands-on experience in data engineering or related roles.
  • Proven ability to build production-grade data pipelines and ETL workflows.
  • Experience in preparing and validating data for machine learning projects.
  • Familiarity with RAG, document processing, and knowledge graph technologies is a plus.
  • Bachelor's degree in Computer Science, Software Engineering, or related field preferred.

Responsibilities

  • Design, build, and maintain data pipelines for AI and ML initiatives.
  • Develop document ingestion and processing pipelines for varied types of enterprise content.
  • Implement chunking strategies and retrieval-ready datasets for RAG applications.
  • Integrate vector databases, search indexes, and data lakes with existing systems.
  • Prepare data for machine learning including data cleaning and feature engineering.
  • Implement data quality checks and monitoring for ML pipelines.
  • Collaborate with cross-functional teams to deliver AI solutions.

Benefits

  • Opportunity to work in a cutting-edge AI Hub focused on Generative AI.
  • Support for professional development and continuous learning.
  • Collaborative environment with diverse teams and stakeholders.
  • Exposure to the latest technologies in data engineering and AI.
  • Challenging projects that drive innovation in machine learning.

Full Job Description
Compensation: $95,000-$115,000 plus up to 5% bonus, capped at 150%

Job Summary

We are seeking a technical and execution-focused Data Engineer to join G+D's new AI Hub.

The ideal candidate will combine hands-on experience in data engineering for AI systems with strong Python, SQL, and data pipeline engineering capabilities. This role will support both AI engineering initiatives and machine learning projects by making enterprise data reliable, accessible, well-structured, and ready for production use.

This role is focused on data engineering for Generative AI, RAG, document ingestion, vector search, knowledge graphs, and machine learning workflows, including data preparation, data quality, feature engineering, and reusable data assets for AI solutions.

Primary Responsibilities
  • Design, build, and maintain data pipelines that support AI engineering, RAG, and machine learning initiatives from experimentation through production.
  • Develop document ingestion and processing pipelines for structured, semi-structured, and unstructured enterprise content, including parsing, cleaning, normalization, metadata extraction, and enrichment.
  • Implement chunking strategies, embedding pipelines, indexing workflows, and retrieval-ready datasets for RAG and Graph RAG applications.
  • Build and maintain integrations with vector databases, search indexes, graph databases, data lakes, warehouses, and enterprise source systems.
  • Support knowledge graph initiatives by preparing entities, relationships, ontologies, metadata, and graph-ready data pipelines.
  • Prepare and transform data for machine learning projects, including data cleaning, labeling support, feature engineering, feature validation, and dataset versioning.
  • Implement data quality checks, lineage, observability, monitoring, and automated validation for AI and ML data pipelines.
  • Collaborate with data scientists, applied AI engineers, platform engineers, security, data governance teams, and business stakeholders to deliver scalable AI solutions.
  • Contribute to reusable ingestion components, data engineering patterns, technical standards, and best practices for the AI Hub.
  • Other duties as assigned.


Qualifications, Experience and Educational Requirements

Work Experience:
  • Three (3)+ years of hands-on experience in data engineering, analytics engineering, machine learning engineering, or related software/data development roles.
  • Experience building production-grade data pipelines, ETL/ELT workflows, APIs, data services, or distributed data processing systems.
  • Experience preparing data for machine learning projects, including data cleaning, feature engineering, dataset creation, and data quality validation.
  • Experience with RAG, document processing, embeddings, vector databases, search systems, or knowledge graphs is strongly preferred.
  • Experience contributing to production-grade systems in enterprise, regulated, or security-sensitive environments is preferred.

Skills and Competencies:
  • Strong Python and SQL skills, with practical experience building reliable, maintainable, and testable data pipelines.
  • Hands-on experience with data engineering tools and frameworks such as Pandas, PySpark, Airflow, Dagster, Prefect, dbt, or similar technologies.
  • Practical knowledge of document ingestion, document parsing, chunking, embeddings, semantic search, hybrid search, and retrieval pipelines.
  • Hands-on experience with vector databases and search technologies such as pgvector, Pinecone, Weaviate, Milvus, OpenSearch, Elasticsearch, or similar platforms.
  • Hands-on experience with graph databases or knowledge graph technologies such as Neo4j, RDF, SPARQL, graph data modeling, or entity-relationship extraction is considered an asset.
  • Experience with cloud data platforms, lakehouse patterns, object storage, relational databases, and data warehouse technologies.
  • Understanding of machine learning workflows, feature engineering, feature stores, model training data requirements, and dataset versioning.
  • Ability to implement data quality controls, validation tests, lineage, monitoring, access control, and governance-aware data workflows.
  • Ability to work with technical specifications, data contracts, architecture patterns, and engineering standards.
  • Experience working in specification-first, contract-driven, or Spec-Driven Development environments is considered an asset.
  • Strong problem-solving skills and ability to work in a fast-moving, delivery-focused environment.

Education:
  • Bachelor's degree in Computer Science, Software Engineering, Data Engineering, Artificial Intelligence, Data Science, or related field preferred.
  • Master's degree is considered an asset.


Additional Information

*This job description is not intended to be all-inclusive. The candidate hired will also perform other reasonable, related business duties as assigned by the supervisor. The company reserves the right to revise or change job duties as needed. This job description does not constitute a written or implied contract of employment.

By applying to this position, you confirm that you hold Canadian citizenship, permanent resident status, or a valid work permit.

Please note: Reference checks, credit checks, and criminal background checks will be conducted for suitably qualified candidates. Your application will be kept on file for up to two years.

https://career5.successfactors.eu/career?company=gieseckede&career_job_req_id=27204&career_ns=job_application
