Job Summary
The Databricks Data Engineer designs, develops, and maintains data ingestion and transformation pipelines in a Databricks Lakehouse environment. The role focuses on integrating data from multiple source systems, implementing scalable data architectures, and preparing high-quality datasets for downstream analytics and reporting. The ideal candidate has strong experience with Databricks, Apache Spark, Python, SQL, and modern data engineering practices.
Key Responsibilities
• Develop and maintain data ingestion pipelines that integrate data from multiple source systems.
• Design, build, and optimize ETL/ELT pipelines using Databricks, PySpark, and SQL.
• Implement and support Medallion Architecture layers, including Bronze, Silver, and Gold data models.
• Ensure data quality and consistency across pipelines and datasets through validation.
• Monitor, troubleshoot, and resolve data pipeline execution issues.
• Optimize pipeline performance, scalability, and reliability to support enterprise data workloads.
• Work with structured and semi-structured data formats from diverse source systems.
• Support data transformation, cleansing, enrichment, and integration activities.
• Collaborate with data analysts, data scientists, and business stakeholders to support analytics requirements.
• Maintain technical documentation related to data pipelines, architectures, and processes.
• Follow data engineering best practices for performance, security, governance, and operational support.
• Participate in code reviews, testing, and continuous improvement initiatives.
Required Qualifications
• 6+ years of experience in Data Engineering.
• Hands-on experience with Databricks and Apache Spark.
• Strong programming experience with Python.
• Strong SQL development and query optimization skills.
• Experience designing, developing, and maintaining ETL/ELT pipelines.
• Experience implementing and supporting data architectures built on Delta Lake.
• Understanding of distributed data processing systems and large-scale data platforms.
• Experience working with structured and semi-structured data formats.
• Strong troubleshooting, performance tuning, and problem-solving skills.
• Experience monitoring and supporting production data pipelines.
• Ability to work effectively in a collaborative team environment.
• Strong communication and documentation skills.
Preferred Qualifications
• Experience implementing Medallion Architecture (Bronze, Silver, Gold) within Databricks environments.
• Experience working with cloud-based data platforms and modern data lake architectures.
• Experience supporting enterprise analytics, reporting, and business intelligence initiatives.
• Knowledge of data governance, data quality, and data lifecycle management best practices.
• Experience working in Agile development environments.