Job Summary
The Azure Cloud Data Engineer is responsible for designing, developing, and maintaining scalable cloud-based data platforms and data pipelines within Azure environments. This role focuses on building high-performance ETL pipelines, developing Databricks and PySpark solutions, optimizing large-scale data processing, and implementing enterprise data warehouse architectures. The ideal candidate will have strong expertise in Python, SQL, Azure Databricks, cloud data engineering, and enterprise data platform modernization.
Key Responsibilities
• Design, develop, and maintain scalable data pipelines for data ingestion, transformation, and loading from multiple data sources.
• Build and deploy Spark applications using Azure Databricks and PySpark to process large-scale datasets.
• Optimize Databricks jobs for performance, scalability, and reliability.
• Monitor, troubleshoot, and resolve issues impacting data pipelines and Databricks workloads.
• Design and implement enterprise data warehouse solutions and dimensional data models.
• Develop solutions utilizing Delta Lake, cloud data integration, and modern data architecture principles.
• Develop and maintain applications using Python and SQL.
• Create and optimize complex SQL queries and stored procedures, and reverse-engineer existing database processes.
• Design and implement data models for OLTP and OLAP workloads, including dimensional models built on fact and dimension tables.
• Process structured, semi-structured, and unstructured data.
• Develop event-driven and streaming data ingestion solutions.
• Implement best practices for data governance, security, quality, and cloud data management.
• Perform code reviews to ensure compliance with development standards and optimal execution patterns.
• Support enterprise cloud data platform migrations and modernization initiatives.
• Collaborate with cross-functional teams to design scalable cloud-based data architectures.
Required Qualifications
• Bachelor's degree in Computer Science, Engineering, or a related technical field.
• 9+ years of overall data engineering experience.
• 5+ years of hands-on Python development experience.
• 5+ years of SQL Server development experience working with large datasets.
• 5+ years of experience developing and deploying ETL pipelines using Databricks and PySpark.
• Experience with Delta Lake, data integration, cloud architecture, and data modeling.
• Experience with enterprise data warehouse design and dimensional modeling.
• Experience processing structured, semi-structured, and unstructured data.
• Experience with event-driven and streaming data processing technologies.
• Experience with cloud data warehouse platforms such as Azure Synapse, Snowflake, Amazon Redshift, or Google BigQuery.
• Strong understanding of OLTP, OLAP, fact and dimension tables, and enterprise data warehousing concepts.
• Experience designing and implementing enterprise cloud data architectures.
• Experience leading enterprise cloud data platform migration initiatives.
• Experience working with banking or financial services clients.
• Strong analytical, troubleshooting, and problem-solving skills.
• Excellent written and verbal communication skills.
Preferred Qualifications
• Master's degree in Computer Science, Engineering, or a related technical field.
• Experience with Apache Airflow.
• Experience with cloud-based messaging and analytics platforms.
Certifications
• Cloud certification(s).