JOB DESCRIPTION
We are seeking a high-caliber Senior Databricks Engineer to lead the architecture, development, and optimization of our next-generation Lakehouse platform. This is a critical role for a technical leader with 6+ years of deep data engineering expertise, specifically within the Databricks ecosystem. The ideal candidate will drive technical direction, establish robust data governance, and deliver high-impact, scalable data solutions that bridge the gap between raw data and actionable business intelligence.
JOB RESPONSIBILITIES
Data Pipeline Development & Management
- Ingestion & Transformation: Design and optimize high-volume ETL/ELT pipelines using Delta Live Tables (DLT) and PySpark, ensuring data integrity across the Bronze, Silver, and Gold layers.
- Workflow Orchestration: Develop and maintain sophisticated pipelines using Databricks Workflows or Airflow, focusing on modularity, reusability, and automated error handling.
- Streaming & Real-time Integration: Implement real-time data flows utilizing Structured Streaming and Kafka/Event Hubs to enable immediate data availability for downstream consumption.
- Data Security & Privacy: Enforce data anonymization and fine-grained access controls to ensure compliance with global regulations (GDPR/CCPA/HIPAA).
- DataOps & DevOps: Implement CI/CD patterns using Databricks Asset Bundles (DABs), Terraform, and Git to automate environment parity and deployments.
Data Ecosystem Management & Monitoring
- Open Table Formats: Manage and optimize Delta Lake storage, utilizing advanced features like Liquid Clustering, Z-Ordering, and Change Data Feed (CDF).
- Compute Engine Optimization: Drive cost efficiency and performance by optimizing Spark configurations, Photon engine utilization, and Serverless SQL Warehouses.
- Observability & Monitoring: Integrate comprehensive monitoring and alerting (e.g., Databricks System Tables, Grafana, or Splunk) to rapidly identify bottlenecks and troubleshoot production issues.
JOB QUALIFICATIONS
- Experience: 6+ years of hands-on, progressive experience in Data Engineering, with at least 5 years focused heavily on the Databricks platform.
- Architectural Understanding: Expert knowledge of Medallion Architecture, Data Vault 2.0 or Dimensional Modeling, and modern Lakehouse design patterns.
- Scale Expertise: Proven track record of building and managing large-scale (petabyte-scale) data infrastructure in cloud-native environments.
- Industry Experience: Experience in the Insurance or Financial Services industry is preferred (focusing on claims, policy, or risk data).
- Technical Toolset:
  - Cloud Environment: Azure (preferred), AWS.
  - Databricks Stack: Unity Catalog, Delta Live Tables, Databricks SQL, MLflow.
  - Core Languages: Expert-level SQL, Python, and PySpark.
  - Supporting Tools: dbt (Databricks adapter), Git, and orchestration tools.