Full Job Description
We are seeking a Lead Databricks Engineer/Architect to design, build, and scale our cloud-based lakehouse platform. In this role, you will own the end-to-end architecture of our data ecosystem on Databricks, partner with data science and analytics teams to productionize ML and analytical workloads, and set the technical direction for ingestion, transformation, governance, and performance optimization across petabyte-scale datasets. You will be a hands-on technical leader: writing production code, mentoring engineers, and shaping standards that the broader data organization will adopt.
Information Security Responsibilities
• Promote and enforce awareness of key information security practices, including acceptable use of information assets, malware protection, and password security protocols.
• Identify, assess, and report security risks, focusing on how these risks impact the confidentiality, integrity, and availability of information assets.
• Understand and evaluate how data is stored, processed, or transmitted, ensuring compliance with data privacy and protection standards (GDPR, CCPA, etc.).
• Ensure data protection measures are integrated throughout the information lifecycle to safeguard sensitive information.
Role and Responsibilities
• Architect and lead the implementation of an enterprise lakehouse on Databricks (Delta Lake, Unity Catalog, Photon, Workflows) across one or more major clouds (AWS, Azure, or GCP).
• Design scalable batch and streaming data pipelines using PySpark, Spark SQL, Structured Streaming, and Delta Live Tables; establish patterns for ingestion from operational systems, event streams, and third-party APIs.
• Define and enforce platform standards for data modeling (medallion architecture), CI/CD, code quality, testing, observability, and cost optimization.
• Lead the governance strategy using Unity Catalog (fine-grained access control, data lineage, auditing, and PII handling) in partnership with the security and compliance teams.
• Optimize Spark workloads for performance and cost: cluster sizing, Photon, autoscaling, file layout, Z-ordering, caching, and query tuning.
• Partner with ML engineers and data scientists to operationalize models using MLflow, feature stores, and model serving on Databricks.
• Own the cloud infrastructure footprint for the platform: networking, IAM, secrets, encryption, and Terraform/IaC for Databricks workspaces and supporting services.
• Mentor a team of data engineers; lead architecture reviews, code reviews, and technical design sessions; raise the bar on engineering practices.
• Engage with stakeholders across analytics, product, and finance to translate business needs into a roadmap for the data platform.
Preferred Qualifications
• 8+ years of data engineering experience, with 4+ years building production workloads on Databricks.
• Deep expertise in Apache Spark (PySpark and Spark SQL), including performance tuning, partitioning strategy, and the Catalyst/Photon execution model.
• Strong hands-on experience with Delta Lake, Unity Catalog, Databricks Workflows, and Delta Live Tables.
• Production experience on at least one major cloud (AWS, Azure, or GCP), including networking, IAM, storage (S3/ADLS/GCS), and compute primitives.
• Proficiency in Python and SQL; comfort with Scala is a plus.
• Experience designing medallion (bronze/silver/gold) architectures and dimensional models for analytics.
• Strong CI/CD and DevOps practice: Git, Terraform, Databricks Asset Bundles or dbx, automated testing of data pipelines.
• Track record of leading technical projects end-to-end and mentoring engineers.
• Excellent written and verbal communication; able to drive alignment with both engineering and business stakeholders.
$102,000 - $133,000 a year
Individual pay is determined by many factors, including experience, relevant education or training, and organizational needs. The mid-range to maximum of the salary range is generally reserved for individuals who are highly experienced in the role.
#BI-Remote
#LI-Remote