Senior Databricks Data Engineer

Prophecy.io
$120K - $150K *
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Benefits

  • Work in a cutting-edge technology environment
  • Opportunity to implement advanced data architectures
  • Focus on both batch and real-time data solutions
  • Engagement with modern tools like Databricks and Delta Lake
  • Collaboration with cross-functional teams in a distributed setting
  • Boost your career with exposure to CI/CD and DevOps practices

Full Job Description

Role Overview:

This role focuses on designing and developing robust, scalable ETL/ELT pipelines, with a strong emphasis on Databricks and PySpark. The successful candidate will be responsible for building both batch and real-time streaming data ingestion and transformation frameworks, implementing advanced data architectures, and optimizing data processing workflows for performance and efficiency.

Key Responsibilities:
  • Design and develop scalable ETL/ELT pipelines utilizing Databricks and PySpark.
  • Build and maintain batch and real-time streaming ingestion frameworks.
  • Develop reusable ingestion and transformation frameworks to ensure consistency and efficiency.
  • Implement the Medallion architecture (Bronze, Silver, Gold layers) for data organization.
  • Develop incremental and Change Data Capture (CDC)-based ingestion pipelines.
  • Design and implement real-time streaming pipelines using technologies like Kafka and Structured Streaming.
  • Optimize Spark jobs, SQL queries, and streaming pipelines for enhanced performance.
  • Implement Delta Lake-based ingestion and transformation frameworks.
  • Tune partitioning, caching, and Spark execution strategies to maximize throughput.

Required Skills:
  • Strong proficiency in SQL and data modeling.
  • Extensive experience with cloud platforms and distributed systems.
  • Familiarity with CI/CD pipelines and DevOps practices.
  • Expertise in MySQL.
  • Proficiency with Databricks.
  • Strong skills in PySpark.

Qualifications:
  • Minimum 10 years of overall work experience.
  • Minimum 5 years of experience specifically as a Data Engineer.
  • 8-10+ years of relevant experience is required.
