EPAM Systems

Senior Data Engineer, Databricks/Kafka

$120K - $150K *
US-Anywhere
Remote in Georgia, US
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 5+ years of data engineering experience, focusing on ETL development and big data technologies.
  • Expertise in designing and optimizing ETL/ELT workflows using industry-standard tools.
  • Hands-on experience with cloud-native platforms like Azure Synapse and Databricks.
  • Familiarity with data modeling techniques, including Data Vault 2.0 and normalization.
  • Strong programming skills in SQL and languages like Python, Scala, or Java.
  • Knowledge of Infrastructure as Code practices for cloud management.
  • Understanding of data security and governance best practices.

Responsibilities

  • Architect and build scalable data pipelines for efficient data processing.
  • Develop and implement robust ETL/ELT workflows for data ingestion and transformation.
  • Leverage data modeling techniques to enhance flexibility and scale.
  • Ensure data consistency and reliability across various storage solutions.
  • Optimize data infrastructure on Azure for performance and cost.
  • Implement big data frameworks like Databricks and Kafka for processing.
  • Collaborate with engineering teams to align data architecture with business goals.

Benefits

  • Opportunity to mentor junior engineers and promote knowledge sharing.
  • Work within a federated engineering environment for integrated project execution.
  • Access to cutting-edge tools and technologies for data engineering improvements.
  • Engagement in continuous learning with focus on emerging technologies.
  • Opportunity to impact analytics, ML, and BI applications with reliable data.

Full Job Description

We are looking for a Senior Data Engineer to design, implement and optimize scalable data pipelines and infrastructure. You will play a key role in ensuring data availability, reliability and quality for analytics, machine learning and business intelligence applications. Your work will involve collaborating with data scientists, analysts and software engineers to develop robust ETL/ELT workflows, real-time streaming solutions and cloud-native data architectures.

Responsibilities

  • Architect, build and optimize scalable data pipelines for batch and real-time data processing
  • Develop and implement ETL/ELT workflows, ensuring efficient data ingestion, transformation and storage
  • Leverage modeling (Party model, Data Vault) methodologies to enable scalable and flexible data modeling
  • Ensure data consistency, reliability and governance across data lakes, warehouses and operational data stores
  • Optimize performance and cost efficiency of data infrastructure on Azure
  • Implement and manage big data processing frameworks, such as Databricks and Kafka
  • Enhance data security and compliance, integrating RBAC & ABAC, encryption and regulatory frameworks (GDPR) into data infrastructure
  • Develop automation tools for data pipeline orchestration, using Airflow, Azure Data Factory or Prefect
  • Monitor, troubleshoot and optimize data pipelines, ensuring minimal downtime and quick recovery from failures
  • Collaborate with federated engineering teams to align data architecture with business and engineering goals
  • Provide clean, reliable and scalable datasets for advanced analytics and machine learning in partnership with data scientists and business analysts
  • Evaluate and adopt emerging technologies, ensuring continuous improvement in data engineering best practices
  • Mentor and guide junior engineers, fostering technical excellence and knowledge sharing within the team

Requirements

  • 5+ years of experience in data engineering, ETL development or big data technologies
  • Expertise in designing and optimizing ETL/ELT workflows, using tools such as dbt, Airflow, Azure Data Factory or Apache NiFi
  • Hands-on experience with cloud-native data platforms, including Azure Synapse, Databricks, Snowflake or BigQuery
  • Knowledge of data modeling techniques, including Data Vault 2.0, star schema and normalization strategies
  • Experience with large-scale distributed computing frameworks (Apache Spark, Hadoop, Kafka, Event Hub)
  • Advanced proficiency in SQL and programming languages, such as Python, Scala or Java
  • Understanding of Infrastructure as Code (Terraform, Pulumi, ARM templates) for managing cloud-based data infrastructure
  • Skills in data security and governance best practices, including RBAC, encryption and data lineage
  • Experience working in a federated engineering environment, ensuring seamless integration across teams
  • Proficiency in observability and monitoring tools for data pipelines, such as Monte Carlo, DataDog and Great Expectations
  • Familiarity with Agile and DevSecOps methodologies, ensuring continuous integration, deployment and monitoring of data solutions
  • Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems or a related field
  • Upper-Intermediate English language proficiency (B2+)

About EPAM Systems

EPAM Systems, Inc. is a leading global provider of digital platform engineering and development services. The company has a strong presence in North America, Europe, and Asia, and serves clients in a variety of industries, including financial services, healthcare, and retail. EPAM's services include software engineering, product development, and digital platform engineering, and the company has a reputation for delivering high-quality solutions that help its clients achieve their business goals. EPAM has been recognized as a leader in the digital services industry by a number of independent research firms, and the company has won numerous awards for its work.
Size: 58,824 employees
Market Cap: $18.2 billion
Net Income: $327.1 million
Founded: 1993
5 Year Trend: +26.5%
Revenue: $2.6 billion
NASDAQ
