Tata Consultancy Services

Databricks Data Engineer

Tata Consultancy Services, $125K to $140K *
Enterprise Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Databricks Platform Expertise with tools like Delta Lake and MLflow.
  • Relevant Databricks Certification (Associate or Professional level).
  • Strong proficiency in PySpark for data transformation and analytics.
  • Expert-level proficiency in Python and advanced SQL skills.
  • Experience with cloud platforms: AWS, Azure, or GCP.

Responsibilities

  • Design and maintain scalable ETL/ELT data pipelines using PySpark and Databricks tools.
  • Process and transform batch and streaming data according to Medallion Architecture.
  • Implement data governance and security measures to protect PII.
  • Optimize Spark jobs and adjust configurations for efficiency.
  • Identify and address complexities and risks in data pipelines.
  • Collaborate with Data Scientists and Analysts for data curation and reporting.
  • Maintain comprehensive documentation and participate in code reviews.

Benefits

  • Comprehensive documentation and knowledge-sharing opportunities.
  • Collaborative environment with cross-functional teams.
  • Focus on data governance and security best practices.

Full Job Description

Job Title: Databricks Data Engineer

Job Description:

We are seeking a highly skilled and motivated Databricks Certified Engineer to design, build, and optimize scalable data pipelines and ETL workflows using the Databricks Data Intelligence Platform. The ideal candidate will be responsible for writing robust Python and Spark code, ensuring data quality, and implementing data governance across cloud environments (AWS, Azure, or GCP). This role requires expertise in large-scale data processing, data warehousing principles, and cloud-native solutions.

Roles & Responsibilities:
• Pipeline Development: Design, build, and maintain scalable ETL/ELT data pipelines using PySpark, Delta Lake, Auto Loader, and Databricks Workflows.
• Data Transformation & Processing: Process and transform batch and streaming data following the Medallion Architecture (Bronze, Silver, Gold layers).
• Data Governance & Security: Implement access controls and data masking policies using Unity Catalog to secure Personally Identifiable Information (PII) and ensure compliance.
• Performance Tuning: Optimize Spark jobs, troubleshoot memory bottlenecks, and adjust cluster configurations for cost and compute efficiency.
• Proactive Risk Identification: Identify and address data complexities, hidden challenges, and potential risks in data pipelines and the Databricks ecosystem to keep data solutions robust, secure, and efficient.
• Cross-Functional Collaboration: Partner with Data Scientists and Analysts to curate datasets, support machine learning models (MLflow), and provide integrated reporting.
• Develop and maintain comprehensive documentation for data pipelines, data models, and ETL processes.
• Participate in code reviews to maintain high-quality code standards.
• Troubleshoot and resolve issues in data pipelines and Databricks clusters.

Qualifications:
• Primary Skill Set:
  o Databricks Platform Expertise: In-depth knowledge of the Databricks Data Intelligence Platform, including notebooks, Delta Lake, MLflow, Unity Catalog, Auto Loader, and Databricks Workflows.
  o Databricks Certification: Relevant Databricks certification (Associate or Professional level) validating foundational or advanced skills in the platform.
• Secondary Skill Set:
  o PySpark: Strong proficiency in developing complex data transformations and analytics using PySpark.
  o Apache Iceberg: Experience with Apache Iceberg for open table format management.
• Programming Languages:
  o Python: Expert-level proficiency in Python for data manipulation, scripting, and application development.
  o SQL: Advanced proficiency in SQL for data querying and manipulation.
  o Shell Scripting: Experience with shell scripting for automation and job orchestration.
• Cloud Platforms: Hands-on experience with Databricks deployed on major cloud providers such as Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP).
• Big Data Concepts: Deep understanding of distributed computing, data warehousing principles, ETL/ELT processes, and data modeling.

Good to Have Skills
• DevOps Basics: Familiarity with CI/CD tools (e.g., Databricks Asset Bundles, GitHub Actions, GitLab) and orchestration tools like Apache Airflow.
• Data Warehousing: Knowledge of Hive for data storage and querying.
• Container Orchestration: Familiarity with Kubernetes for deploying and managing containerized applications.
• Version Control: Experience with Git or other version control systems.

Databricks Certification Levels

Depending on seniority, candidates may possess different levels of Databricks credentials:
• Associate Level: Validates foundational skills in writing Spark code, building SQL queries, and utilizing the Databricks workspace.
• Professional Level: Validates advanced skills for production environments, focusing on complex streaming workloads, CI/CD, data governance (Unity Catalog), and high-level performance optimization.

Salary Range: $125,000 to $140,000 per year

About Tata Consultancy Services

Tata Consultancy Services (TCS) is an Indian multinational information technology (IT) services and consulting company, headquartered in Mumbai, Maharashtra, India. It is a subsidiary of Tata Group and operates in 149 locations across 46 countries. TCS is the largest Indian company by market capitalization and is ranked 11th on the Forbes Global 2000 list of the world's biggest public companies. TCS is also the second-largest IT services company in the world by revenue and the largest employer of women in India. The company provides services in areas including IT, consulting, and business solutions.
Size: 469,261 employees