Specialist - Data Engineering

LTIMindtree • $90K — $130K *

Irving, TX 75061 • In-Person

Information Technology

Less than 5 years of experience

Today

Job Overview by Ladders

Qualifications

5-7 years of experience in data engineering with a focus on PySpark and Apache Spark.
Proficient in Python programming and the PySpark API for data transformations.
Strong knowledge of HiveQL and ANSI SQL with experience in schema management.
Experience with optimized big data storage formats such as Parquet, ORC, and Avro.
Hands-on development experience with major cloud services like AWS, Azure, or GCP.
Understanding of data warehousing concepts, including dimensional modeling and data lakes.

Responsibilities

Design, build, and maintain scalable ETL/ELT pipelines using PySpark and Spark SQL.
Deploy and manage cloud-based data infrastructure on platforms such as AWS and Azure.
Optimize data storage through strategic layout and indexing in Apache Hive and data lakes.
Identify and resolve performance bottlenecks in Spark jobs using Spark UI.
Develop solutions for ingesting high-volume datasets from relational databases and flat files.
Implement automated workflows using tools like Apache Airflow for reliable data delivery.
Collaborate with data scientists and business analysts to translate business requirements into data solutions.

Benefits

Opportunity to work with cutting-edge technologies in a dynamic cloud environment.
Collaborative work culture with cross-functional teams.
Exposure to advanced analytics and business intelligence initiatives.
Professional development opportunities, including cloud certifications.
Flexible work arrangements in a progressive tech hub like Irving, Texas.

Full Job Description

Role description

Job Title: PySpark Developer

Work Location: Irving, Texas

Job Summary

We are seeking a highly skilled and motivated Data Engineer to play a pivotal role in designing, building, and optimizing our next-generation scalable data pipelines. This position requires expertise in processing massive datasets using cutting-edge technologies like Apache Spark, PySpark, and Hive within a dynamic cloud environment. Your primary objective will be to ensure the utmost data reliability, speed, and efficiency, providing a robust foundation for downstream business intelligence and advanced analytics initiatives.

Key Responsibilities

Data Pipeline Development and Maintenance: Design, build, and maintain highly scalable and efficient ETL/ELT data pipelines utilizing PySpark and Spark SQL for complex data transformations.
Cloud Data Infrastructure Management: Deploy, manage, and scale critical data infrastructure components on leading cloud platforms such as Amazon Web Services (AWS; e.g., EMR, Glue), Microsoft Azure (e.g., Databricks, Synapse), or Google Cloud Platform (GCP).
Data Warehousing and Storage Optimization: Strategically manage data layout, partitioning, and indexing within Apache Hive and various cloud data lake solutions to optimize performance and accessibility.
Performance Tuning and Optimization: Proactively identify and resolve performance bottlenecks in Spark jobs, leveraging Spark UI for in-depth analysis, effectively managing data skewness, and optimizing memory utilization.
Diverse Data Integration: Develop robust solutions for ingesting high-volume and diverse datasets from both structured relational databases and unstructured flat files into our data ecosystem.
Automated Workflow Orchestration: Implement and manage automated data workflows using industry-standard scheduling tools like Apache Airflow or platform-native schedulers, ensuring timely and reliable data delivery.
Strategic Collaboration: Partner closely with data scientists, business analysts, and cross-functional enterprise teams to translate complex business requirements into technically sound and efficient data solutions.

Required Core Technical Skills

Big Data Frameworks Expertise: Demonstrated high proficiency in Apache Spark architecture, including a deep understanding of drivers, executors, and Directed Acyclic Graphs (DAGs).
Advanced Programming: Exceptional coding skills in Python and extensive experience with the PySpark API for developing intricate data transformations and processing logic.
Querying and Schema Management: Strong command of HiveQL and ANSI SQL, coupled with expertise in data partitioning techniques and effective schema definition.
Optimized Storage Formats: In-depth understanding of and practical experience with optimized big data storage file formats such as Parquet, ORC, and Avro.
Cloud Ecosystem Development: Hands-on development experience utilizing cloud-native big data utilities (e.g., AWS EMR, Azure Databricks) within major cloud platforms.
Data Warehousing Fundamentals: Solid foundation in Dimensional Data Modeling, including Star and Snowflake schemas, and practical experience with Data Lake concepts and implementation.

Preferred Qualifications

CI/CD and DevOps Automation: Experience with Continuous Integration/Continuous Deployment (CI/CD) practices and automation tools like Git, Jenkins, or Ansible.
NoSQL Database Integration: Exposure to and experience with NoSQL databases such as HBase, Cassandra, or MongoDB.
Professional Cloud Certifications: Relevant professional cloud certifications (e.g., AWS Certified Data Engineer, Microsoft Certified: Azure Data Engineer Associate) are highly valued.

About LTIMindtree

LTIMindtree is a global technology consulting and digital solutions company that enables enterprises across industries to reimagine business models, accelerate innovation, and maximize growth by harnessing digital technologies. As a digital transformation partner to more than 700 clients, LTIMindtree brings extensive domain and technology expertise to help drive superior competitive differentiation, customer experiences, and business outcomes in a converging world. Powered by nearly 90,000 talented and entrepreneurial professionals across more than 30 countries, LTIMindtree, a Larsen & Toubro Group company, combines the industry-acclaimed strengths of the erstwhile Larsen and Toubro Infotech and Mindtree in solving the most complex business challenges and delivering transformation at scale.

Industry

Information Technology

* Ladders Estimates
