4-6 years of experience in ETL and data processing
Proficient in Unix, SQL, and Shell scripting
Strong expertise in Apache Spark and designing scalable pipelines
Hands-on experience with automation frameworks for testing
Ability to manage multiple modules and deliver on time
Experience with data validation frameworks and quality checks
Knowledge of CI/CD pipelines and coding standards
Responsibilities
Develop automation frameworks for testing (about 70% of effort)
Conduct manual testing (the remaining 30% of effort) until full automation is achieved
Design and implement Spark-based data processing pipelines
Perform complex SQL queries and manage data workflows
Ensure timely data delivery and implement automated validation controls
Execute data quality checks and monitoring
Review code and adhere to source management standards
Benefits
Flexible work hours
Professional development opportunities
Collaborative team environment
Access to cutting-edge technology and tools
Health and wellness programs
Full Job Description
Role description
Job Title: PySpark Developer
Work Location: Irving, Texas
Job Summary
This role is on the Quality Engineering team. About 70 percent of the effort goes to developing automation frameworks for testing; the remaining 30 percent goes to manual testing until testing is fully automated.
Experience: 4 to 6 years. Candidates must have strong technical experience and be able to provide technical solutions for multiple modules in parallel as needed, bringing each task to closure on time.
Unix, SQL, and shell scripting experience is a must-have
Expertise in designing and developing scalable Apache Spark ETL-based data processing pipelines
Strong command-line knowledge of Unix/Linux, with shell scripting in Bash, KornShell, or Perl and file processing using awk scripts
Expertise in SQL querying and complex joins
Experience implementing comprehensive Spark-based data validation frameworks when transforming large volumes of financial data within the project lifecycle
Expertise in complex data workflows with Apache Airflow, managing task dependencies, SLAs, etc., to ensure timely data delivery and corresponding automated validation controls
Strong analytical skills and expertise in Spark SQL for data analysis and validation, ensuring the delivery of clean, query-ready datasets for business consumption
Expertise in data quality checks and monitoring
Hands-on experience with automation framework design for ETL and APIs
Subject-matter expert (SME) in data analysis, database testing, and messaging queues