4-6 years of experience in development and quality engineering
Proficiency in Unix, SQL, and Shell scripting
Strong expertise in Apache Spark for ETL data processing
Solid command of Bash, Korn shell, or Perl scripting
Experience with complex SQL queries and data workflows
Familiarity with data quality checks and monitoring
Hands-on experience in automation framework design for ETL and API
Responsibilities
Develop and automate testing frameworks for quality engineering
Perform manual testing until full automation is achieved
Design scalable data processing pipelines using Apache Spark
Implement data validation frameworks for large financial datasets
Manage data workflows and task dependencies with Apache Airflow
Conduct data analysis and ensure datasets are business-ready
Collaborate on coding standards, code reviews, and CI/CD processes
Benefits
Opportunity to work within a dynamic Quality Engineering team
Engagement with large-scale data processing and financial data validation
Hands-on experience with cutting-edge technologies like Apache Spark and Airflow
Long-term potential for role evolution from manual to automated testing
Additional professional development opportunities in a tech-driven environment
Full Job Description
Role description
Job Title: PySpark Developer
Work Location: Irving, Texas
Job Summary
This role is for the Quality Engineering team. About 70 percent of the effort will go toward developing automation frameworks for testing; the remaining 30 percent will be manual testing until it is fully automated.
Experience: 4 to 6 years. Must have strong technical experience, be able to provide technical solutions for multiple modules in parallel as needed, and bring tasks to closure on time.
Unix, SQL, and shell scripting experience is a must-have
Expertise in designing and developing scalable Apache Spark ETL-based data processing pipelines
Strong command-line knowledge of Unix/Linux, with shell scripting in Bash, Korn shell, or Perl, and file processing using awk scripts
Expertise in SQL querying and complex joins
Implementing comprehensive Spark-based data validation frameworks, transforming large volumes of financial data within the project lifecycle
Expertise with complex data workflows in Apache Airflow, managing task dependencies, SLAs, etc., to ensure timely data delivery and corresponding automated validation controls
Strong analytical skills and expertise in Spark SQL for data analysis and validation, ensuring the delivery of clean, query-ready datasets for business consumption
Expertise in data quality checks and monitoring
Hands-on experience with automation framework design for ETL and API
Subject matter expert (SME) in data analysis, database testing, and messaging queues