3-5 years of experience in data engineering, specifically with Apache Spark and Python
Proficiency in developing large-scale data processing systems
Strong understanding of data ingestion, transformation, and analysis
Ability to collaborate with cross-functional teams for data requirements
Experience in optimizing data workflows for performance and reliability
Responsibilities
Develop and maintain large-scale data processing systems using Apache Spark
Write efficient and reusable Python code for data operations
Collaborate with teams to understand and deliver data pipelines
Monitor and troubleshoot pipeline issues for optimal performance
Mentor junior team members on Python and Spark practices
Analyze data sets for trends to support business decisions
Stay updated on emerging technologies to enhance data infrastructure
Benefits
Opportunity to work with cutting-edge big data technologies
Collaborative environment with cross-functional teams
Mentorship opportunities for career development
Engagement in innovative projects that shape business decisions
Focus on maintaining high data quality and integrity
Full Job Description
Role description
Job Title: Big Data Developer
Work Location: Irving, Texas
Job Summary
Seeking a Senior Python Data Engineer with 3 to 5 years of experience, specializing in Apache Spark and Python, to design and implement scalable data solutions.
Job Description
Develop and maintain large-scale data processing systems using Apache Spark
Write efficient, reusable, and testable Python code for data ingestion, transformation, and analysis
Collaborate with cross-functional teams to understand data requirements and deliver robust data pipelines
Optimize data workflows for performance and reliability in a distributed computing environment
Participate in code reviews and contribute to best practices for data engineering and Python development
Ensure data quality and integrity through rigorous testing and validation processes
Roles and Responsibilities
Design, build, and deploy scalable data pipelines leveraging Apache Spark and Python
Analyze complex data sets to identify trends, patterns, and insights to support business decisions
Monitor and troubleshoot data pipeline issues to ensure high availability and performance
Mentor junior team members and provide technical guidance on Python and Spark best practices
Collaborate with data scientists, analysts, and other stakeholders to deliver data-driven solutions
Stay updated with emerging technologies and recommend improvements to existing data infrastructure