3-5 years of experience in Python and Apache Spark
Strong understanding of data processing and ETL methodologies
Experience designing and implementing scalable data solutions
Proficient in writing reusable and efficient Python code
Knowledge of distributed computing environments and data workflows
Responsibilities
Develop and maintain large-scale data processing systems using Apache Spark
Write efficient Python code for data ingestion, transformation, and analysis
Collaborate with cross-functional teams to gather data requirements
Optimize data workflows for performance and reliability
Monitor and troubleshoot issues in data pipelines
Mentor junior team members in Python and Spark best practices
Stay updated on emerging technologies to enhance data infrastructure
Benefits
Opportunity for professional growth and mentorship
Engagement with cross-functional teams and diverse projects
Work in an innovative environment focused on technology advancements
Exposure to cutting-edge data engineering practices
Full Job Description
Role description
Job Title: Big Data Developer
Work Location: Mississauga, Canada
Job Summary
Seeking a Senior Python Data Engineer with 3 to 5 years of experience, specializing in Apache Spark and Python, to design and implement scalable data solutions.
Job Description
Develop and maintain large-scale data processing systems using Apache Spark
Write efficient, reusable, and testable Python code for data ingestion, transformation, and analysis
Collaborate with cross-functional teams to understand data requirements and deliver robust data pipelines
Optimize data workflows for performance and reliability in a distributed computing environment
Participate in code reviews and contribute to best practices for data engineering and Python development
Ensure data quality and integrity through rigorous testing and validation processes
Roles and Responsibilities
Design, build, and deploy scalable data pipelines leveraging Apache Spark and Python
Analyze complex data sets to identify trends, patterns, and insights to support business decisions
Monitor and troubleshoot data pipeline issues to ensure high availability and performance
Mentor junior team members and provide technical guidance on Python and Spark best practices
Collaborate with data scientists, analysts, and other stakeholders to deliver data-driven solutions
Stay updated with emerging technologies and recommend improvements to existing data infrastructure