Senior Consultant, Data Engineer (Python/PySpark/AWS)

Infinitive Inc

$90K — $154K *
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Bachelor's or Master's degree in Computer Science or related field.
  • Proven experience as a Data Engineer or similar role.
  • Strong programming skills in Python and expertise in PySpark.
  • Hands-on experience with ETL tools and processes.
  • Familiarity with CI/CD tools like Jenkins or GitHub Actions.
  • Solid understanding of data modeling and database design.
  • Excellent problem-solving and communication skills.

Responsibilities

  • Collaborate with teams to design data architecture solutions.
  • Develop data models to optimize storage and retrieval.
  • Implement ETL processes for extracting, transforming, and loading data.
  • Ensure data quality and integrity throughout the ETL pipeline.
  • Utilize Python and PySpark for data processing scripts.
  • Integrate data from multiple systems for a unified analytical view.
  • Design streaming and batch workflows for data processing.
  • Implement CI/CD pipelines to ensure reliable data pipelines.

Benefits

  • Opportunity to work in a dynamic team environment.
  • Engagement in cutting-edge data engineering projects.
  • Collaboration with skilled professionals in the field.
  • Professional development and skill enhancement opportunities.
  • Potential for flexible working arrangements.

Full Job Description

*Candidates must be local to the Washington D.C. metro area.

We are seeking a highly skilled and motivated Data Engineer to join our dynamic team. As a Data Engineer, you will play a crucial role in designing, developing, and maintaining our clients' data infrastructure. Your expertise in Python, PySpark, ETL processes, and CI/CD (Jenkins or GitHub Actions), along with your experience with both streaming and batch workflows, will be essential in ensuring the efficient flow and processing of data to support our clients.

Responsibilities

Data Architecture and Design:
  • Collaborate with cross-functional teams to understand data requirements and design robust data architecture solutions.
  • Develop data models and schema designs to optimize data storage and retrieval.

ETL Development:
  • Implement ETL processes to extract, transform, and load data from various sources.
  • Ensure data quality, integrity, and consistency throughout the ETL pipeline.

Python and PySpark Development:
  • Utilize your expertise in Python and PySpark to develop efficient data processing and analysis scripts.
  • Optimize code for performance and scalability, keeping up to date with the latest industry best practices.

Data Integration:
  • Integrate data from different systems and sources to provide a unified view for analytical purposes.
  • Collaborate with data scientists and analysts to implement solutions that meet their data integration needs.

Streaming and Batch Workflows:
  • Design and implement streaming workflows using PySpark Streaming or other relevant technologies.
  • Develop batch processing workflows for large-scale data processing and analysis.

CI/CD Implementation:
  • Implement and maintain continuous integration and continuous deployment (CI/CD) pipelines using Jenkins or GitHub Actions.
  • Automate testing, code deployment, and monitoring processes to ensure the reliability of data pipelines.

Qualifications

  • Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
  • Proven experience as a Data Engineer or similar role.
  • Strong programming skills in Python and expertise in PySpark for both batch and streaming data processing.
  • Hands-on experience with ETL tools and processes.
  • Familiarity with CI/CD tools such as Jenkins or GitHub Actions.
  • Solid understanding of data modeling, database design, and data warehousing concepts.
  • Excellent problem-solving and analytical skills.
  • Strong communication and collaboration skills.

Preferred Skills

  • Knowledge of cloud platforms such as AWS, Azure, or Google Cloud.
  • Experience with version control systems (e.g., Git).
  • Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes).
  • Understanding of data security and privacy best practices.

Infinitive is required by law in some jurisdictions to include a reasonable estimate of the compensation range for this role. The determination of this range includes various factors, including but not limited to skill set, level, experience, relevant training, and licensure and certifications. Compensation decisions are dependent on the facts and circumstances of each case. A reasonable estimate of the current range for this role in the U.S. is $90,000.00 - $154,000.00.
