Database Administration, DevOps & Site Reliability
Here at Prodigy, we are working hard to achieve our mission of helping every child in the world to LOVE learning. Our Data team is scaling rapidly as we continue to hit our product and growth milestones, and it’s an exciting time for the company! The team is responsible for all aspects of data ingestion, storage, transformation, and analysis, using modern tools and environments such as Spark, Airflow, Snowflake, Periscope, Kinesis, and AWS. You will work alongside our development and data science teams to help build and manage all of our data pipelines.
What You’ll Do:
Create and manage key business data pipelines
Work with stakeholders including the Executive, Product, Data, and Design teams to assist with data acquisition, data-related technical issues, and other analytics needs
Work cross-functionally to explore and propose solutions to business problems that can be addressed using insights from data
Create data tools for analytics and data science team members that help them build and optimize our product into an innovative industry leader
Support the development of new solutions for batch, real-time, and analytics use cases that align with business requirements
Maintain and troubleshoot the infrastructure built for optimal extraction, transformation, and loading of data from a wide variety of data sources
Identify, design, and implement process improvements: automate manual processes, optimize data delivery, improve data reliability, efficiency, and quality, etc.
Build analytics tools that utilize the data pipeline to provide actionable insights into student learning, customer acquisition, operational efficiency, and other key metrics
Who You Are:
Working familiarity with a variety of storage mechanisms, including SQL and NoSQL databases, data warehouses, and data lakes
Experience working with the AWS cloud platform and related services
Experience building and optimizing data pipelines, architectures, and data sets
Experience with big data tools: Databricks, Spark, Kafka, etc.
Experience with data pipeline and workflow management tools, such as Airflow
Experience with real-time data processing and stream-processing systems: Kinesis, Spark Streaming, etc.
Experience in requirements analysis, design, implementation, and testing of software solutions, especially data-related ones, using Python, Scala, and/or other programming languages
A successful history of manipulating, processing, and extracting value from large, disconnected datasets
Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
Experience implementing backend data APIs using Node.js, Python, or other programming languages
Strong project management, organizational, and communication skills
University degree in Engineering, Computer Science, Statistics, or Mathematics; a graduate degree in a data science-related discipline is a strong asset