We’re tackling the difficult problem of providing dynamic and unpredictable analytics efficiently and at scale for a set of clients who all have very different data models and organizational needs. We’re excited not only to keep improving our existing product, but also to create innovative new technology that fundamentally changes the way the world thinks about cloud data pipelines operating on massive datasets.
You will work directly with clients to write custom ETL pipelines that onboard their data into the MarketDial application. Additionally, you'll collaborate with a small team of data scientists and software developers to build and continuously release features that increase the capability of our architecture. This includes extending our backend analytics jobs, ETL pipelines, and data APIs to provide accurate, timely results and access to billions of customer transactions.
Responsibilities:
- Build Python-based analytics tools that process customer data to provide timely results to our frontend applications.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Implement processes and systems to monitor data quality, ensuring production data is always accurate and available.
- Work closely with design, product, and operations to own our backend data systems through their entire lifecycle.
Requirements:
- Strong Python and SQL skills.
- Experience with cloud-native infrastructure and tools (Linux, Docker, Kubernetes).
- A DevOps mindset for continuously delivering value in a collaborative team. Comfortable with Git/GitHub workflows, code review, and CI/CD build automation.
- Experience with automated unit, integration, and data-quality testing.
- Passion for software craftsmanship and clean code.
- A love of continuous learning, improvement, and delivery.
- Comfort working and communicating with senior executives as well as business analysts.
Nice to have:
- Google Cloud Platform experience (BigQuery, Cloud Storage, Pub/Sub, GKE).
- Experience with statically typed languages, especially those relevant to data engineering (Scala/Kotlin/Java, Rust, Go).
- Experience with cloud data pipeline tools (Apache Beam, Scio, Spark, etc.).
- Statistics and machine learning experience.