$80K–$100K *
As a Data Engineer, you will help us build secure, scalable, and reliable infrastructure and pipelines to support our growing data and machine learning needs. In this role, you will work as part of a small multi-disciplinary team to establish and evangelize a data- and AI-driven culture. This team works closely with the product team and also supports non-engineering functions like Marketing and Business Strategy.
To thrive in this role, you work well in cross-functional teams and enjoy collaborating. You understand the business impact of your work and enjoy measuring and presenting it. You enjoy working with product management and other stakeholders to find the best solution to the problem at hand, iterating on it while balancing technical complexity against delivering customer value in a timely way.
Our platform and applications run on Google Cloud. Our current pipelines are built with Airflow and Python, using MongoDB and PostgreSQL as the data store and warehouse. We are building a next-generation data lake on GCP. You will build infrastructure for real-time and batch pipelines that ingest and transform data from a variety of sources. You will have the opportunity to experiment with new frameworks and paradigms, and the freedom to put cutting-edge tech into production to shape the future of digital health!
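To give a flavor of the day-to-day work, a batch pipeline step in this kind of stack boils down to extract, transform, load. A minimal stdlib-only sketch follows; the event records, table name, and schema are purely illustrative, and in practice the source would be MongoDB or an external API and the step would run as an Airflow task:

```python
import sqlite3

def extract():
    """Stand-in source; a real pipeline would read from MongoDB or an API."""
    return [
        {"user_id": 1, "event": "signup", "ts": "2021-01-01T10:00:00"},
        {"user_id": 2, "event": "login", "ts": "2021-01-01T11:30:00"},
    ]

def transform(rows):
    """Normalize records: split the ISO timestamp into date and time parts."""
    out = []
    for row in rows:
        date, time = row["ts"].split("T")
        out.append((row["user_id"], row["event"], date, time))
    return out

def load(rows, conn):
    """Load transformed rows into a warehouse table (SQLite stands in here)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events "
        "(user_id INT, event TEXT, date TEXT, time TEXT)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    conn.commit()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load(transform(extract()), conn)
    print(conn.execute("SELECT COUNT(*) FROM events").fetchone()[0])
```

In production, each of these functions would typically become its own Airflow task so that failures can be retried and monitored independently.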
In this role, you will:
Translate business needs into data and analytics requirements, with attention to detail
Utilize a variety of distributed computing platforms and tools to build scalable ETL / machine learning pipelines to meet the team's objective
Analyze, tune, troubleshoot, and support the data infrastructure, ensuring the performance, integrity, and security of data
Use sound agile development practices (testing, code review, etc.) to develop and deliver data products
Build and maintain consistent, secure, and performant data pipelines
Provide tooling for product analytics and data science
You're a great fit if you have:
Experience developing software for data pipelines and big-data technologies
Experience with a cloud data stack, preferably GCP
Experience with orchestration tools such as Airflow
Experience developing pipelines in Python
Experience implementing deployments, error handling, and monitoring for scalable pipelines
Experience performing root-cause analysis of production issues, performance tuning, and optimization
We believe that diverse teams perform better, and that fostering an inclusive work environment is a key part of growing a successful team. We welcome people of diverse backgrounds, experiences, and perspectives. We are an equal opportunity employer, and we are committed to working with applicants requesting accommodation at any stage of the hiring process.
Valid through: 1/20/2021