We’re seeking an exceptional Data Engineer to join our team, someone who can not only design machine-based systems but also think creatively about the human interactions necessary to augment and train those systems.
As an ML Engineer, you will:
- Develop data infrastructure to ingest, sanitize and normalize a broad range of internal and external data such as journal citations, clinical trial data, and gene information
- Build infrastructure to help us scale up not only data ingest but also large-scale cloud-based machine learning
- Work with a range of structured and unstructured data sources
- Design innovative data-acquisition and labeling systems, leveraging tools and techniques such as crowdsourcing and novel active learning approaches
We are looking for someone who has:
- Experience with deep learning frameworks like TensorFlow or PyTorch
- Industry or academic experience working on a range of ML problems, particularly NLP
- Expert software development skills with a focus on building sound and scalable ML. Working experience with Python, Perl, etc.
- Excitement about bringing cutting-edge technologies and techniques to one of the most important and most archaic industries
- A passion for finding, analyzing, and incorporating the latest research directly into the production environment
- Good intuition for what good research looks like, and for where we should focus effort to maximize outcomes
Bonus if you have experience with:
- Developing and improving core NLP components, not just grabbing things off the shelf or applying what you learned from books
- Managing large-scale crowdsourced data labeling and acquisition (Amazon Mechanical Turk, CrowdFlower, etc.)
- Developing systems that perform or support machine learning, including work with NLP toolkits such as Stanford CoreNLP, OpenNLP, and Python’s NLTK
- Bioinformatics or statistical analysis tools