Shutterstock is looking for an experienced data engineer! Our team’s mission is to push the boundaries of multimedia search, enabling customers to find content faster and more easily. You will work on improving our core search technology, collaborating along the way with an extremely talented and passionate team of researchers, data scientists, and engineers. You will help wrangle large amounts of data to ensure the successful launch of innovative new customer-facing features.
- You are excited to take ownership of important projects and to work closely with the rest of the team to see work through to completion.
- You will design and build distributed, scalable data pipelines and ETLs.
- You will help extract value and understanding from large volumes of data from multiple sources.
- You will work with engineers, data scientists and researchers to drive ideas from the rapid prototyping phase all the way through to serving live traffic at scale.
- Passionate about writing and maintaining high-quality software that solves meaningful real-world problems at scale
- Ability to write efficient queries against multiple types of data stores (both SQL and NoSQL)
- Experience architecting, building, and maintaining data pipelines and ETLs
- Strong engineering discipline and the ability to write clean code in Python or Java
- Knowledge of the Hadoop ecosystem and related technologies (MapReduce, Pig, HBase, Hive)
- Familiarity with large-scale data processing (e.g., Apache Spark)
- Prior work with streaming platforms (e.g., Apache Kafka)
- Prior experience with common data serialization formats, file formats, and compression (Apache Parquet, Avro, Thrift, Protocol Buffers, ORC, etc.)
- Passionate about helping your teammates grow and continually expanding the team’s technical knowledge
- Experience digging into complex issues and solving them with simple solutions
- BS or MS in Computer Science or equivalent experience (PhD a plus)
- 3+ years of development experience
- Experience with AWS, Docker, and Kubernetes
- Familiarity with AWS data tools such as Amazon Redshift or Amazon Athena
- Experience with ETL workflow frameworks (Airflow, Luigi, Pinball)