Data Engineer

Shutterstock

Montreal, QC

Industry: Imaging & Photography


5 - 7 years

Posted 396 days ago

Job Summary

The Data Infrastructure Team at Shutterstock is responsible for our data transport pipeline (clickstream, events, metrics, etc.), the long-term storage and availability of that data, and providing access to insights. We're looking for an exceptional engineer with a passion for enabling data for consumption across our organization. We believe in providing access that encourages data interaction, exploration, and analysis.



  • Build scalable applications that collect high volume data
  • Build applications and tools that enable analysis of data
  • Develop and improve tools to load and process our raw data
  • Help manage and scale our real-time data processing and analytics systems
  • Work closely and collaboratively in an Agile environment with our developers and product teams to analyze issues and find new insights into our business and operations

Basic Qualifications:

  • Basic understanding of Linux systems operations
  • Experience with Hadoop ecosystem tools: HDFS/S3, Hive, MapReduce, YARN, Pig, Sqoop, Oozie
  • 5+ years of experience developing software in one of the following languages: Python, Java/Scala (or other JVM language)
  • Experience with a stream processing technology (Spark, Storm, Samza, Flink, etc.)
  • Understanding of structured and unstructured data design/modeling
  • Understanding of columnar data storage formats
  • Experience with configuration management tools (Puppet, Chef, etc.)
  • Experience with or exposure to Terraform, Kubernetes, Docker, etc.

Bonus Qualifications:

  • Experience with ETL processes/software (Pentaho, Scalding, Cascading, Luigi, Airflow, etc.)
  • Experience with Cassandra, Vertica
  • Experience with Kafka, Kinesis, Druid
  • Experience in predictive analytics
  • Open source contributions