Data Engineer II

Condé Nast • New York, NY
Industry: Media • Less than 5 years of experience

Data Engineer

The Data Services team is seeking a Data Engineer with a passion for building data products that help create more engaging, personalized experiences for users across Condé Nast's properties. You will have the opportunity to build both product features and internal data tools, all while working with a diverse group of datasets: web events, ad streams, content and context models, and more. You will also get to work with the newest data technologies available. Above all, you will influence how users interact with Condé Nast's industry-leading journalism.

Primary Responsibilities

* Develop and maintain scalable data pipelines, with a focus on writing clean, fault-tolerant code
* Maintain various data stores and distributed systems, such as Hive and Presto
* Optimize data structures for efficient querying of those systems
* Work with internal and external data sources to ensure integrations are accurate, scalable and maintainable
* Collaborate with the data science team on implementing machine learning algorithms to facilitate audience intelligence and cross-brand personalization initiatives
* Collaborate with business intelligence/analytics teams on data mart optimizations, query tuning and database designs
* Execute proofs of concept to assess strategic opportunities and future data extraction and integration capabilities
* Define data models, publish metadata and establish best-practice querying standards

Required Skills

* 2+ years of data engineering and/or software development experience, preferably with experience using a language such as Python or Java
* Fluency in SQL (any variant)
* Experience with Hadoop and related technologies (Hive, Presto, Spark)
* Exceptional analytical, quantitative, problem-solving, and critical thinking skills
* A collaborative work style and a strong desire to work in a dynamic, fast-paced environment that requires flexibility and the ability to manage multiple priorities

Desirable Skills

* Experience with workflow/ETL tools and schedulers; e.g. Luigi, Airflow
* Experience with AWS tools; especially EMR, S3, Lambda
* Experience with GCP tools; e.g. BigQuery, DataFlow, PubSub
* Experience with Apache Beam
