Building new pipelines, models, and code for onboarding a new data source or making a major change to an existing solution
Build with reusable components while implementing new requirements
Regression and performance testing with QA engineers
Routine bug fixes, data queries, job monitoring, fresh data loads, etc.
Creating technical documentation
Reflect on the current state and build the right automation solutions
Resolving technically advanced problems
Keeping up to date with modern data engineering technologies
May need to cross over to Projects or Support on an as-needed basis
Flexibility to provide after-hours support
5 years of experience with data ingestion and orchestration in Agile environments, including 3 years in data engineering, plus a Master's degree in Computer Science, or equivalent professional experience combined with a Bachelor's degree
Minimum 3 years of experience with AWS cloud and Python, with good hands-on experience in Node.js and Google technologies
Extensive experience with Python 3 (with some 2.7), Node.js, and Scala
Experience with Apache Spark and/or Apache Hadoop
AWS cloud experience: EC2, S3, Lambda, EMR, RDS, Redshift
Proficiency in the S3 data lake reference architecture
Ability to implement AWS KMS and role-based security
Experience with code version-control tools (Bitbucket, Git)
Good experience with Agile development processes such as Scrum
Insists on building reusable components and automating everything
Hands-on experience with the Google BigQuery data warehouse
Expertise in working on the command line in a Unix/Linux environment
Domain experience in media/entertainment processes
Understanding and awareness of information security
Nice to have:
Experience with AWS Data and Analytics technologies such as Glue, Athena, Spectrum, Data Pipeline
Terraform scripting of AWS resources
Salesforce DMP (Krux) data
Familiarity with Ad Sales data and processes
Experience operationalizing data science models
Experience scaling machine learning pipelines, Spark ML specifically