TracFone is seeking an individual who will be responsible for designing, developing, and maintaining applications in Spark/Scala and Python to construct and optimize our Data Lake and data pipeline architecture.
The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up. They should be able to monetize data by experimenting with various Machine Learning algorithms to create prototypes for predictive models and productionize those prototypes. The Data Engineer will support our software developers, database architects, data analysts, and data scientists on data initiatives and will ensure that an optimal data delivery architecture is consistent across ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems, and products.
- Build data streams using Scala/Python to ingest, load, transform, group, logically join, and assemble data ready for data analysis, analytics, reporting, next-best-action, and next-best-offer.
- Pipeline data using cloud or on-premises technologies: AWS, Google Cloud, Big Data/Hadoop, AI/Deep Learning APIs, SQL/NoSQL, and unstructured databases.
- Build predictive models with Apache Spark Machine Learning libraries by mining data from a data lake or other sources, utilizing Spark/Scala and other interpreted languages such as Python.
- A Bachelor's Degree in Computer Science or equivalent from an accredited college.
- Must have 5+ years of experience in Java development.
- Must have 3+ years of experience in developing applications on a Big Data platform (AWS).
- Must have 3+ years of experience in Scala, Python, and PySpark.
- Experience in building various Machine Learning models, such as recommendation engines or automated customer scoring systems, using Apache Spark Machine Learning libraries is preferred.
- Strong knowledge of AWS Cloud systems and AWS data-related services (DMS, Glue, Athena, Lambda, Redshift, DynamoDB, SageMaker).
- Strong knowledge of Scala, Java, Hadoop, Kafka, Python, and PySpark.
- Strong knowledge of Unix/AIX and Windows operating systems, and of standard concepts, practices, and procedures within the relational database field.