As a Data Engineer in the Information Delivery Group, you will be responsible for all aspects of data delivery, primarily developing large-scale data pipelines and cloud-based big data solutions supporting structured and unstructured data, including streaming. You will partner with the Data Science and Analytics teams to influence ways to achieve business priorities and provide leadership through innovation, critical thinking, collaboration, and proactive risk management.
- Collaborates with business partners as a hands-on technologist, responsible for leading end-to-end data solution delivery to solve complex data problems
- Develops and displays thought leadership through a deep understanding of supported business capabilities, and applies industry trends/products/solutions to solving underlying business problems
- Performs data analysis on very complex datasets that span multiple source systems, develops a deep understanding of complex relationships, and serves as a subject matter expert
- Leads the analysis and design of complex data solutions, across all layers of the data platform, including data models/storage, pipelines, data science models, and visualization
- Leverages and promotes existing frameworks and influences the development of new data frameworks and patterns. Facilitates, provides, and receives code reviews, taking ownership of outcomes
- Drives supportability and the quality of the overall solution, while also adhering to functional and non-functional requirements, ensuring solutions are automated and tested early and often
- Collaborates with business, technology, and vendor partners to identify solution options, analyze pros and cons, make recommendations, and influence final decisions while tying multiple products/services together holistically
- Proactively keeps up to date on data-related technologies that could drive positive change, and seeks opportunities to share that knowledge with others
- Plans, estimates, and proposes a work breakdown structure for the entire solution, including innovative approaches for managing the work, solution, and/or risks. Accountable for the scope, timeline, and cost of the proposed solution
- Proactively identifies and communicates issues, risks, and progress to business and IT stakeholders, providing options and next steps
- Troubleshoots and supports the entire solution, across all layers of the data platform
- BS in Computer Science or a related technical field
- 3+ years of relevant work experience in analytics, data engineering, complex ETL, BI, or a related field
- 2+ years of experience implementing big data solutions and automating enterprise-scale data pipelines leveraging Hadoop, Apache Spark, etc.
- Coding proficiency in Python and other modern programming languages
- Experience writing and optimizing advanced SQL queries in a business environment with large-scale, complex datasets
- Expert knowledge of large-scale/distributed database systems and big data platforms, such as Hadoop, AWS Redshift, Microsoft MPP, and Snowflake
- Excellent understanding of the big data stack, including Hive, HBase, Oozie, Airflow, and MapReduce
- Expert technical knowledge of ETL/ELT technologies and a proven implementation track record leveraging complex SQL, ETL tools, and API/file-based endpoints
- Strong experience with cloud-first design, preferably AWS or Azure, and their corresponding services and components (data lakes, MPP databases, autoscaling, container orchestration, etc.)
- Experience in troubleshooting infrastructure bottlenecks at the storage, network, and/or compute layers
- Strong leadership, interpersonal, and problem-solving skills, with the ability to continually learn new concepts and technologies and effectively apply them