5-7 years' experience • Financial Services
Team player focused on the analysis, development and maintenance of the Data Lake / Data Hub, as well as the feeds to/from all subscriber applications, using the Hadoop ecosystem.
The role involves hands-on development and support of integrations with multiple systems, and ensuring the accuracy and quality of data by implementing business and technical reconciliations.
Development and support will be required for the Data Lake / Hub implementation on the Hadoop environment.
- Hands-on development and maintenance of the Hadoop platform and its associated components for data ingestion, transformation and processing
- Hands-on development and maintenance of integrations with various bespoke Financial systems (applications and Management Information Systems), as well as Underwriting and Claims Management systems
- Must always exhibit a positive, “can-do” attitude
- Support senior developers, the team lead, architects and project staff to deliver high-level or certified (detailed) estimates
- Ensure data quality and accuracy by implementing business and technical reconciliations via scripts and data analysis (a reconciliation sketch follows this list)
- Develop and support RDBMS objects and code for data profiling, extraction, load and updates
- Provide and maintain documentation for all developed objects and processes, within strict timelines
- Perform integration testing of deliverables
- Develop and run test scripts to ensure code quality and integrity
- Use a source code control repository (GitHub or equivalent for the Hadoop ecosystem)
- Follow data / integration design patterns and architecture as prescribed by the architects
- Support new initiatives and/or recommendations for database growth and integration
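The reconciliation responsibility above lends itself to a short illustration. The sketch below uses Spark with Scala (both appear in the requirements) to compare row counts and a per-year control total between a source extract and a curated Hive table. It is a minimal sketch only: the paths, table names and column names (`/datalake/raw/policy_extract`, `curated.policy`, `gross_written_premium`, `underwriting_year`) are hypothetical and not taken from any actual implementation.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object PolicyReconciliation {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("policy-reconciliation")
      .enableHiveSupport()
      .getOrCreate()

    // Source extract landed in the raw zone (hypothetical path and schema).
    val source = spark.read.parquet("/datalake/raw/policy_extract")
    // Curated Hive table consumed by subscriber applications (hypothetical name).
    val target = spark.table("curated.policy")

    // Technical reconciliation: row counts between source and target should match.
    println(s"Row counts - source: ${source.count()}, target: ${target.count()}")

    // Business reconciliation: gross written premium should agree per underwriting year.
    val srcTotals = source.groupBy("underwriting_year")
      .agg(sum("gross_written_premium").alias("src_gwp"))
    val tgtTotals = target.groupBy("underwriting_year")
      .agg(sum("gross_written_premium").alias("tgt_gwp"))

    val breaks = srcTotals
      .join(tgtTotals, Seq("underwriting_year"), "full_outer")
      .filter(abs(coalesce(col("src_gwp"), lit(0)) - coalesce(col("tgt_gwp"), lit(0))) > 0.01)

    breaks.show(false)  // any rows returned are reconciliation breaks to investigate

    spark.stop()
  }
}
```

In practice the break report would typically be persisted or fed to an alerting process rather than printed, but the shape of the check (count comparison plus a business control total) is the same.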
- Minimum of five years' experience working as a productive member of a development team
- Bachelor’s degree in Computer Science or other computer-related field
- Hands-on experience in the development and support of Data Lakes using the Hadoop platform
- Experience working both independently and collaboratively with various teams and global stakeholders (Business Analysts / Architects / Support / Business) in an agile approach, whether on projects or data quality issues
- Deep exposure to the Hadoop ecosystem (including HDFS, Spark, Sqoop, Flume, Hive, Impala, MapReduce, Sentry, Navigator)
- Experience in Hadoop data ingestion using ETL tools (e.g. Talend, Pentaho, Informatica) and in Hadoop transformation (including MapReduce and Scala)
- Experience developing complex / enterprise data warehouses implemented over a standard RDBMS (preferably using SQL Server Integration Services (SSIS) and SQL Server objects) preferred
- Experience working in Unix / Linux environments as well as Windows environments
- Experience in Java, Scala or Python, plus exposure to web application servers, preferred
- Exposure to NoSQL (HBase) preferred
- Experience in creating Low Level Designs preferred
- Prior experience in the analysis and resolution of data quality and integration issues
- Experience in providing and maintaining technical documentation, specifically Data Mapping and Low Level / ETL (or ELT) Design Documentation
- Experience with Continuous Integration preferred
- Experience with large-scale (>1 TB raw) data processing, ETL and stream processing preferred
- Experience with database queries (SQL) and performance tuning
- Possess and demonstrate deep knowledge of data warehouse concepts, big data, architecture, techniques, various design alternatives, and overall data warehouse strategies.
- Possess and demonstrate deep knowledge of the Hadoop Ecosystem
- Knowledge of the Type 2 dimension data model and data warehouse ETL techniques for historical data (see the sketch below)
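As an illustration of the Type 2 dimension technique named in the last requirement, the sketch below expires changed dimension rows and appends new versions using Spark with Scala. It is a sketch under stated assumptions: the table and column names (`warehouse.dim_customer`, `staging.customer`, `risk_rating`, `valid_from`/`valid_to`/`is_current`) are hypothetical, the staging snapshot is assumed to carry the same business columns as the dimension, and the write-back strategy is left open.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Minimal Type 2 dimension load: expire changed rows and append new versions.
object CustomerDimensionScd2 {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("customer-dim-scd2")
      .enableHiveSupport()
      .getOrCreate()

    val dim = spark.table("warehouse.dim_customer")  // hypothetical dimension table
    val stg = spark.table("staging.customer")        // hypothetical staging snapshot
                                                     // (same business columns as the dimension)
    val current = dim.filter(col("is_current") === true)

    // Rows whose tracked attribute has changed since the current dimension version.
    val changed = stg.alias("s")
      .join(current.alias("d"), col("s.customer_id") === col("d.customer_id"))
      .filter(col("s.risk_rating") =!= col("d.risk_rating"))

    // Expire the superseded versions as of the batch date.
    val expired = changed.select(col("d.*"))
      .withColumn("valid_to", current_date())
      .withColumn("is_current", lit(false))

    // Append new versions effective from the batch date, open-ended until superseded.
    val newVersions = changed.select(col("s.*"))
      .withColumn("valid_from", current_date())
      .withColumn("valid_to", lit(null).cast("date"))
      .withColumn("is_current", lit(true))

    // In practice the result would be written back with a merge or partition-overwrite
    // strategy appropriate to the storage format (e.g. Hive ACID or partition swap).
    expired.unionByName(newVersions).show(false)

    spark.stop()
  }
}
```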