Required Technical Expertise
Participate in technical planning and requirements gathering, including the design, coding, testing, troubleshooting, and documentation of big data-oriented software applications. Responsible for ingesting, maintaining, improving, cleaning, and manipulating data in the business's operational and analytics databases, and for troubleshooting any existing issues.
Implement, troubleshoot, and optimize distributed solutions based on modern big data technologies such as Hive, Hadoop, Spark, Elasticsearch, Storm, and Kafka, in both on-premises and cloud deployment models, to solve large-scale processing problems.
Experience with Big Data & Analytics solutions such as Hadoop, Pig, Hive, Spark, Spark SQL, Storm, AWS (EMR, Redshift, S3, etc.), Azure (HDInsight, Data Lake Design), and other technologies
Exposure to the MS Azure platform, healthcare, and analytics, with the technical leadership skills to steer the development team and the business in the right direction
Design, enhance, and implement an ETL/data ingestion platform on the cloud.
Strong data warehousing skills, including data cleanup, ETL, ELT, and handling scalability issues for an enterprise-level data warehouse
Create ETLs/ELTs to take data from various operational systems and create a unified/enterprise data model for analytics and reporting.
Create and maintain ETL specifications and process documentation to produce the required data deliverables (data profiling, source to target maps, ETL flows).
Strong data modeling/design experience. Experience with a data modeling tool (ER/Studio).
Capable of quickly investigating, becoming familiar with, and mastering new data sets
Strong troubleshooting and problem-solving skills in large data environments
Experience with building a data platform on the cloud (AWS or Azure)
Experience in using Python, Java, or any other language to solve data problems
Experience in implementing SDLC best practices and Agile methods
Knowledge of Big Data concepts and technologies such as MDM, Hadoop, Data Virtualization, and Reference Data/Metadata Management preferred.
Experience in working with Team Foundation Server/JIRA/GitHub and other code management toolsets
Strong hands-on knowledge of programming languages such as Java, Scala, and Python
Healthcare domain knowledge is a plus
Years of Experience
Bachelor's degree with a minimum of 8 years of relevant experience, or equivalent.
8+ years of industry experience in data architecture/Big Data/ETL environments.
4+ years of experience with ETL design using tools such as Informatica, Talend, Oracle Data Integrator (ODI), Dell Boomi, or equivalent.
2+ years of experience with Big Data & Analytics solutions such as Hadoop, Pig, Hive, Spark, Spark SQL, Storm, AWS (EMR, Redshift, S3, etc.), Azure (HDInsight, Data Lake Design), and other technologies
2+ years of experience in building and managing hosted big data architecture, with toolkit familiarity in Hadoop with Oozie, Sqoop, Pig, Hive, HBase, Avro, Parquet, Spark, and NiFi