Data Engineer

McKesson

Phoenix, AZ

Industry: Healthcare


8-10 years

Posted 24 days ago

The scope, breadth, and depth of data assets across McKesson's global enterprise are unparalleled. Together with a leading technology stack, MT Insights is equipped to drive valuable insights that support McKesson's ongoing growth.

Job Summary:

The Data Engineer position will be based in our Scottsdale, AZ office. As part of the McKesson Data & Analytics team, this individual will provide in-depth, advanced analytical insights while working in a collaborative environment to solve business problems. By providing critical analytics support, this person will work with Technologists, Data Scientists, and Data Engineers to develop insights and analytic models that help key stakeholders make better-informed decisions.

Job Description:

The Data Engineer will perform the following:

  • Project Support
    • Support a cross-functional team and provide in-depth data insights for complex business problems that can be approached with advanced analytics
    • Work closely with the other Data Engineers and Scientists to deliver results
    • Coordinate and collaborate with program managers and other internal stakeholders including gathering requirements and modeling criteria
    • Leverage tools and resources to plan, evaluate and execute strategic initiatives
    • Delegate and manage work effectively to ensure it is completed in a timely, quality fashion by internal and external partners
    • Continuously improve processes and operations
  • Advanced Analytics
    • Build scalable and reliable data engineering solutions for moving data efficiently across systems from various internal and external data sources, in both batch and real-time modes
    • Analyze and model structured data and implement/scale algorithms to support analysis, using advanced statistical and mathematical methods from statistics, data mining, econometrics, and operations research, along with distributed and parallel programming techniques
    • Troubleshoot, monitor, and performance-tune the software components of data science analytical solutions, RESTful web services, etc.
    • Rapidly perform exploratory data analysis; generate and test working hypotheses, and uncover interesting trends and relationships
    • Apply advanced analytics techniques to mine unstructured data, using methods such as document clustering, topic analysis, named entity recognition, and document classification (machine learning and natural language processing experience is a plus)
    • Use large-scale database mining tools and statistical languages to efficiently build approaches and execute analytical use cases; specific experience with Splunk, R, Python, Hive, Pig, Spark, SAS, Hadoop (MapReduce, HBase), Java, Trifacta, H2O, Tableau, and SQL is preferred. Experience with SAP and IBM DataStage is also beneficial.
    • Communicate results and educate others through reports and presentations
Requirements / Critical Skills:

  • Bachelor’s degree in Computer Science, Statistics, Mathematics, Engineering, Econometrics, or a related field, with a minimum of 8 years of experience in software engineering and/or software architecture; or an MS with a minimum of 5-6 years of related experience
  • Extensive experience (3+ years) designing and implementing Data Lakes or similar solutions, both on-premises and in the cloud (Azure/GCP)
  • 3+ years of experience with Spark, Hive, Kafka, Pig, ZooKeeper, Oozie, Flume, streaming, and similar distributed computing technologies; proficiency in analysis packages (e.g., Python, R, SAS)
  • Strong quantitative analysis background and experience working with large, complex data systems to aggregate, organize, and prepare data for use in business analysis
  • Strong mathematical background with deep knowledge in at least one of the following fields: statistics, data mining, operations research, econometrics, and/or information retrieval (image processing (OpenCV) and natural language processing are a plus)
  • Solid understanding of and experience in extracting, cleaning, preparing, and modeling data; experience with ETL tools such as IBM IIS and Azure Data Factory
  • Experience with command-line scripting, data structures, and algorithms along with parallel programming techniques; ability to work in a Linux environment is a plus.
  • Data visualization and/or PowerPoint presentation skills to effectively communicate insights are a plus
  • Proficiency in database languages (e.g., Hive, SQL), programming languages (e.g., Python, Scala or Java), and shell scripting; experience with web application development frameworks such as Spring and Spring Boot (for microservices)
  • Up to 10-15% travel may be required (domestic and international)

Additional Knowledge / Skills:

  • Understanding of the agile software development methodology (e.g. Scrum, Kanban)
  • Exposure to Unix/Linux
  • Strong experience with distributed cluster management; understanding of Docker/Kubernetes container orchestration technologies
  • Able to use sound judgment and prioritize
  • Must be flexible, team oriented and able to accept and create change
  • Enjoys working with numbers and data sets
  • Fast, capable learner; results-oriented, enthusiastic, creative thinker