Data Engineer, Baseball Operations

MLB (Major League Baseball)

Bronx, NY

Industry: Hospitality & Recreation

Less than 5 years

Posted 34 days ago

Description:

The New York Yankees Baseball Operations department is accepting applications for an experienced data engineer with a focus on data quality analysis. This position reports to our senior Baseball Operations executives and will assist in the development and maintenance of our data processing pipelines.

Primary Responsibilities:

  • Prepare, clean, and format analytical datasets for processing by data scientists
  • Become an expert in our datasets and their strengths and weaknesses, and write code to pull and verify data in response to data scientist requests
  • Using R, visualize complex, multi-source data to pinpoint data quality issues
  • Build automated pipelines for processing and cleaning data
  • Conduct database feature engineering to support ongoing quantitative research
  • Work with developers to create and deploy systems for anomaly detection
  • Interface with data scientists, software developers, and other baseball operations staff as needed
  • Design department-wide principles and workflow for data quality management
  • Serve as the main point-of-contact for questions about data structures, definitions, and quality

Qualifications and Experience:

  • Bachelor’s degree in Computer Science or related field
  • 3+ years of experience developing in SQL (preferably T-SQL)
  • 2+ years of experience with data profiling, data modeling, and data pipeline development
  • 2+ years of experience developing in R (or a similar statistical programming language), including experience with data manipulation and visualization in that language
  • Ability to write succinct code with optimal performance and simplicity
  • Excellent communication and problem-solving skills – must be able to break down a complex task and put together an execution strategy with little guidance
  • An understanding of typical baseball data structures, basic and advanced baseball metrics, and knowledge of current baseball research areas