Data Engineer - Classical Statistics & Machine Learning

BLN24

$90K — $130K *
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Bachelor's in Data Science, Statistics, Computer Science, Engineering, or related field.
  • 3-5 years of combined experience in data engineering and data science/statistical analysis.
  • Proficient in Python with experience in data engineering and statistical/ML libraries.
  • Hands-on experience in building and maintaining ETL/ELT pipelines.
  • Strong knowledge of classical statistical methods and machine learning techniques.
  • Experience with SQL and both relational and distributed data systems.
  • Familiar with federal data environments and governance requirements.

Responsibilities

  • Design, build, and maintain ETL/ELT pipelines for data ingestion.
  • Develop data workflows for batch and near-real-time sources.
  • Implement data validation and transformation logic for quality assurance.
  • Optimize data pipeline performance in a modern lakehouse architecture.
  • Build and maintain data models that support analytics and reporting.
  • Monitor pipeline health and troubleshoot data quality issues.
  • Document data lineage and pipeline architecture for governance.

Benefits

  • Comprehensive medical, dental, and vision plans.
  • Opportunity to work with a rapidly growing company in the DC Metro area.
  • Remote work flexibility, depending on project needs, to support work-life balance.
Full Job Description
Job Title: Data Engineer

Company: BLN24

Position Overview:
BLN24 is seeking a mid-level Data Engineer to support a large-scale data and analytics platform modernization effort for a federal statistical agency client. This is a hybrid role that combines data engineering (building and maintaining the pipelines that bring data into the platform) with applied data science (using classical statistics and machine learning to analyze that data once it's available).

The ideal candidate is as comfortable writing production-grade ingestion and transformation code as they are designing and validating a statistical or ML model.
This role works closely with SMEs across multiple program areas to understand source data, build reliable ETL/ingestion pipelines, and apply analytical methods - anomaly detection, statistical modeling, and machine learning - to support operational decision-making.

Key Responsibilities:

Data Engineering
  • Design, build, and maintain ETL/ELT pipelines to ingest data from multiple source systems into the platform's central data store
  • Develop and maintain data ingestion workflows for both batch and near-real-time sources
  • Implement data validation, cleaning, and transformation logic to ensure data quality and consistency across pipelines
  • Work within a modern lakehouse/cloud data architecture, optimizing pipeline performance and reliability
  • Build and maintain data models and schemas that support downstream analytics and reporting needs
  • Monitor pipeline health, troubleshoot failures, and implement logging/alerting for data quality issues
  • Document data lineage, transformation logic, and pipeline architecture for governance and reproducibility


Data Science / Statistics & ML
  • Apply classical statistical methods (hypothesis testing, regression, time-series analysis, distributional comparisons) to identify trends, anomalies, and outliers in operational data
  • Design and implement benchmarking approaches that compare production data against historical, modeled, or external reference values
  • Develop and evaluate machine learning models where appropriate, balancing predictive performance with interpretability for non-technical stakeholders
  • Investigate flagged anomalies by digging into underlying data to identify root causes and contributing factors
  • Work with SMEs to translate operational questions into analytical approaches, and clearly communicate statistical/ML findings and their limitations
  • Account for data sensitivity classifications and governance requirements when designing analyses and models
  • Collaborate with visualization-focused team members to ensure outputs of statistical/ML work are presented clearly to stakeholders


Required Qualifications:
  • Bachelor's degree in Data Science, Statistics, Computer Science, Engineering, or related field (or equivalent experience)
  • 3-5 years of experience spanning both data engineering and data science/statistical analysis
  • Strong proficiency in Python, including experience with data engineering libraries (e.g., pandas, PySpark) and statistical/ML libraries (e.g., scikit-learn, statsmodels)
  • Hands-on experience building and maintaining ETL/ELT pipelines, including ingestion, transformation, and validation logic
  • Solid grounding in classical statistical methods (hypothesis testing, regression, distributional analysis) and practical machine learning techniques
  • Experience working with SQL and relational/distributed data systems
  • Ability to work within a federal data environment, including familiarity with data sensitivity tiers and access/disclosure constraints
  • Strong communication skills, with the ability to explain technical/statistical concepts to non-technical stakeholders

Preferred Qualifications:
  • Prior experience supporting federal statistical agencies or other federal data programs
  • Familiarity with Databricks or modern lakehouse architectures (Spark, Delta Lake, etc.)
  • Experience with workflow orchestration tools (e.g., Airflow, Databricks Workflows)
  • Experience designing anomaly-detection or outlier-detection approaches beyond standard threshold-based methods
  • Exposure to disclosure avoidance concepts or working with regulated/protected government data
  • Experience working across multiple coding environments (Python, R, SAS) within the same analytics platform
  • Background in requirements gathering or systems design for enterprise data platforms


Work Environment:
  • Contract position supporting a federal agency data modernization engagement
  • Collaborative, cross-functional environment working alongside data engineers, data scientists, architects, and program SMEs
  • Requires U.S. citizenship and ability to obtain a public trust or other clearance/suitability determination typical of federal contractor engagements


What BLN24 brings to the Game:
BLN24 benefits are game-changing. We like our team to play hard, and that means they need to be taken care of - physically, financially, and emotionally. We make sure to keep them in the game by giving them access to generous medical, dental, and vision plans.
  • You can join one of the fastest-growing companies headquartered in the Washington, DC Metro Area. We give you the opportunity to work in different sectors, so you get variety while maintaining stability.
  • Flexibility at BLN24 allows each individual the opportunity to balance quality work and their personal lives. Depending on projects, we allow remote working opportunities so you can always be in the game no matter where you call home.
