Senior Data Scientist

Ambyint

$100K — $130K *
Energy & Utilities
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Strong academic or applied background in Statistics, Mathematics, Engineering, Physics, Operations Research, Econometrics, or another quantitative field.
  • Solid understanding of machine learning fundamentals and strong Python skills (e.g., pandas, NumPy, scikit-learn).
  • Experience with messy real-world datasets, particularly time-series and sensor data.
  • Comfortable working with SQL to extract and manipulate complex data.
  • Proven ability to evaluate models beyond simple accuracy metrics (e.g., residual analysis, cross-validation).
  • Strong communication skills to translate complex statistical outputs for various stakeholders.
  • Willingness to learn advanced modeling approaches like Bayesian workflows.

Responsibilities

  • Develop Bayesian optimization workflows for artificial lift systems.
  • Build probabilistic surrogate models to estimate production response and risk.
  • Design constrained optimization policies considering operational limits.
  • Model intervention-response data with uncertainty-aware metrics.
  • Develop contextual models that factor in real-time operational data.
  • Evaluate model performance and predictive uncertainty across different scenarios.
  • Collaborate with SMEs to translate model outputs into practical recommendations.

Benefits

  • Opportunities for continuous learning and skill development.
  • Engagement in innovative and impactful industrial optimization projects.
  • Collaboration with field operators and engineers for real-world applications.
  • Support for advancing statistical methodologies in practical settings.

Full Job Description

As a Data Scientist in this role, you will support our industrial optimization and decision-modeling work, with an initial focus on artificial lift optimization in oil and gas production. This role is not just about building predictive models; the core challenge is estimating the impact of operating changes from sparse, noisy, and sometimes confounded field data, then using that information to support safe and trustworthy recommendations.

We prioritize candidates with strong mathematical and statistical foundations. We are looking for an applied data scientist who can comfortably reason from first principles about uncertainty, probability, model assumptions, optimization, causal ambiguity, and data quality. The ideal candidate should be able to understand not only how to fit a model, but also why it works, when it fails, and whether its outputs are trustworthy enough to support critical field decisions.

WHAT YOU'LL DO:

  • Develop Bayesian optimization workflows for artificial lift optimization across gas lift, plunger lift, rod lift, and hybrid lift systems.
  • Build probabilistic surrogate models that estimate production response, uncertainty, and risk from sparse and noisy field data.
  • Design constrained optimization policies that account for operational limits, safety constraints, trust regions, and field-approved action ranges.
  • Model intervention-response data using before/after windows, event quality flags, counterfactual baselines, and uncertainty-aware targets.
  • Develop contextual models that condition recommendations on current well state, lift regime, production trends, pressure behavior, and plunger-cycle performance.
  • Evaluate model calibration, predictive uncertainty, out-of-sample generalization, and decision quality across wells and operating regimes.
  • Help design field experiments and sequential learning workflows that balance exploration, exploitation, and operational risk.
  • Build diagnostics for model performance, uncertainty calibration, coverage, residuals by well, response heterogeneity, and support distance.
  • Collaborate with SMEs and operators to translate model outputs into practical recommendations, risk flags, and decision explanations.


QUALIFICATIONS:

  • Strong academic or applied background in Statistics, Mathematics, Engineering, Physics, Operations Research, Econometrics, or another highly quantitative field.
  • Solid understanding of machine learning fundamentals and strong Python programming skills (pandas, NumPy, scikit-learn).
  • Experience working with messy real-world datasets, especially time-series, sensor, operational, or event-based data.
  • Comfort working with SQL or structured data sources to extract and manipulate complex data.
  • Proven experience evaluating models beyond simple accuracy metrics, including residual analysis, cross-validation, subgroup performance, calibration, and error analysis.
  • Ability to reason from first principles about assumptions, noise, uncertainty, bias, and model failure modes.
  • Strong communication skills with the ability to translate complex statistical outputs into practical concepts for both technical and non-technical stakeholders.
  • Willingness and ability to learn new advanced modeling approaches (such as the BoTorch/GPyTorch stack and Bayesian workflows).


WHAT SETS YOU APART

  • The gratification of a job well done comes from the satisfaction of your 'customers': in this case, field operators and engineers who trust your models.
  • You don't just import model APIs; you have a deep curiosity for why a model works, when it fails, and how to prove it's operationally safe.
  • You possess a strong sense of uncertainty awareness and pragmatic judgment around whether a model output is genuinely useful in a physical environment.
  • Continuous learning and improvement are part of your mantra; you are excited to bridge the gap between advanced statistics and real-world industrial machinery.
  • You are curious, creative, biased for action, and love solving problems where data is messy and answers aren't obvious.
  • You have a background in, or familiarity with, time-series forecasting, anomaly detection, causal inference, or estimating the impact of operational interventions.

