Machine Learning Research Engineer

Nuance Labs

$120K to $150K *
Consumer Technology
Less than 5 years of experience

Job Overview by Ladders

Qualifications

  • 5-7 years of experience deploying ML models in production environments
  • Proficient in PyTorch and contemporary ML architectures
  • Experience with GPU systems and CUDA debugging
  • Proven track record in building scalable data pipelines
  • Familiarity with model compression techniques

Responsibilities

  • Collaborate with researchers to transition experimental models to production systems
  • Enhance model inference performance through various optimization techniques
  • Leverage optimization tools to accelerate diverse multimodal models
  • Design and implement large-scale data ingestion and training pipelines
  • Create frameworks for evaluating model quality and guiding improvements

Benefits

  • In-person collaboration at the Seattle HQ, fostering teamwork and innovation

Full Job Description

Responsibilities

  • Operationalize Research: Collaborate with researchers to move models from experimental checkpoints to production-ready systems. Establish patterns for large-scale training, rapid experimentation, and deployment of new architectures.


  • Optimize Model Performance: Profile and improve model inference for latency and throughput using quantization, pruning, distillation, and architectural refinements to ensure viable unit economics.


  • Model Acceleration: Apply optimization tools (TensorRT, ONNX, vLLM) to accelerate multimodal models, including video diffusion models, LLMs, and speech models.


  • Design Data Pipelines: Design and implement efficient pipelines for video data ingestion, preprocessing, and training at petabyte scale using tools like Dagster and Ray.


  • Evaluate and Iterate: Build evaluation frameworks to measure model quality, establish benchmarks, and guide continuous improvement of model capabilities.


Requirements

  • Production ML: Experience deploying ML models to production. You understand common failure modes and how to address them (resource contention, OOMs, batch optimization).
  • Deep Learning Experience: Strong knowledge of PyTorch and modern ML architectures. Experience training and optimizing large models (transformers, diffusion models, or similar).


  • Systems Proficiency: Comfortable working with GPUs, debugging CUDA issues, and profiling model workloads to identify compute or memory bottlenecks.


  • Data Engineering: Experience building scalable data pipelines for high-bandwidth media processing and training workflows.


Preferred Experience

  • Experience with video or audio models in research or production settings


  • Familiarity with low-level optimization (CUDA kernels, Triton, custom operators)


  • Knowledge of real-time ML systems and latency-critical inference


  • Prior work with model compression techniques (quantization, distillation, pruning)


Nuance Labs Key Facts

  • In-person collaboration, 5 days a week at Seattle HQ
