Associated Press

Machine Learning Engineer

Associated Press | $145K - $180K
Media
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • 8+ years experience building production ML inference systems.
  • Proven expertise in deep-learning inference optimization for NLP/CV models.
  • Skilled in TensorFlow and PyTorch with a focus on inference optimization.
  • Hands-on experience with AWS for optimizing inference pipelines.
  • Knowledge of video tools like FFmpeg and large-scale frame-level inference.
  • Proficient in monitoring model latency, memory, and throughput.
  • Familiar with hybrid search architectures and foundational models.

Responsibilities

  • Design and scale ML inference systems for large data volumes.
  • Optimize and productionize advanced ML models and pipelines.
  • Support hybrid search architectures and partner with engineers.
  • Build scalable data processing pipelines for diverse media.
  • Collaborate with MLOps for deploying reliable ML systems.
  • Optimize inference latency across distributed cloud resources.
  • Create resilient asynchronous processing systems for workloads.

Benefits

  • Competitive medical, dental, and vision coverage.
  • Retirement benefits.
  • Company paid life insurance.
  • Paid vacation and sick days.
  • Paid parental leave for new parents.
  • Mental well-being resources.

Full Job Description

The ML Engineer is a new role within the AP Engineering organization, responsible for shaping how we build and scale machine learning systems at AP, helping to lay the foundation for our machine learning capabilities. The ML Engineer has hands-on experience building and optimizing ML inference systems that run in production environments. This role will develop and tune pipelines that transform millions of photos, videos, and text documents into searchable representations using a combination of deep learning models (e.g., DistilBERT, SBERT, TransNetV2) and external multimodal APIs. The ideal candidate has experience optimizing inference at scale, orchestrating ML workloads, and working with both PyTorch and TensorFlow in a cloud environment, focusing on model performance, integration patterns, and inference efficiency.

This is an individual contributor role that reports directly to our Director of Development, Enterprise Application Services.

What you will do:

  • Design, build, and scale ML-powered inference systems that process large volumes of text, image, and video data to power news-based intelligence products.

  • Productionize and optimize state-of-the-art models and inference pipelines. These models include, but are not limited to:

    • DistilBERT for Named Entity Recognition (NER) over hundreds of thousands of search queries per day

    • TransNetV2 for video shot boundary detection at scale, for both archival and real-time video

    • SBERT for embedding generation from textual descriptions

    • External multimodal APIs for image/video captioning

  • Support hybrid search architectures by defining embedding/re-ranking interfaces, evaluation metrics, and inference performance requirements; partner with search/platform engineers on index configuration, sharding, and cluster tuning.

  • Design and implement scalable data processing pipelines across hybrid CPU/GPU environments to handle millions of media assets.

  • Partner with MLOps and platform engineering to enable the deployment and operation of ML systems reliably, contributing to:

    • Distributed inference architectures

    • Cloud-based execution (e.g., AWS EC2, Batch, Lambda, SageMaker)

    • Efficient resource utilization across workloads

  • Optimize inference latency and throughput across distributed workloads using cloud-based resources (e.g., AWS EC2, Batch, Lambda, SageMaker).


  • Build resilient asynchronous processing systems for large-scale workloads, ensuring:

    • Reliability (retries, fault tolerance)

    • Efficiency (caching, deduplication)

    • Observability (metrics, logging, traceability)

  • Work closely with data scientists and product teams to iterate on models, improve performance, and deliver measurable impact in production.


Who you are:

  • 8+ years of experience building production ML inference systems.

  • Demonstrated ownership of deep-learning inference optimization in production (quantization, distillation, compilation, kernel/profile-level performance work) for transformer NLP and/or CV models.

  • Experience with both TensorFlow (SavedModel, tf.data, XLA, TFLite) and PyTorch (TorchScript, ONNX, FastAPI/TorchServe).

  • Hands-on experience optimizing inference pipelines on AWS infrastructure, ideally across different types of media assets.

  • Experience with video frameworks/tools (e.g., FFmpeg), and working with large-scale frame-level inference.

  • Demonstrated experience monitoring and debugging model latency, memory, and pipeline throughput.

  • Experience with hybrid search architectures (BM25 + vector search + cross-encoder reranking).

  • Familiarity with OpenAI APIs or other foundation model providers.

  • Familiarity with open source HuggingFace LLMs.

  • Experience with data pipeline and workflow orchestration tools (e.g., Airflow).


Who This Role is Not For:

Candidates whose primary background is MLOps platform work (e.g., DAG orchestration, Terraform, Kubernetes administration, generic CI/CD pipelines) will not be a fit. We are looking for a senior-level engineer who has profiled a transformer, rewritten its serving path for a 2-3x latency reduction, tuned an HNSW index, and can tell us which SageMaker instance type will hit our p95 target at the lowest cost.

Salary & Benefits:

The anticipated salary range for this position is $145,000 - $180,000, depending on a candidate's skills, qualifications and location. The Associated Press offers comprehensive benefits, which include:

  • Competitive medical, dental and vision coverage

  • Retirement benefits

  • Company paid life insurance

  • Paid vacation and sick days

  • Paid parental leave for any new parent

  • Mental well-being resources

About Associated Press

The Associated Press (AP) is a nonprofit news organization that provides news coverage to media outlets around the world. The company was founded in 1846 and has a long history of providing accurate and unbiased news coverage. The AP's journalists are located in more than 100 countries and cover a wide range of topics, including politics, business, sports, and entertainment. The company is committed to the highest standards of journalism and has won numerous awards for its reporting. The Associated Press is headquartered in New York City.

Size: 3,700 employees
Founded: 1846
