Ampcus inc

MLOps Platform Engineer (SageMaker)

$120K - $150K *
Plano, TX 75025
In-Person
Enterprise Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • 10-15 years of software engineering experience in cloud infrastructure or ML platform operations
  • 5 years of hands-on experience with AWS, particularly Amazon SageMaker
  • 3 years of experience in building and operating production MLOps pipelines
  • Proficiency in Infrastructure-as-Code tools like Terraform or CDK
  • Experience with IAM design specifically for ML platforms
  • Familiarity with MLflow or equivalent tracking systems
  • Knowledge of workflow orchestration tools, such as SageMaker Pipelines or Airflow
  • Experience with Snowflake as a data source for ML pipelines

Responsibilities

  • Set up and configure the SageMaker Unified Studio platform across multiple environments.
  • Build and maintain MLOps pipelines using SageMaker, including data handling and model training.
  • Manage the SageMaker Model Registry for model versioning and lineage tracking.
  • Implement MLflow for experiment tracking across models and parameters.
  • Establish identity and access management for secure and efficient platform access.
  • Develop model serving capabilities, including real-time endpoints and batch predictions.
  • Set up monitoring for model performance and data drift to ensure accuracy and reliability.

Benefits

  • Collaborative working environment with a strong focus on innovation
  • Opportunities for professional development and training
  • Access to cutting-edge cloud technologies
  • Flexible working arrangements to support work-life balance
  • Participation in team-building and company events

Full Job Description

Job Title: MLOps Platform Engineer (SageMaker)

Location(s): Onsite

Job Summary:

This position is with the Enterprise Analytical Data & Integration Team. The ideal candidate will have extensive experience in cloud infrastructure or ML platform operations, with a specific focus on AWS and Amazon SageMaker. The role involves designing, building, and operationalizing an enterprise ML platform on AWS SageMaker Unified Studio.

Key Responsibilities:
  • Set up SageMaker Unified Studio platform - domain configuration, project provisioning, persona-based roles, and multi-environment promotion workflows.
  • Build MLOps pipelines using SageMaker Pipelines - data extraction from Snowflake, preprocessing, training, evaluation, and model registration.
  • Manage SageMaker Model Registry - cross-account model promotion, versioning, immutability, and lineage tracking.
  • Configure MLflow experiment tracking - auto-logging of parameters, metrics, and artifacts.
  • Set up identity and access management - Okta SSO, SailPoint entitlements, persona-based execution roles, service roles for pipelines.
  • Build model serving - real-time SageMaker endpoints and batch prediction workflows.
  • Set up model monitoring - data drift, model drift, performance degradation detection.
  • Configure data catalog - searchable datasets, access-level visibility, access-request workflows, lineage.
  • Own platform operations - observability (CloudWatch, Datadog), logging, custom images, instance availability.

Required Qualifications:
  • 10-15 years of software engineering experience focused on cloud infrastructure or ML platform operations.
  • 5 years hands-on with AWS, including deep expertise in Amazon SageMaker (Studio, Pipelines, Model Registry, Endpoints, Feature Store).
  • 3 years building and operating production MLOps pipelines - training, versioning, deployment, monitoring, rollback.
  • Experience with SageMaker Unified Studio or Studio Classic - domain/project setup, blueprints, multi-tenant configuration.
  • Infrastructure-as-Code with Terraform, CDK, or CloudFormation.
  • IAM design for ML platforms - execution roles, service roles, cross-account access, Lake Formation, SSO/SAML.
  • MLflow or equivalent experiment tracking.
  • SageMaker Pipelines or similar workflow orchestration (Airflow, Step Functions).
  • Model serving - real-time endpoints, batch transform, auto-scaling, endpoint monitoring.
  • Snowflake as a data source for ML pipelines.
  • Kubernetes (EKS) and container orchestration.
  • Networking and security - VPC, security groups, private endpoints, cross-account connectivity.
