Perficient

Lead SageMaker Platform Engineer

Perficient: $73K to $170K
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Expert-level AWS operational experience with debugging via logs and telemetry.
  • Deep IAM and permissions expertise in a multi-account AWS setup.
  • Hands-on experience with SageMaker, particularly in managing pipelines.
  • Experience with multi-account AWS workflows aligned to an SDLC.
  • Ability to work in a collaborative, hands-on environment under pressure.
  • Strong communication skills for articulating issues and solutions.

Responsibilities

  • Pair with data scientists to debug and fix SageMaker pipelines.
  • Triage failures using AWS logs and telemetry to find root causes.
  • Resolve permissions issues across execution roles and CI/CD processes.
  • Debug cross-account model artifact syncing and validation flows.
  • Enhance team understanding of platform dynamics and troubleshooting.

Benefits

  • Flexible working hours.
  • Opportunities for career progression.
  • Access to professional development resources.
  • Collaborative and innovative team culture.
  • Generous leave policy.
Full Job Description

Job Overview:

Our team is undergoing a large data and ML migration onto AWS SageMaker Pipelines. We deploy via Terraform and GitHub Actions across multiple AWS accounts aligned to our SDLC, sync model artifacts to a shared-services account, and validate models in dedicated testing accounts. Data is sourced primarily from Redshift, including through trusted identity propagation.

We're standing these pipelines up for the first time, and we need an expert who can help us debug and ship them to production quickly and reliably.

Responsibilities

  • Pair with our data scientists in live debugging sessions to diagnose and fix broken SageMaker pipelines and get them through the SDLC to prod.
  • Rapidly triage failures using AWS logs and telemetry (CloudWatch, CloudTrail, SageMaker pipeline/execution logs, etc.) and pinpoint root causes.
  • Untangle permissions issues across pipeline execution roles, cross-account access, and CI/CD identity (GitHub Actions OIDC, Terraform-managed IAM).
  • Help debug cross-account model artifact syncing (shared services) and the testing-account validation flow.
  • Level up the team's mental model for how the platform works and where to look when things break.


Qualifications

  • Expert-level AWS operational experience, especially debugging via logs and telemetry (CloudWatch Logs/Metrics, CloudTrail, X-Ray or equivalent) - can move from a vague failure to a root cause fast.
  • Deep IAM / permissions expertise in a multi-account setup: execution roles, assume-role/cross-account access, resource policies, KMS/encryption permissions, and reasoning about "who is allowed to do what, as which principal."
  • Hands-on SageMaker experience, including SageMaker Studio and SageMaker Pipelines - knows how pipelines are defined, deployed, and executed, and where to look when a step fails. (Operating/debugging, not modeling.)
  • Multi-account AWS experience aligned to an SDLC (dev/test/prod), including cross-account resource sharing and promotion patterns.
  • Comfortable working embedded and hands-on: live pairing, screen-sharing, and debugging under time pressure.
  • Strong communicator who can explain why something broke and how to avoid it next time.


Nice to Haves:

  • Terraform experience, especially managing IAM and SageMaker/data infrastructure as code.
  • GitHub Actions CI/CD experience, particularly OIDC-based authentication to AWS (no long-lived keys) and the IAM trust policies behind it.
  • Experience with Amazon Redshift, and ideally trusted identity propagation / IAM Identity Center integration.
  • Some ML/MLOps background - enough to speak the language of model training, artifacts, and deployment (helpful, not required).
  • AWS certifications (e.g., Solutions Architect Pro, DevOps Engineer Pro, ML Specialty) as a signal, though hands-on evidence matters more.


The salary range for this position takes into consideration a variety of factors, including but not limited to skill sets, level of experience, applicable office location, training, licensure and certifications, and other business and organizational needs. The new hire salary range displays the minimum and maximum salary targets for this position across all US locations, and the range has not been adjusted for any specific state differentials. It is not typical for a candidate to be hired at or near the top of the range for their role, and compensation decisions are dependent on the unique facts and circumstances regarding each candidate. A reasonable estimate of the current salary range for this position is $73,008 to $170,640. Please note that the salary range posted reflects the base salary only and does not include benefits or any potential variable compensation programs. Information regarding the benefits available for this position is in our benefits overview.

Disclaimer: The above statements are not intended to be a complete statement of job content, but rather to act as a guide to the essential functions performed by the employee assigned to this classification. Management retains the discretion to add or change the duties of the position at any time.


About Perficient

Perficient is a leading digital consultancy that helps companies transform their businesses and operations through technology. They deliver solutions to clients that range from Fortune 500 companies to emerging businesses. Perficient has a broad range of capabilities, including strategy, design, technology, and operations. They have expertise in a variety of industries, including healthcare, financial services, retail, and energy. Perficient has been recognized as a top employer and a top company for women technologists. They are committed to giving back to their communities through philanthropy and volunteerism.
Size: 6,079 employees
Market Cap: $2.4 billion
Net Income: $30.1 million
Founded: 1998
5 Year Trend: +9.3%
Revenue: $612.1 million
Exchange: NASDAQ