CarParts.com

Staff AI Platform Engineer

CarParts.com
$120K - $160K *
Information Technology
8 - 10 years of experience
Job Overview by Ladders

Qualifications

  • 10+ years of experience in DevOps, SRE, or platform engineering in AWS environments.
  • Expert in AWS services including EKS, EC2, SQS, and CloudWatch.
  • Strong Kubernetes cluster operations experience with skills in workload isolation and auto-scaling.
  • Familiarity with enterprise CDN tools like Akamai for traffic management.
  • Hands-on experience with CI/CD pipelines using GitHub Actions and Jenkins.
  • Production experience in developing or operating AI agents and integrating LLMs.
  • Proficient in Node.js and/or Python for developing automation tools and MCP servers.
  • Experience in observability stack management including Elastic/Kibana for monitoring and alerting.

Responsibilities

  • Oversee the operational management of AWS multi-account infrastructure including EKS and EC2.
  • Manage Kubernetes clusters and optimize containerization across environments.
  • Design and implement CI/CD processes focused on micro-frontend applications.
  • Configure and maintain Akamai CDN for effective traffic management and performance optimization.
  • Extend AI capabilities by managing and enhancing the OpsWhisperer platform.
  • Deploy and enhance autonomous agents for monitoring and operational tasks.
  • Lead incident response activities with a focus on proactive, AI-assisted solutions.

Benefits

  • Opportunity to work with pioneering AI technologies and autonomous systems.
  • Collaboration with engineering leadership in a mature platform environment.
  • On-site role to foster team synergy and direct communication.
  • Access to advanced tools and a fully containerized AWS estate for impactful work.

Full Job Description

THE OPPORTUNITY

One exceptional engineer. AI as the team.

This is not a standard DevOps posting. We are looking for one unusually capable, AI-native engineer to own our entire platform engineering and SRE function - using autonomous agents, LLM-powered pipelines, and MCP-based tooling as force multipliers to do the work of a team, on-site, in close partnership with our engineering leadership.

You will inherit a mature, fully containerized AWS estate (9 EKS clusters, 27 accounts, 228 Kubernetes nodes); an Akamai CDN layer managing live traffic splits; GitHub Actions and Jenkins CI/CD pipelines for a Webpack 5 micro-frontend monorepo; and an operational AI agent platform, OpsWhisperer, already in production and monitoring 25 AWS accounts with a 91% autonomous resolution rate.

Your job is to extend all of it, automate what remains manual, and be the person who makes every deployment, incident, and infrastructure change happen with speed, precision, and intelligence.

SCOPE OF OWNERSHIP

What you'll own

AWS Multi-Account Infrastructure
  • EKS clusters across dedicated AWS accounts
  • EC2 worker nodes via Auto Scaling Groups
  • SQS pipelines
  • AWS Bedrock (Claude) for AI agent workloads

Kubernetes & Containerization
  • EKS clusters
  • Node group management
  • Kops clusters alongside EKS
  • Multiple environment tiers with full blast-radius isolation

CI/CD & Release Management
  • Multiple repositories
  • GitHub Actions workflows + Jenkins pipeline management
  • Turbo build system across multiple micro-frontend packages
  • Canary release gating and rollback automation

CDN & Traffic Management
  • Akamai Property Manager config
  • Phased Release Cloudlet for Canary and Production split
  • Security, Throttling and Monitoring
  • Jenkins-driven cache invalidation


Observability & Incident Response
  • Elastic/Kibana
  • CloudWatch across all AWS accounts
  • Business performance monitoring
  • SQS backlog + pipeline health alerting
  • On-call ownership with proactive, AI-assisted triage


NON-NEGOTIABLE

The AI-native expectation

This is a role where AI fluency is not a bonus - it is how you do the job. We expect you to build, operate, and improve autonomous agents that handle monitoring, alerting, triage, and routine operational work. You are not just a consumer of AI tools; you are the person who builds them, deploys them into production, and iterates on them based on real operational data.

You will extend OpsWhisperer (the AI platform and observability agent), contribute to the Axle platform, build MCP servers that give agents new capabilities, and apply LLM-powered reasoning to infrastructure problems that previously required multiple humans. If you've never built an agent that runs in production unsupervised, this is not the right role.

WHAT YOU'LL INHERIT & EXTEND

The tech stack

Category | Technologies
Cloud & Orchestration | AWS EKS • Kubernetes • Kops • AWS Organizations • Auto Scaling Groups • AWS SQS • AWS Bedrock • CloudWatch
CDN & Networking | Akamai Property Manager • Phased Release Cloudlet • Fast Purge • Content Protector
CI/CD & Frontend | GitHub Actions • Jenkins • Turbo (monorepo) • Webpack 5 Module Federation • Canary / Blue-Green Deployments
AI & Agentic | MCP (Model Context Protocol) • Claude API / AWS Bedrock • Azure Bot Service • Microsoft Entra ID • Operational AI Agents
Observability & Data | Elastic / Kibana • BlueTriangle • Databricks • Cloudinary • New Relic
Languages | Node.js / TypeScript • Python • Bash / Shell • SQL • PowerShell

REQUIREMENTS

What we're looking for

  • 10+ years of hands-on DevOps, SRE, or platform engineering experience in production AWS cloud environments
  • Deep AWS expertise: EKS, EC2, SQS, CloudWatch, IAM, Organizations, and multi-account architectures
  • Strong Kubernetes skills: cluster operations, node group management, workload isolation, taints/tolerations, auto-scaling
  • Experience with Akamai or equivalent enterprise CDN - configuration, purge operations, traffic routing rules
  • CI/CD ownership: GitHub Actions and/or Jenkins pipeline design, monorepo build systems, release gating
  • Production experience building or operating AI agents - LLM integration, autonomous workflow design, prompt engineering
  • Proficiency in Node.js and/or Python for automation, tooling, and MCP server development
  • Observability stack ownership: Elastic/Kibana, log analysis, alerting design, SLO/SLI instrumentation
  • Comfortable owning on-call responsibility for a production e-commerce platform with significant revenue exposure
  • Strong written and verbal communication - will interface with engineering leadership and present findings to executives
  • Based in or willing to relocate to the Los Angeles / Long Beach area for on-site work

About CarParts.com

CarParts.com is an online provider of aftermarket auto parts and accessories. The company was founded in 1999 and is headquartered in Carson, California. CarParts.com offers a wide selection of products from over 100 manufacturers, including replacement parts, performance parts, and accessories for cars, trucks, and SUVs. The company's website features a user-friendly interface that allows customers to easily search for and purchase the parts they need. CarParts.com also offers free shipping on orders over $50 and a 90-day return policy.
Size: 1,529 employees
Market Cap: $319.5 million
Net Income: -$1.5 million
5 Year Trend: +13.9%
Revenue: $443.8 million
NASDAQ
