AI DevOps Engineer

JSR Tech Consulting

$120K — $160K *
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 5+ years of DevOps, platform engineering, or SRE experience with ownership of production infrastructure.
  • Proficiency with container orchestration (Kubernetes) and infrastructure-as-code (Terraform, Pulumi, or CDK).
  • Experience integrating third-party SaaS APIs into enterprise engineering pipelines.
  • Strong observability fundamentals with tools like Datadog or Grafana.
  • Demonstrated experience designing sandboxed execution environments for automated workloads.
  • Solid security hygiene skills, including secret management and audit logging.
  • Direct experience with prompt engineering and LLM-based developer tools.

Responsibilities

  • Design and operate infrastructure for the AI Software Development platform.
  • Build and maintain sandboxed execution environments for AI agents.
  • Own CI/CD integration for AI tooling within existing engineering pipelines.
  • Instrument the platform for full observability, with metrics such as latency and token consumption.
  • Manage model API integrations, including governance and cost attribution.
  • Collaborate with InfoSec on security reviews for new platform capabilities.
  • Evaluate emerging infrastructure patterns for evolving AI workloads.

Benefits

  • Collaborative team environment with a focus on innovation.
  • Opportunities for professional development and skill enhancement.
  • Engagement with cutting-edge AI technologies and tools.

Full Job Description

As the DevOps Engineer for the AI Platform, you build and operate the infrastructure layer that makes AI tooling reliable at scale and that provides the safe, instrumented execution environment AI agents require to operate in production. You collaborate closely with the InfoSec Engineer, working through the security review and approval process for every new capability introduced to the platform, and you ensure the infrastructure is observable, resilient, and ready to evolve as the platform grows.

KEY RESPONSIBILITIES

  • Design and operate the infrastructure layer for the AI Software Development platform - compute environments, API gateway configurations, token usage governance, and service reliability.
  • Build and maintain sandboxed execution environments for AI agents - isolating agent workloads and enabling safe rollback on task failure.
  • Own CI/CD integration for AI tooling - ensuring IDE assistants, automated review tools, and agent triggers are wired into existing engineering pipelines with minimal friction.
  • Instrument the platform for full observability: latency, token consumption, task throughput, error rates, cost-per-task, and agent utilization metrics.
  • Manage model API integrations - OpenAI, Anthropic, GitHub, and others - including rate limit governance, failover logic, and cost attribution by team and use case.
  • Collaborate with the InfoSec Engineer throughout the security review and approval process for all new platform capabilities, ensuring each integration meets policy requirements before deployment.
  • Evaluate and adopt emerging infrastructure patterns for AI workloads as the platform evolves.

REQUIRED QUALIFICATIONS

  • 5+ years of DevOps, platform engineering, or SRE experience with ownership of production infrastructure.
  • Proficiency with container orchestration (Kubernetes) and infrastructure-as-code (Terraform, Pulumi, or CDK).
  • Experience integrating third-party SaaS APIs into enterprise engineering pipelines - auth, rate limiting, cost governance.
  • Strong observability fundamentals - experience with Datadog, Grafana, OpenTelemetry, or equivalent tooling.
  • Demonstrated experience designing sandboxed or isolated execution environments for automated workloads.
  • Solid security hygiene: secret management, least-privilege IAM, network egress control, and audit logging.
  • Direct experience with prompt engineering and LLM-based developer tools, and practical familiarity with how they are deployed and operated.
  • Familiarity with AI capability benchmarks - including SWE-bench, METR research, and similar frameworks - sufficient to inform infrastructure planning decisions.

NICE TO HAVE

  • Prior experience building infrastructure for LLM-based applications or AI agent workloads.
  • Familiarity with vector databases (Pinecone, Weaviate, pgvector) and embedding pipeline operations.
  • Experience with GPU-backed compute provisioning for on-premises or hybrid inference workloads.
  • Cost attribution and FinOps experience in multi-team API consumption environments.
  • Background in developer experience tooling infrastructure - telemetry pipelines, IDE plugin distribution, or SCM integrations.
