Applied AI Platform & DevOps Engineer
Opportunity ID
9624
Department
Practice Management
Location(s)
Austin
State
Texas
Function
GenAI
Job Description
We currently have an exciting career opportunity for an Applied AI Platform & DevOps Engineer Senior Manager to join the Strategic AI team.
CohnReznick is a hybrid firm, and most of our professionals are located within a commutable distance to one of our offices. This position is considered remote, which means it does not require job duties to be performed within proximity of a CohnReznick office location. However, as a remote employee, you may be required to be present at a CohnReznick office with scheduled notice for client work, team meetings, or trainings.
YOUR TEAM.
We are seeking an Applied AI Platform & DevOps Engineer to own the end-to-end platform, environments, and operational lifecycle for AI-enabled platforms and applications built by the Strategic AI team. This role is responsible for all environments, CI/CD pipelines, hosting, reliability, availability, security, and operational tooling, with AI embedded natively into the software development lifecycle (SDLC).
This is a hands-on engineering role. You will write production code, build and operate shared platforms, and ensure that AI applications developed by the Strategic AI team are production-ready, secure, observable, and scalable.
This role does not focus on feature delivery. It owns how AI software is built, deployed, and run.
YOUR ROLE.
Responsibilities include, but are not limited to:
Own AI-Native Environments & CI/CD
- Design, implement, and operate end-to-end CI/CD pipelines for AI-enabled platforms and applications.
- Manage versioning, promotion, and rollback of:
- Embed AI-aware testing into CI/CD:
- Implement safe deployment patterns (canary, shadow, feature flags).
Own Hosting & Runtime for Strategic AI Applications
- Own the hosting and execution environment for AI services (APIs, background jobs, agents, workflows).
- Design and operate inference orchestration patterns (managed APIs, hybrid or local models as needed).
- Implement reliability mechanisms: caching, batching, retries, fallbacks, circuit breakers.
- Partner with core infrastructure teams, while retaining ownership of the AI runtime layer.
Reliability, Availability, and Security of AI Systems
- Define and enforce SLOs/SLAs for AI platforms and applications.
- Build and maintain AI-specific observability, including:
- Own security implementation for AI systems:
- Ensure AI platforms meet enterprise security, privacy, and compliance standards.
AI-Native Software Development Lifecycle
- Embed AI capabilities directly into the SDLC:
- Define standards for how Strategic AI builds, reviews, deploys, and operates AI software.
- Continuously improve tooling and workflows to reduce manual effort and operational risk.
Platform & Tooling Development
- Build and maintain shared internal tools, libraries, and templates used by the Strategic AI team.
- Create "golden paths" for common AI patterns (e.g., RAG, agent workflows, orchestration).
- Reduce friction for Applied AI Engineers by providing reliable, well-documented infrastructure and tooling.
Future / Optional Scope (Secondary Priority)
- As capacity allows, contribute to processes or tooling that help transition experimental or business-built AI prototypes into managed Strategic AI platforms.
- This is not a primary responsibility and does not detract from core platform ownership.
YOUR EXPERIENCE.
The successful candidate will have:
- 6+ years of experience in DevOps, Platform Engineering, SRE, or Backend Engineering, with ownership of production systems.
- Strong hands-on coding experience (e.g., Python, TypeScript/Node.js, Java, or C#).
- Deep experience with CI/CD, environment management, and infrastructure automation.
- Proven experience owning production hosting, reliability, and availability for distributed systems.
- Strong understanding of authentication, authorization, and secure service integration.
- Comfort being accountable for operational outcomes in production environments.
Preferred Qualifications
- Experience operating AI-enabled applications in production (LLMs, RAG, agentic workflows).
- Familiarity with retrieval systems, embeddings, and vector databases.
- Experience implementing observability and runtime controls beyond basic infrastructure metrics.
- Experience working in regulated or security-conscious environments.
Studies have shown that we are less likely to apply to jobs unless we meet every single qualification. At CohnReznick, we are dedicated to building a diverse, equitable, and inclusive workplace, so if you're excited about this role but your experience doesn't align perfectly with every qualification in the job description, we still encourage you to apply. You may be just the right candidate for this or one of our other roles.