CLA is growing and seeking to hire an experienced Lead Machine Learning Operations Engineer to join our talented team. This role manages a team of Machine Learning Operations Engineers, oversees end-to-end machine learning strategy and execution, sets the vision for MLOps, and ensures alignment with business goals.
How you'll create opportunities in this role:
• Define and execute an enterprise AI/ML platform strategy, encompassing MLOps, LLMOps, and AIOps, and build reusable frameworks and standards adopted across multiple projects and business units.
• Oversee enterprise-scale AI platforms supporting model training, inference, evaluation, monitoring, retraining, and governance, including generative AI systems.
• Align AI and MLOps initiatives with business objectives, ensuring platforms and pipelines meet scalability, performance, security, regulatory, and cost requirements, including responsible and ethical AI considerations.
• Implement and enforce best practices for model and prompt versioning, monitoring, retraining, and automated workflows, ensuring consistent and reliable AI operations.
• Lead teams delivering shared AI infrastructure, tooling, and platforms, providing day-to-day leadership through coaching, development, and performance management.
• Ensure platform reliability and operational excellence by overseeing escalated issue resolution, maintaining high-quality documentation, and driving continuous improvement.
• Track and evaluate industry trends in AI platforms, LLM ecosystems, and AI operations, translating insights into roadmap decisions and platform evolution.
What you will need:
6 years of relevant experience required.
- Experience in MLOps, DevOps, or related fields, with a focus on enterprise-level solutions preferred.
- Supervisory experience preferred.
Education
Bachelor's degree is required. A combination of relevant experience, education, and training may be accepted in lieu of a degree.
- Degree in computer science, data science, or related field preferred.
Technical Competencies
- Advanced proficiency in Python and architectural mastery of object-oriented design across dynamically typed languages.
- Broad experience integrating and governing multi-language systems, including Python, JavaScript/TypeScript, and enterprise platforms (e.g., .NET).
- Leadership-level expertise in AI/ML platform engineering, spanning MLOps, LLMOps, and AIOps.
- Ability to define and enforce enterprise standards for AI model lifecycle management, monitoring, reliability, and cost control.
- Deep understanding of AI system observability, including drift detection, evaluation frameworks, and incident response.
- Strong experience with cloud architecture, security, compliance, and enterprise-scale deployments.
- Proven ability to guide teams in technical decision-making and platform strategy.
Wellness at CLA
To support our CLA family members, we focus on their physical, financial, social, and emotional well-being and offer comprehensive benefit options that include health, dental, vision, 401k and much more.