PRX- AI Engineer

Apexon

$120K — $160K *
Enterprise Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 5+ years of software development experience in languages like Python, C/C++, Go, or Java, with a preference for Python applications.
  • 3+ years of experience in designing and deploying production ML systems, including monitoring and model fine-tuning.
  • Practical experience with Large Language Models (LLMs), including API integration and prompt engineering.
  • Knowledge of various LLMs, both commercial and open-source, and their functionalities.
  • Strong understanding of statistics and machine learning concepts to ensure effective solutions.
  • Excellent problem-solving and communication skills with a focus on business impact.
  • Preferred experience with cloud infrastructure, particularly AWS and containerized services.

Responsibilities

  • Build agentic AI systems by designing secure, robust tool-calling agents for production environments.
  • Productionize LLMs through frameworks for evaluation, retrieval, and automated feedback.
  • Integrate agents within existing ecosystems to enhance automated diagnostics and incident management.
  • Collaborate with production engineers to create AI solutions that address operational challenges and align with business goals.
  • Ensure safety and governance in AI deployment through validation, policy checks, and continuous monitoring.
  • Optimize performance and costs through advanced prompt engineering and model management strategies.
  • Develop a RAG pipeline for maintaining knowledge quality and relevance in agentic AI solutions.

Benefits

  • Health Insurance with Dental & Vision
  • 401K Plan
  • Life Insurance, Short-Term & Long-Term Disability
  • Paid Vacations & Holidays
  • Paid Parental Leave
  • Flexible Spending Accounts for dependent and limited purpose care
Full Job Description
In this role, you will be responsible for launching and implementing GenAI agentic solutions aimed at reducing the risk and cost of managing large-scale production environments of varying complexity. You will address production runtime challenges by developing agentic AI solutions that can diagnose, reason, and take action in production environments to improve productivity and resolve production support issues.

What you'll do:
  • Build agentic AI systems: Design and implement tool-calling agents that combine retrieval, structured reasoning, and secure action execution (function calling, change orchestration, policy enforcement) following the Model Context Protocol (MCP). Engineer robust guardrails for safety, compliance, and least-privilege access.
  • Productionize LLMs: Build an evaluation framework for open-source and foundational LLMs; implement retrieval pipelines, prompt synthesis, response validation, and self-correction loops tailored to production operations.
  • Integrate with runtime ecosystems: Connect agents to observability, incident management, and deployment systems to enable automated diagnostics, runbook execution, remediation, and post-incident summarization with full traceability.
  • Collaborate directly with users: Partner with production engineers and application teams to translate production pain points into agentic AI roadmaps; define objective functions linked to reliability, risk reduction, and cost; and deliver auditable, business-aligned outcomes.
  • Safety, reliability, and governance: Build validator models, adversarial prompts, and policy checks into the stack; enforce deterministic fallbacks, circuit breakers, and rollback strategies; instrument continuous evaluations for usefulness, correctness, and risk.
  • Scale and performance: Optimize cost and latency via prompt engineering, context management, caching, model routing, and distillation; leverage batching, streaming, and parallel tool-calls to meet stringent SLOs under real-world load.
  • Build a RAG pipeline: Curate domain knowledge; build a data-quality validation framework; establish feedback loops and a milestone framework to maintain knowledge freshness.
  • Raise the bar: Drive design reviews, experiment rigor, and high-quality engineering practices; mentor peers on agent architectures, evaluation methodologies, and safe deployment patterns.
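
For candidates gauging what the guardrail work above might involve, here is a minimal Python sketch of three of the named patterns: least-privilege tool access (an allow-list), a circuit breaker, and a deterministic fallback. It is an illustration only, not Apexon's implementation; the names ToolGateway, CircuitBreaker, and restart_service, and the failure threshold of 3, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class CircuitBreaker:
    """Opens after `max_failures` consecutive failures (hypothetical threshold)."""
    max_failures: int = 3
    failures: int = 0

    @property
    def is_open(self) -> bool:
        return self.failures >= self.max_failures

    def record(self, ok: bool) -> None:
        # A success resets the count; a failure increments it.
        self.failures = 0 if ok else self.failures + 1


@dataclass
class ToolGateway:
    """Runs only allow-listed tools and returns a fixed fallback on trouble."""
    allowed: Dict[str, Callable[..., str]]
    breaker: CircuitBreaker = field(default_factory=CircuitBreaker)

    def call(self, name: str, **kwargs) -> str:
        # Least privilege: anything not explicitly allow-listed is refused.
        if name not in self.allowed:
            return f"DENIED: tool '{name}' is not allow-listed"
        # Circuit breaker: stop acting after repeated failures.
        if self.breaker.is_open:
            return "FALLBACK: circuit open, escalate to a human operator"
        try:
            result = self.allowed[name](**kwargs)
        except Exception as exc:
            self.breaker.record(ok=False)
            return f"FALLBACK: {name} failed ({exc})"
        self.breaker.record(ok=True)
        return result


def restart_service(service: str) -> str:
    # Hypothetical remediation action; a real one would call a deployment API.
    if not service:
        raise ValueError("service name required")
    return f"restarted {service}"


gateway = ToolGateway(allowed={"restart_service": restart_service})
```

In this sketch, gateway.call("restart_service", service="api") runs the tool, while a request for any tool outside the allow-list is refused without being executed. After three consecutive failures the gateway stops invoking tools and returns the same escalation message every time, which is one simple reading of a "deterministic fallback".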


ESSENTIAL SKILLS

1. 5+ years of software development in one or more languages (Python, C/C++, Go, Java); strong hands-on experience building and maintaining large-scale Python applications preferred.

2. 3+ years designing, architecting, testing, and launching production ML systems, including model deployment/serving, evaluation and monitoring, data processing pipelines, and model fine-tuning workflows.

3. Practical experience with Large Language Models (LLMs): API integration, prompt engineering, fine-tuning/adaptation, and building applications using RAG and tool-using agents (vector retrieval, function calling, secure tool execution).

4. Understanding of different LLMs, both commercial and open source, and their capabilities (e.g., OpenAI, Gemini, Llama, Qwen, Claude).

5. Solid grasp of applied statistics, core ML concepts, algorithms, and data structures to deliver efficient and reliable solutions.

6. Strong analytical problem-solving, ownership, and urgency; ability to communicate complex ideas simply and collaborate effectively across global teams with a focus on measurable business impact.

7. Preferred: Proficiency building and operating on cloud infrastructure (ideally AWS), including containerized services (ECS/EKS), serverless (Lambda), data services (S3, DynamoDB, Redshift), orchestration (Step Functions), model serving (SageMaker), and infra-as-code (Terraform/CloudFormation).

Our Perks and Benefits:

Our benefits and rewards program has been thoughtfully designed to recognize your skills and contributions, elevate your learning/upskilling experience and provide care and support for you and your loved ones. As an Apexon Associate, you get continuous skill-based development, opportunities for career advancement, and access to comprehensive health and well-being benefits and assistance.

We also offer:

  • Health Insurance with Dental & Vision
  • 401K Plan
  • Life Insurance, STD & LTD
  • Paid Vacations & Holidays
  • Paid Parental Leave
  • FSA Dependent & Limited Purpose care
