Staff Engineer - Agentic AI

Clera

$130K — $180K *
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 7+ years in software engineering with 2+ years focused on LLM-based agents.
  • Deep experience designing LLM application architectures, including model selection and context management.
  • Proven ability to build evaluation and benchmarking frameworks for task completion and cost efficiency.
  • Technical leadership experience directing small engineering teams (3-6 engineers).
  • Strong Python skills and familiarity with LLM tooling, including function calling and evaluation frameworks.
  • Experience in desktop automation or programmatic control of applications.

Responsibilities

  • Lead the development of core agent intelligence for multi-step workflows across desktop tools.
  • Define agent capabilities from user stories and oversee their implementation.
  • Drive improvements in agent task success rates through evaluation frameworks and iteration.
  • Enforce budget constraints per task and track cost per completed workflow.
  • Establish a robust evaluation infrastructure grounded in validated user stories.
  • Conduct user story mapping and validation with domain experts to refine product direction.
  • Translate verified user stories into testable evaluations for benchmarking.

Benefits

  • Opportunity to work at an early-stage startup with a direct impact on Fortune 100 customers.
  • Collaborative environment with cross-functional teams including AI engineers and user researchers.
  • High-level visibility with direct reporting to the CTO.
  • Involvement in cutting-edge technology at the intersection of AI and desktop engineering tools.

Full Job Description

About the Role

We're hiring a senior technical leader to own the core agent intelligence that turns engineers' intent into reliable, cost-efficient multi-step workflows across desktop engineering tools. This role sits at the intersection of applied agentic AI, user research, and product delivery and will determine the product's real-world value to enterprise customers.

You'll report to the CTO and serve as technical lead for a small team of AI engineers, a user researcher, and domain expert contractors in an early-stage, high-impact environment (Series A, Fortune 100 customers, direct line to leadership).

What You'll Do

  • Lead development of the core agent intelligence layer that executes multi-step workflows across complex desktop engineering software.
  • Own the full product loop: define agent capabilities from user stories, build implementations, and benchmark against real workflows.
  • Drive agent task success rate by defining evaluation frameworks, establishing baselines, and iterating to improve completion metrics.
  • Set and enforce per-task token budgets and track cost per completed workflow to ensure commercial viability.
  • Build rigorous, reproducible evaluation infrastructure grounded in validated user stories.
  • Lead user story mapping and validation through interviews and close collaboration with domain experts.
  • Translate validated user stories into testable evals and close the loop between research and benchmarking.
  • Own agent architecture decisions including tool-calling, state management, error recovery, model routing, and context management.
  • Act as a player-coach: write production code, review designs, unblock the team, and raise engineering standards.
  • Collaborate cross-functionally with integrations, product, and customers during POCs to align agent behavior with real-world usage.

What We're Looking For

  • 7+ years in software engineering, including at least 2 years building LLM-based agents that act in the real world.
  • Deep experience designing LLM application architectures, including model selection, context/window management, retrieval, and orchestration patterns.
  • Proven ability to build evaluation and benchmarking frameworks measuring task completion, cost efficiency, and failure modes.
  • Technical leadership experience setting direction for small teams (3-6 engineers) and performing meaningful code review.
  • Strong Python skills and familiarity with LLM tooling (function calling, tool APIs, observability/tracing, evaluation frameworks).
  • Experience with desktop automation or programmatic control of applications (COM or similar).
  • Nice to have: Domain experience in mechanical engineering, CAD/CAE, PLM, or adjacent industries.
  • Nice to have: Understanding of enterprise deployment constraints on locked-down corporate workstations.
  • Nice to have: Track record contributing to public benchmarks, publications, or open-source agentic AI projects.
