Staff Engineer - Agentic AI

Clera

$160K - $250K
Enterprise Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 7+ years in software engineering
  • 2+ years building agentic LLM-based systems
  • Deep experience with LLM application architecture
  • Strong evaluation and benchmarking instincts for agentic systems
  • Proficiency in Python and LLM tooling ecosystem
  • Experience leading a small technical team (3-6 engineers)

Responsibilities

  • Lead development of the core agent intelligence layer
  • Own the full product loop from user stories to implementation
  • Drive agent task success rate and improve completion metrics
  • Set and enforce per-task token budgets and track costs
  • Build rigorous, reproducible evaluation infrastructure
  • Lead user story mapping and validation with domain experts
  • Translate validated user stories into testable evaluations
  • Own agent architecture decisions and set technical direction
  • Collaborate cross-functionally during POCs to align agent behavior

Benefits

  • Early-stage equity
  • Direct line to executive leadership
  • Outsized scope of impact

Full Job Description

About the Role

A well-funded, early-stage B2B SaaS company building AI agent infrastructure for mechanical engineering workflows is hiring a Staff Engineer - Agentic AI to own the core agent intelligence layer. This is a high-impact, senior technical leadership role reporting directly to the CTO. You'll sit at the intersection of applied agentic AI, user research, and product delivery - determining real-world value for Fortune 100 enterprise customers in the CAD, CAE, and PLM space.

You'll lead a small team of AI engineers, a user researcher, and domain expert contractors, acting as a player-coach who writes production code and sets technical direction.

What You'll Do

  • Lead development of the core agent intelligence layer that executes multi-step workflows across complex desktop engineering software.
  • Own the full product loop: define agent capabilities from user stories, build implementations, and benchmark against real workflows.
  • Drive agent task success rate - define the evaluation framework, establish baselines, and systematically improve completion metrics.
  • Set and enforce per-task token budgets; track cost per completed workflow to ensure commercial viability.
  • Build rigorous, reproducible evaluation infrastructure grounded in validated user stories (SWE-bench-level rigor applied to engineering workflows).
  • Lead user story mapping and validation through interviews and close collaboration with domain experts.
  • Translate validated user stories into testable evals and close the loop between research and benchmarking.
  • Own agent architecture decisions: tool-calling strategies, state management, error recovery, model routing, and context management.
  • Set technical direction, review architecture decisions, unblock the team, and raise the engineering bar across a team of 3-6 engineers.
  • Collaborate cross-functionally with integrations, product, and customers during POCs to align agent behavior with real-world usage.
What We're Looking For

Must-haves:
  • 7+ years in software engineering, including at least 2 years building agentic LLM-based systems that act in the real world (multi-step workflows, tool-calling, failure handling, cost constraints).
  • Deep experience with LLM application architecture: model selection, context/window management, retrieval strategies, tool-calling frameworks, and orchestration patterns.
  • Strong evaluation and benchmarking instincts for agentic systems - task completion, cost efficiency, and failure mode analysis; familiarity with SWE-bench, GAIA, or τ-bench.
  • Proven track record of shipping AI systems with measurable outcomes, not just demos.
  • Proficiency in Python and the LLM tooling ecosystem (function calling, tool use APIs, tracing/observability tools such as Logfire or LangSmith, evaluation frameworks).
  • Experience leading a small technical team (3-6 engineers): setting direction, performing code reviews, and driving architecture decisions.

Nice-to-haves:
  • Experience with desktop automation, COM, or programmatic control of applications (beyond web APIs).
  • Background in mechanical engineering, CAD/CAE, PLM, or adjacent industries.
  • Familiarity with enterprise deployment constraints on locked-down corporate workstations.
  • Published work or open-source contributions in agentic AI systems.
  • Experience building or contributing to public benchmarks for AI agents.

Note: Visa sponsorship is not available for this role.

Compensation & Benefits

  • Salary: $160,000 - $250,000 USD annually
  • Early-stage equity
  • Direct line to executive leadership and outsized scope of impact
Location

This is an on-site role based in San Francisco, CA. Candidates must be willing to work from the office.
