Staff Engineer - Agentic AI

Clera

$160K — $250K *
Information Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 7+ years of software engineering experience, including at least 2 years building LLM-based agents.
  • Expertise in designing LLM application architectures and tool-calling frameworks.
  • Strong abilities in evaluation and benchmarking of agentic systems with measurable outcomes.
  • Strong Python skills and hands-on experience with LLM tooling and observability tools.
  • Experience leading small technical teams and driving architectural decisions.

Responsibilities

  • Lead the development of an agent intelligence layer for complex engineering workflows.
  • Serve as the technical lead for a team of AI engineers and domain experts.
  • Define agent capabilities based on user stories and implement them effectively.
  • Drive improvements in agent task success rates and establish evaluation frameworks.
  • Manage operational aspects like token budgets and workflow costs.
  • Build reproducible evaluation infrastructure grounded in validated user stories.
  • Collaborate cross-functionally to align agent behavior with real-world use cases.

Benefits

  • Equity participation in an early-stage, Series A startup.
  • Opportunity to work directly with executive leadership.
  • Involvement in high-impact projects serving Fortune 100 clients.
Full Job Description
About the Role

A well-funded, early-stage AI startup in the mechanical engineering software space is looking for a Staff Engineer - Agentic AI to own the core agent intelligence layer that turns engineers' intent into reliable, cost-efficient multi-step workflows across complex desktop engineering tools. This is a high-impact, senior technical leadership role reporting directly to the CTO, sitting at the intersection of applied agentic AI, user research, and product delivery.

The company serves Fortune 100 hardware engineering customers and is backed by notable investors. You'll join a small, senior team and have a direct line to executive leadership. The role is on-site in San Francisco, CA.
What You'll Do
  • Lead development of the core agent intelligence layer executing multi-step workflows across complex desktop engineering software (CAD, CAE, PLM).
  • Report to the CTO and serve as technical lead for a small team of AI engineers, a user researcher, and domain expert contractors.
  • Own the full product loop: define agent capabilities from user stories, build implementations, and benchmark against real workflows.
  • Drive agent task success rate - define the eval framework, establish baselines, and systematically improve completion metrics.
  • Set and enforce per-task token budgets; track cost per completed workflow to ensure commercial viability.
  • Build rigorous, reproducible evaluation infrastructure grounded in validated user stories (SWE-bench-level rigor applied to engineering workflows).
  • Lead user story mapping and validation through direct interviews and collaboration with domain experts.
  • Translate validated user stories into testable evals, closing the loop between research and benchmarking.
  • Own agent architecture decisions: tool-calling strategies, state management, error recovery, model routing, and context management.
  • Act as a player-coach: write production code, review designs, unblock the team, and raise engineering standards.
  • Collaborate cross-functionally with integrations, product, and customers during POCs to align agent behavior with real-world usage.
What We're Looking For

Required (Dealbreakers):
  • 7+ years in software engineering, including at least 2 years building LLM-based agents that act in the real world (tool-calling, multi-step workflows, failure handling, cost constraints).
  • Deep experience designing LLM application architectures: model selection, context/window management, retrieval strategies, tool-calling frameworks, and orchestration patterns.
  • Strong evaluation and benchmarking instincts for agentic systems - task completion, cost efficiency, failure mode analysis; familiarity with SWE-bench, GAIA, or τ-bench.
  • Proven track record shipping AI systems with measurable outcomes (e.g., agent task success rate, cost efficiency) - not just demos.
  • Strong Python skills and hands-on experience with LLM tooling (function calling, tool use APIs, tracing/observability tools such as Logfire or LangSmith, evaluation frameworks).
  • Experience leading a small technical team (3-6 engineers): setting direction, performing code reviews, driving architecture decisions.

Strongly Preferred:
  • Experience with desktop automation, COM, or programmatic control of applications (beyond web APIs).
  • Background in mechanical engineering, CAD/CAE, PLM, or adjacent industries.
  • Familiarity with enterprise deployment constraints - agent behavior on locked-down corporate workstations.
  • Published work or open-source contributions in agentic AI systems.
  • Experience building or contributing to public benchmarks for AI agents.
Compensation & Benefits
  • Salary: $160,000 - $250,000 USD annually, depending on experience.
  • Equity participation in an early-stage, Series A company.
  • Note: Visa sponsorship is not available for this role.
Location
  • On-site in San Francisco, CA, United States.
  • This is not a remote role.
