Staff Engineer - Agentic AI

Clera

$160K - $250K
Enterprise Technology
5 - 7 years of experience
Job Overview by Ladders

Qualifications

  • 7+ years in software engineering with 2+ years building agentic LLM-based systems
  • Deep experience in LLM application architecture
  • Strong instincts for evaluating and benchmarking agentic systems
  • Proven track record of delivering AI systems with measurable outcomes
  • Strong Python skills and familiarity with the LLM tooling ecosystem
  • Experience leading small technical teams

Responsibilities

  • Drive agent task success rate and improve workflow completion metrics
  • Set and enforce per-task token budgets
  • Build rigorous evaluation infrastructure tied to user stories
  • Lead user story mapping and validation with researchers and experts
  • Expand workflow coverage for user story steps
  • Translate validated user stories into evaluation tests
  • Own the architecture of the agent system and its strategic decisions
  • Lead the team as a player-coach by providing guidance and writing production code
  • Collaborate with cross-functional teams during POCs

Benefits

  • Equity participation in a well-funded, early-stage company
  • Opportunity to work with enterprise customers in a high-impact role
  • Close collaboration with technical and user research teams
  • Direct reporting line to the CTO for influence over technology direction
  • Work in a vibrant San Francisco startup environment

Full Job Description
About the Role

This is a senior technical leadership role at a well-funded Series A startup building AI agents for hardware engineers. You'll own the core agent intelligence layer - the system that translates mechanical engineers' intent into reliable, cost-efficient multi-step workflows across complex desktop engineering tools (CAD, simulation, PLM, and more).

You'll report directly to the CTO and serve as technical lead for a small team of AI engineers, a user researcher, and domain expert contractors. This role sits at the intersection of applied agentic AI, user research, and product delivery - and it directly determines the product's real-world value to enterprise customers.

The company serves Fortune 100 customers and has significant backing from top-tier investors. This is a high-impact, on-site role based in San Francisco.

What You'll Do
  • Drive agent task success rate. Own the metric that matters most - define the eval framework, establish baselines, and systematically improve whether the agent can complete the workflows engineers actually need.
  • Set and enforce token budgets per problem. Define per-task token budgets, track cost per completed workflow, and ensure commercial viability - not just technical impressiveness.
  • Build rigorous evaluation infrastructure. Design benchmarks grounded in real user stories with SWE-bench-level rigor - reproducible, adversarial, and tied to measurable customer value.
  • Lead user story mapping and validation. Work directly with the user researcher and domain experts to interview engineers, document workflows in detail, and validate that the workflows you're building against reflect reality.
  • Expand workflow coverage. Systematically grow the percentage of top user story steps the agent handles end-to-end, prioritizing by customer value and technical feasibility.
  • Translate user stories into evals. Close the loop between user research and agent benchmarking - every validated user story becomes a test case.
  • Own the agent architecture. Make foundational decisions on tool-calling strategies, state management, error recovery, model routing, and context management.
  • Lead as a player-coach. Set technical direction, review architecture decisions, write production code, unblock the team, and raise the engineering bar.
  • Collaborate cross-functionally with integrations, product, and customers during POCs to align agent behavior with real-world enterprise usage.

What We're Looking For

Dealbreakers (must-haves):
  • 7+ years in software engineering, with at least 2 years building agentic LLM-based systems - agents that call tools, manage multi-step workflows, handle failures, and operate under cost constraints.
  • Deep experience with LLM application architecture: model selection, context window management, retrieval strategies, tool-calling frameworks, and orchestration patterns.
  • Strong evaluation and benchmarking instincts for agentic systems - task completion, cost efficiency, failure mode analysis; familiarity with benchmarks such as SWE-bench, GAIA, or τ-bench.

Required:
  • Proven track record of shipping AI systems with measurable outcomes (agent task success rate, cost efficiency) - not just demos.
  • Strong Python skills and working knowledge of the LLM tooling ecosystem: function calling, tool use APIs, tracing/observability tools (e.g., Logfire, LangSmith), and evaluation frameworks.
  • Experience leading a small technical team (3-6 engineers): setting technical direction, performing code reviews, driving architecture decisions.

Nice to Have:
  • Published work or open-source contributions in agentic AI systems.
  • Familiarity with enterprise deployment constraints - agent behavior on locked-down corporate workstations.
  • Experience with desktop automation, COM, or programmatic control of applications (beyond web APIs).
  • Background in mechanical engineering, CAD/CAE, PLM, or adjacent industries.
  • Experience building or contributing to public AI agent benchmarks.

Compensation & Benefits
  • Salary: $160,000 - $250,000 USD annually, depending on experience.
  • Equity participation in a well-funded, early-stage company with significant enterprise traction.
  • Visa sponsorship: Not available.

Location

This is an on-site role based in San Francisco, CA. Remote work is not available for this position.
