AI Operations & Quality Engineer

Summit Commercial Solutions, Inc.

$80K - $120K
Finance & Insurance
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 2-4 years experience with LLM-powered applications or AI systems
  • Strong prompt engineering instincts
  • Proficient in Python for scripting and API calls
  • Experience in designing test cases for non-deterministic systems
  • Excellent diagnostic skills to identify agent issues
  • Strong written communication for clear reporting and documentation

Responsibilities

  • Run regression tests to detect issues in production agents
  • Design evaluation rubrics and analyze score patterns
  • Reproduce failures and implement necessary fixes
  • Audit deliverables for brand compliance
  • Curate and maintain the memory library in Synapse
  • Author reusable agent capabilities through scripting
  • Monitor model spend and optimize cost across agents
  • Collaborate with broker teams to translate intel into roadmaps

Benefits

  • Opportunity for growth within an innovative team
  • Flexible hybrid working model in state-of-the-art offices
  • Access to cutting-edge technology and tools
  • Comprehensive health and mental health benefits
  • Supportive team culture with collaborative events

Full Job Description

About the role

Agent Quality Assurance

  • Run regression tests against canonical scenarios for every production agent - catch drift, broken outputs, and edge cases before brokers see them.
  • Design and calibrate evaluation rubrics for each agent's deliverables and analyze score patterns to flag systemic issues.
  • Reproduce failed runs surfaced by brokers, root-cause them, and ship fixes via memory updates, skill edits, or prompt revisions.
  • Audit branded deliverables for brand compliance across fonts, logos, layouts, and data structure.

Skills, Memories, and Prompts

  • Curate the memory library inside Synapse: dedupe, retire stale facts, tune importance and "when to use" routing so agents pull the right context.
  • Author new skills (documented playbooks + Python scripts) that turn repeatable workflows into reusable agent capabilities, with credential handling and verified outputs.
  • Iterate on agent system prompts and sub-agent personas, keeping them tight, testable, and aligned with how brokers actually phrase requests.

Model & Cost Optimization

  • Monitor model spend across the agent fleet and tune model selection (Opus / Sonnet / Haiku and sub-agent defaults) to balance quality, latency, and cost.
  • Identify hot spots (agents over-spending on simple problems or under-spending on hard ones) and rebalance.
  • Track token usage, latency, and error rates over time; publish a weekly fleet health report for the team.

Working with Brokers

  • Embed yourself with broker teams. When their requests surface bugs or feature gaps, you turn that intel directly into roadmap items.
  • Partner with our agent builders on every new agent. Review specs, design test plans, and sign off on launches.
  • Document agent capabilities, known limitations, and broker-facing workflows so AEs and brokers can use the fleet confidently.

Skills & Qualifications

Must-Haves:

  • 2-4 years working hands-on with LLM-powered applications, AI agents, or production prompt-engineered systems (formal CS degree not required if your project work demonstrates the skill set).
  • Strong instinct for prompt engineering. You can read a system prompt and explain why an agent is misbehaving.
  • Comfort writing Python at a scripting level (API calls, JSON manipulation, light data transformation) - you'll be authoring skill scripts.
  • Experience designing test cases and evaluation criteria for non-deterministic systems, or a clear conceptual framework for doing so.
  • Sharp diagnostic instincts: when an agent misfires, you can trace whether the issue lives in the memory, the skill, the prompt, the model, or the input.
  • Strong written communication - you'll translate "this agent broke" into a clean diagnosis and a fix log the rest of the team can act on.

Nice-to-Haves:

  • Hands-on experience with Anthropic's Claude or comparable frontier models (Opus / Sonnet / Haiku tier selection).
  • Familiarity with evaluation frameworks and rubric-based grading.
  • Insurance, financial services, or other regulated-document workflows where output accuracy is non-negotiable.
  • Exposure to HubSpot CRM, Slack, and Google Workspace as everyday operational tools.
  • Background in high-growth startups or AI-forward operations teams.

What Success Looks Like in the First 90 Days:

  • 30 days - You've shadowed every production agent on a real broker run, mapped each one's known failure modes, and stood up a regression test suite for the top 5.
  • 60 days - You've shipped your first round of memory cleanups, retired stale skills, and authored at least one new skill end-to-end.
  • 90 days - You're running a weekly fleet health review, model spend is trending down without quality regression, and broker-reported bugs have a clear triage and resolution path.

Perks We Offer at Summit

  • Opportunity for Growth: Work with a forward-thinking commercial brokerage and be part of an innovative, growing team.
  • Modern Workspace: Work from our brand-new, state-of-the-art offices in downtown Kelowna and Winnipeg. Enjoy a flexible hybrid model designed to support collaboration, focus, and balance.
  • Technology-Driven Culture: Work with cutting-edge tools and custom-built technology to get time back in your day. Laptops & equipment provided to all staff.
  • Comprehensive Benefits: Access to flexible health, mental health, and dental plans tailored to your lifestyle, plus competitive vacation and personal day allotments.
  • Supportive Team: Participate in daily team huddles and collaborative events as part of a values-driven culture.

If you're ready to take the next step in your career and help build a new-age commercial insurance brokerage, we want to hear from you.

OTE: $80,000-$120,000

Join us in building the commercial brokerage of the future!
