About the role
Agent Quality Assurance
- Run regression tests against canonical scenarios for every production agent to catch drift, broken outputs, and edge cases before brokers see them.
- Design and calibrate evaluation rubrics for each agent's deliverables and analyze score patterns to flag systemic issues.
- Reproduce failed runs surfaced by brokers, root-cause them, and ship fixes via memory updates, skill edits, or prompt revisions.
- Audit branded deliverables for compliance across fonts, logos, layouts, and data structure.
Skills, Memories, and Prompts
- Curate the memory library inside Synapse: dedupe, retire stale facts, tune importance and "when to use" routing so agents pull the right context.
- Author new skills (documented playbooks + Python scripts) that turn repeatable workflows into reusable agent capabilities, with credential handling and verified outputs.
- Iterate on agent system prompts and sub-agent personas, keeping them tight, testable, and aligned with how brokers actually phrase requests.
Model & Cost Optimization
- Monitor model spend across the agent fleet and tune model selection (Opus / Sonnet / Haiku and sub-agent defaults) to balance quality, latency, and cost.
- Identify hot spots (agents over-spending on simple problems or under-spending on hard ones) and rebalance.
- Track token usage, latency, and error rates over time; publish a weekly fleet health report for the team.
Working with Brokers
- Embed yourself with broker teams. When their requests surface bugs or feature gaps, you turn that intel directly into roadmap items.
- Partner with our agent builders on every new agent. Review specs, design test plans, and sign off on launches.
- Document agent capabilities, known limitations, and broker-facing workflows so AEs and brokers can use the fleet confidently.
Skills & Qualifications
Must-Haves:
- 2-4 years working hands-on with LLM-powered applications, AI agents, or production prompt-engineered systems (formal CS degree not required if your project work demonstrates the skill set).
- Strong instinct for prompt engineering. You can read a system prompt and explain why an agent is misbehaving.
- Comfort writing Python at a scripting level (API calls, JSON manipulation, light data transformation) - you'll be authoring skill scripts.
- Experience designing test cases and evaluation criteria for non-deterministic systems, or a clear conceptual framework for doing so.
- Sharp diagnostic instincts: when an agent misfires, you can trace whether the issue lives in the memory, the skill, the prompt, the model, or the input.
- Strong written communication - you'll translate "this agent broke" into a clean diagnosis and a fix log the rest of the team can act on.
Nice-to-Haves:
- Hands-on experience with Anthropic's Claude or comparable frontier models (Opus / Sonnet / Haiku tier selection).
- Familiarity with evaluation frameworks and rubric-based grading.
- Experience with insurance, financial services, or other regulated-document workflows where output accuracy is non-negotiable.
- Exposure to HubSpot CRM, Slack, and Google Workspace as everyday operational tools.
- Background in high-growth startups or AI-forward operations teams.
What Success Looks Like in the First 90 Days:
- 30 days - You've shadowed every production agent on a real broker run, mapped each one's known failure modes, and stood up a regression test suite for the top 5.
- 60 days - You've shipped your first round of memory cleanups, retired stale skills, and authored at least one new skill end-to-end.
- 90 days - You're running a weekly fleet health review, model spend is trending down without quality regression, and broker-reported bugs have a clear triage and resolution path.
Perks We Offer at Summit
- Opportunity for Growth: Work with a forward-thinking commercial brokerage and be part of an innovative, growing team.
- Modern Workspace: Work from our brand-new, state-of-the-art offices in downtown Kelowna and Winnipeg. Enjoy a flexible hybrid model designed to support collaboration, focus, and balance.
- Technology-Driven Culture: Work with cutting-edge tools and custom-built technology to get time back in your day. Laptops & equipment provided to all staff.
- Comprehensive Benefits: Access to flexible health, mental health, and dental plans tailored to your lifestyle, plus competitive vacation and personal day allotments.
- Supportive Team: Participate in daily team huddles and collaborative events as part of a values-driven culture.
If you're ready to take the next step in your career and help build a new-age commercial insurance brokerage, we want to hear from you.
OTE: $80,000-$120,000
Join us in building the commercial brokerage of the future!