About the role
We're hiring a Senior Backend Engineer to own AI features end to end, from rapid prototype to production and the evaluation that keeps them honest. You'll build the APIs, tool-using agents, and RAG pipelines that turn frontier LLMs into grant discovery, application drafting, and research tools our 5,500+ nonprofits rely on every day. It's a high-ownership seat on a small team, where what you ship reaches customers fast and you help shape how we build AI here.
What you'll do
Ship AI to production
- Build tool-using LLM agents (task planning, function and tool calling, multi-step workflows, guardrails) for grant discovery, application drafting, and research assistance.
- Turn prototypes into resilient, observable services with clear SLAs, rollback and fallback strategies, and cost and latency budgets.
- Stand up evaluation and observability so our AI stays grounded, safe, and cost-effective.
Build trustworthy backends
- Write high-quality, thoroughly tested code across the backend and the data pipelines that power retrieval and evaluation.
- Contribute to reliability practices: alerts, dashboards, and incident response.
Collaborate and raise the bar
- Partner with Product, Design, and GTM on scoping, UX, and measurement.
- Run experiments (A/B, canaries), interpret results, and iterate.
- Raise engineering standards through clear, maintainable code, tests, docs, and thoughtful review.
What we're looking for
Required
- 7+ years building and shipping production backend systems in Python (FastAPI, Celery, or equivalent), taking features from prototype to production with real reliability practices like tests, observability, and rollback.
- Hands-on experience building LLM features in production: tool and function calling, multi-step agent workflows, and the guardrails and evals that keep them grounded, safe, and cost-effective. This is the core of the role.
- Strong data fundamentals: SQL, schema design, and building pipelines that power retrieval and evaluation.
- Thrives in a fast, scrappy startup environment with high ownership and a bias for action, speed, quality, and simplicity.
Nice to have
- TypeScript and Node, plus familiarity with Ruby on Rails (our core platform) or a willingness to learn it.
- Experience with AWS or GCP, Docker, CI/CD, and observability (logs, metrics, traces).
- RAG depth: document ingestion, chunking and windowing, embeddings, hybrid search (keyword plus vector), re-ranking, and grounded citations.
- Experience with re-rankers and cross-encoders, hybrid retrieval tuning, or search and recommendation systems.
- Evaluation mindset: designing eval suites (RAG/QA, extraction, summarization) that combine automated and human-in-the-loop methods, and familiarity with frameworks like Ragas, DeepEval, or OpenAI Evals.
- Orchestration frameworks: LangChain or LangGraph, LlamaIndex, Semantic Kernel, or custom orchestration.
Compensation & Benefits
For US-based candidates, the target salary range for this role is $175,000 - $220,000 USD, plus equity. Final compensation is determined based on experience, skillset, scope of responsibility, interview performance, and geographic location. We're committed to paying competitively and equitably.
For candidates based in Canada, compensation varies by province and will be shared by your recruiter early in the process.
Benefits
- 100% covered health, dental, and vision insurance for employees (50% for dependents)
- Generous PTO, including parental leave
- 401(k)
- Company laptop and home-office stipend
- Bi-annual company retreats
- Instrumentl is evolving rapidly. You'll always have new challenges and opportunities to grow here.