The Role
Ro is building a team focused on shipping LLM-powered products across the patient experience, clinical operations, and internal tooling.
We're hiring an Applied AI Scientist to help measure, evaluate, and improve our AI systems. You'll answer one of the most important questions in applied AI: "Is this actually working?" You'll design evaluations, analyze production behavior, run experiments, and partner closely with engineers to improve the quality of AI-powered features.
You'll work on real production systems while learning modern evaluation techniques from experienced teammates.
What You'll Do
- Build evaluation datasets, rubrics, and synthetic test cases for LLM-powered features across patient and internal workflows.
- Analyze production logs to identify model failures, hallucinations, quality issues, and operational bottlenecks.
- Design and run experiments end-to-end, including hypothesis development, dataset creation, evaluation, analysis, and recommendations.
- Track key product and operational metrics, including resolution rate, handle time, touches to resolution, latency, and quality, and identify opportunities for improvement.
- Partner with engineers to validate improvements and productionize successful experiments.
- Help build or integrate tooling and dashboards that make AI performance easy to understand and monitor.
Who You Are
- 1-4 years in data science, analytics, applied ML, or a closely adjacent role.
- Strong Python and SQL skills.
- Hands-on experience building or evaluating LLM-powered applications through work, research, school, hackathons, or side projects.
- You're curious about how AI systems work beyond the model itself, including retrieval, prompting, evaluation, and production behavior.
- You're comfortable communicating analytical findings clearly to engineers, product managers, and operational stakeholders.
- You're excited to learn quickly in a field where best practices continue to evolve.
- Bonus: Experience with evaluation tooling, A/B testing frameworks, production model monitoring, healthcare, or other operations-heavy environments.
A note on reporting structure
This is a new function at Ro, and we're being deliberate about not over-defining it. Your manager and where you sit on the org chart will depend on the specific shape of the team we end up with. We'd rather find the right people and figure out the lines around them than pre-draw boxes and miss great candidates. If that ambiguity is a deal-breaker, this isn't the right role; if it sounds like an opportunity, we want to talk.
The target base salary for this position ranges from $149,600 to $184,000, in addition to a competitive equity and benefits package (as applicable). When determining compensation, we analyze and carefully consider several factors, including location, job-related knowledge, skills, and experience. These considerations may cause your compensation to vary.