User Researcher, AI Evaluations

Notion • $196K — $230K *

New York, NY 10025 • In-Person

Consumer Technology

5 - 7 years of experience

Today

Job Overview by Ladders

Qualifications

5+ years of experience in UX research
Strong background in translating qualitative insights into measurable guidelines
Fluency in AI products and understanding of user experience in AI contexts
Proficient in both quantitative and qualitative research methods
Excellent communication skills with a focus on creating actionable insights

Responsibilities

Define clear frameworks and rubrics for evaluating AI-powered experiences
Conduct recurring studies to measure quality over time
Ensure evaluations align with user workflows and intentions
Identify system breakdowns and propose actionable guardrails
Collaborate with cross-functional teams to build scalable evaluation processes

Benefits

Flexible office work schedule with designated Anchor Days
Opportunities for professional growth and collaboration in a fast-paced environment
Access to competitive equity and compensation packages
Culture of innovation encouraging the use of AI as a collaborative tool
Supportive team atmosphere focused on user-centered design

Full Job Description

About the Role:

We're seeking an experienced UX Researcher to define and scale how we evaluate Notion's AI-powered experiences, focusing on what "good" looks like not only for model output quality but also for the end-to-end product experience, where people discover, set goals, delegate work, review results, and build trust over time with AI.

This role sits at the intersection of research craft and evaluation operations: you'll run studies that uncover user mental models, expectations, and failure/recovery behaviors, then translate those insights into reusable rubrics, workflows, and measurement approaches that product, design, engineering, and data science can apply consistently.

This role can be based in either San Francisco or New York City. We work from our offices on Mondays, Tuesdays and Thursdays (our Anchor Days) because we do our best thinking and building together in person. We're looking for someone who's excited to work alongside the team during those days.

What You'll Achieve:

Define what "good" looks like (frameworks & rubrics): Establish clear, reusable evaluation criteria that reflect real user expectations: helpfulness, trust, tone, control, and transparency. You'll translate qualitative insight into scoring guidance that can be applied consistently across teams and over time.
Run recurring evals (longitudinal & feature-specific): Run recurring longitudinal and feature-specific surveys and studies to measure experience quality over time against defined rubrics. Lead qualitative studies, side-by-side comparisons, and human-in-the-loop evaluation efforts to deepen understanding of where experiences break down and how they can improve. You'll help teams spot regressions, benchmark improvements, and understand when expectations shift.
Anchor evaluation in real workflows (context > isolated feedback): Ensure evals reflect jobs-to-be-done, user intent, and the full interaction journey (goal setting, delegation, review, iteration), not just decontextualized thumbs up/down. You'll help teams understand who is evaluating, what they're trying to do, and why outputs succeed or fail.
Identify failure modes & recovery behavior (guardrails): Uncover breakdowns, regressions, and edge cases across the system (from model behavior to UI and integrations) and study how people notice issues, correct them, and continue their work. You'll turn these insights into actionable guidance for guardrails, fixes, and prioritization.
Operationalize evaluation with partners (process & tooling): Collaborate closely with Product, Design, Engineering, and Data Science to align on target use cases and build scalable evaluation loops (human-in-the-loop review, longitudinal studies, and calibration of automated/LLM-judge approaches against human judgment).

Skills You'll Need to Bring:

Ability to operationalize insight into measurement: You're comfortable turning "soft" user expectations (trust, tone, usefulness, clarity) into concrete rubrics, scoring guidelines, and observable metrics.
AI fluency and systems thinking: You're curious and hands-on with AI products, and can reason about how model behavior, uncertainty, and system constraints shape user experience. You also have experience evaluating AI-enabled products (LLMs, agents, generative UI/workflow automation) and working with Data Science/ML partners on measurement strategy and evaluation tooling.
Clear communication and impact orientation: You can align diverse partners around shared definitions of quality and create artifacts that enable teams to act consistently. You tailor storytelling to different audiences, connect research to business outcomes, and drive follow-through so insights translate into product change.
Strong UX research craft (quant + qual): You can choose the right methods for the question (interviews, benchmarking, surveys, experiments) and synthesize the results into actionable guidance.
Pragmatism in fast-moving environments: You can prioritize ruthlessly, work through ambiguity, and balance scrappy iteration with deep dives when needed.
Experience: 5+ years doing UX research in industry

Nice to Haves:

Familiarity with LLM-as-judge methods, prompt design for evaluators, or "golden dataset" creation
Experience using AI research tooling for rapid synthesis and communication (e.g., Dovetail, Listen Labs, Maze, Outset), as well as AI observability tooling like Braintrust
Experience using data querying languages (e.g., SQL), scripting languages (e.g., Python), or statistical/mathematical software (e.g., R, SAS, MATLAB)
Master's or PhD in HCI, Psychology, Behavioral Science, Anthropology, Sociology, or a related field
You're familiar with the work of computing heroes like Douglas Engelbart, Alan Kay, and Bret Victor, and understand why we're big fans.

Notion is committed to providing highly competitive cash compensation, equity, and benefits. The compensation offered for this role will be based on multiple factors such as location, the role's scope and complexity, and the candidate's experience and expertise, and may vary from the range provided below. For roles based in San Francisco or New York City, the estimated base salary range for this role is $196,000-$230,000 per year.

#LI-Onsite

A Note on AI

You don't need deep AI expertise for every role, but we do expect every Notino to be intellectually curious, drawn to tinkering and discovery, and excited to use AI as a real collaborator in their work. For some roles, AI fluency is a core requirement; when that's the case, we'll say so explicitly in the qualifications. People who thrive here don't treat AI as a novelty. They use it to think better and to make their work easier for others to build on.

About Notion

Notion is a software company that provides a productivity and collaboration platform for teams. The company's platform offers a range of features, including note-taking, project management, and task tracking. Notion's software is designed to help teams streamline their workflows and improve their productivity. The company was founded in 2016 and is headquartered in San Francisco, California.

Size

300 employees

Industry

Enterprise Technology

Net Income

-$80 million

Founded

2016

Revenue

$80 million

NASDAQ

NOT

* Ladders Estimates
