AI Evaluation Program Manager

Twelve Labs, Inc.

• $120K — $160K *

San Francisco, CA 94112 (In-Person)

Information Technology

5 - 7 years of experience

3 weeks ago

Job Overview by Ladders

Qualifications

5+ years of experience in AI-focused data operations
Proven track record in large-scale data or evaluation projects
Ability to analyze complex data and distill findings
Proficiency with Python, LLMs, and automation tools
Excellent communication and project management skills
Foundation in LLMs/VLMs and multimodal AI
Belief in data as crucial for AI model performance

Responsibilities

Design and build robust model evaluation frameworks
Manage resource allocation and timelines
Enhance quality via collaboration with external partners
Establish labeling guidelines and monitor data quality
Partner with internal teams to align on data needs

Benefits

Open and inclusive culture
Collaborative, mission-driven team environment
Full health, dental, and vision benefits
Flexible PTO and parental leave policy
Office closure during Christmas and New Year’s week

Full Job Description

About the Role:

You will be a vital member of our ML Data Team, which leads the full spectrum of video-language data preparation and model evaluation. This role comes with high ownership: you will define dataset needs and requirements in consultation with our research and product teams, design and build data pipelines, and drive our post-training model evaluation strategy. You will also be responsible for automating as much of the repetitive partnership, annotation, and quality evaluation work as possible. A desire to work cross-functionally and to build relationships is critical for success in this position.

You will:

Model Evaluation: Design and build robust model evaluation frameworks, automating repetitive processes and maintaining a balanced approach to efficiency and depth in obtaining evaluation metrics and feedback.
Portfolio Monitoring: Manage resource allocation and timelines, adjusting direction flexibly based on real-time information across all data streams in your product vertical.
External Partner Collaboration: Enhance dataset and process quality through seamless collaboration with vendors and outsourcing partners.
Data Quality & Tooling Advancement: Establish labeling guidelines, monitor data quality, and improve tools and infrastructure to build a sustainable data operations framework.
Internal Collaboration: Partner with Engineering and AI Model teams to align on top-priority data needs, design tools such as analytical reports and dashboards, and clearly communicate project progress.

You may be a good fit if you have:

5+ years of experience working in an AI-focused data operations organization.
A proven track record of designing and executing large-scale data or evaluation projects, including gathering, labeling, and post-processing data.
The ability to analyze messy and complex data, identify overarching patterns, and distill your findings into crisp annotation guidelines or model quality reports.
Proficiency with Python, LLMs, or other popular industry tools for automation.
Excellent communication and project management skills, and the ability to support several projects simultaneously.
A foundational understanding of and interest in LLMs/VLMs and multimodal AI.
Conviction that data is the key ingredient for the performance and assessment of AI models.

You'll stand out if you have:

Experience in data collection and labeling for multimodal language models.
Experience in red teaming, localization testing, or other evaluation-focused fields.
Experience working with research scientists and engineers.
Expertise or interest in video-centric domains, such as sports, advertising, and content creation.

Tech Stack:

Development & Analysis: Python (primarily pandas, Jupyter, etc.)
Data Management & Visualization: Amazon S3, various data visualization tools (framework-agnostic)
Project Management Tools: Linear, Notion

Even if your prior experience doesn't tick every checkbox, we still encourage you to apply! If you are a 0-to-1 achiever, a ferocious learner, and a kind and fun team player who motivates others, you will find a home at TwelveLabs.

Benefits and Perks:

An open and inclusive culture and work environment.

Work closely with a collaborative, mission-driven team on cutting-edge AI technology.

🦷 Full health, dental, and vision benefits.

Flexible PTO and parental leave policy. Office closed the week of Christmas and New Year's.

* Ladders Estimates
