About the Role:
You will be a vital member of our ML Data Team, which leads the full spectrum of video-language data preparation and model evaluation. This role comes with high ownership and includes responsibilities such as defining dataset needs and requirements in consultation with our research and product teams, designing and building data pipelines, and driving our post-training model evaluation strategy. You will also be responsible for automating as much of the repetitive partnership, annotation, and quality evaluation work as possible. A desire to work cross-functionally and to build relationships is critical for success in this position.
You will:
- Model Evaluation: Design and build robust model evaluation frameworks, automating repetitive processes and maintaining a balanced approach to efficiency and depth in obtaining evaluation metrics and feedback.
- Portfolio Monitoring: Manage resource allocation and timelines, adjusting direction flexibly based on real-time information across all data streams in your product vertical.
- External Partner Collaboration: Enhance dataset and process quality through seamless collaboration with vendors and outsourcing partners.
- Data Quality & Tooling Advancement: Establish labeling guidelines, monitor data quality, and improve tools and infrastructure to build a sustainable data operations framework.
- Internal Collaboration: Partner with Engineering and AI Model teams to align on top-priority data needs, design tools such as analytical reports and dashboards, and clearly communicate project progress.
You may be a good fit if you have:
- 5+ years of experience working in an AI-focused data operations organization.
- A proven track record designing and executing large-scale data or evaluation projects, including gathering, labeling, and post-processing data.
- The ability to analyze messy and complex data, identify overarching patterns, and distill your findings into crisp annotation guidelines or model quality reports.
- Proficiency with Python, LLMs, or other popular industry tools for automation.
- Excellent communication and project management skills, and the ability to support several projects simultaneously.
- A foundational understanding of and interest in LLMs/VLMs and multimodal AI.
- Conviction that data is the key ingredient for the performance and assessment of AI models.
You'll stand out if you have:
- Experience in data collection and labeling for multimodal language models.
- Experience in red teaming, localization testing, or other evaluation-focused fields.
- Experience working with research scientists and engineers.
- Expertise or interest in video-centric domains, such as sports, advertising, and content creation.
Tech Stack:
- Development & Analysis: Python (primarily pandas, Jupyter, etc.)
- Data Management & Visualization: Amazon S3, various data visualization tools (framework-agnostic)
- Project Management Tools: Linear, Notion
Even if there are a few checkboxes that aren't ticked through your prior experience, we still encourage you to apply! If you are a 0-1 achiever, a ferocious learner, and a kind and fun team player who motivates others, you will find a home at TwelveLabs.
Benefits and Perks:
- An open and inclusive culture and work environment.
- Work closely with a collaborative, mission-driven team on cutting-edge AI technology.
- Full health, dental, and vision benefits.
- Flexible PTO and parental leave policy. The office is closed during the week of Christmas and New Year's.