Location: San Francisco, CA
Work Model: In-person
Industry: AI training data infrastructure
Compensation: $140K-$250K base, plus equity
The Opportunity
This is the company's top hiring priority and a genuinely hard research problem. Because data flows through a decentralized marketplace, ensuring quality at scale is the single biggest bottleneck to growth. As a Research Engineer, you will build the automated systems that verify and assure data quality so that suppliers consistently deliver excellent data to buyers.
You will start by digging into the data manually to understand failure modes, then design systems to automate quality checks at scale, combining rule-based approaches with AI for fuzzier cases and human-in-the-loop review where it makes sense. This is fundamentally a research role focused on building automated systems, not manual QA.
Responsibilities
- Identify data quality issues including inconsistencies, formatting problems, and ingestion challenges
- Perform initial manual data quality review to deeply understand failure modes
- Build systems to automate quality checks at scale using rule-based and AI-driven approaches
- Design hybrid systems that balance automation with human-in-the-loop review where appropriate
- Continuously improve verification methods as the data landscape and AI tooling evolve
Requirements
- Deeply technical, with a strong learning slope and the ability to ramp quickly in a fast-moving field
- Background in AI/ML engineering, or software engineering at an AI-focused company with demonstrable data ingestion and processing experience
- Ability to reason about likely data quality problems from first principles
- Comfortable owning ambiguous, open-ended problems end to end
- Comfortable working in person, full-time, in a San Francisco office
- Bonus: experience working with noisy or unstructured data, or judgment on when to use automation versus human-in-the-loop review