Founding Engineer, Voice AI + Mobile

ITCO Solutions, Inc.

$120K — $180K *
Consumer Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • Experience building production systems with conversational speech models or voice-agent infrastructure.
  • Proven mobile development skills, ideally with React Native.
  • Expertise in addressing engineering challenges related to voice, such as latency and transcription accuracy.
  • Experience shipping independent projects quickly and without extensive specifications.
  • Desire for ownership and influence in a founding engineering role.

Responsibilities

  • Own the end-to-end conversational speech stack for Kosha.
  • Build with modern speech and conversational AI tools and make strategic tool decisions.
  • Transform challenging audio inputs from real-world environments into structured data.
  • Interact directly with early customers to tailor the product based on real feedback.
  • Shape the engineering architecture, tooling, and culture from the ground up.

Benefits

  • Flexible work environment with in-person or remote options.
  • Opportunity to take part in the foundational growth of an innovative startup.
  • Engagement with cutting-edge voice technology in a real-world application.
  • Direct influence over product direction and engineering decisions.

Full Job Description

Kosha is building the voice-native operating system for the physical world of commerce. We believe that over the next decade, voice will become the primary user interface. Today's advanced voice-enabled workflows are designed for knowledge workers, people who sit at desks with access to their laptops, not field workers. We see a massive whitespace opportunity for those in the field: people who are constantly on the go, having in-person conversations, and for whom talking is already the natural way they work.
Our first wedge is Client field sales. Every day, hundreds of thousands of field reps walk into stores, have conversations, and make judgment calls, and almost none of that intelligence is ever captured. The desktop CRM was never built for someone standing in an aisle. We think the interface for that person is not a screen - it's their voice. We capture in-person conversations (today with debriefs, tomorrow with ambient voice capture, and in the future with wearables) and turn them into structured CRM data and market intelligence, building, visit by visit, a proprietary dataset of what is actually happening on shelves that incumbents like Nielsen and Circana cannot see. The result is a real-time knowledge graph of what's happening on the ground, an entirely new dataset for brands, distributors, and investors alike.
We are a pre-seed team that includes the CEO (ex H.I.G. Capital, J.P. Morgan, Harvard Business School) and CTO (ex Field Sales Team at Salesforce, Optimism / OP Labs, U.C. Berkeley Computer Science).
We are looking for our founding engineer to own the layer that makes all of this work.
What you'll do:
  • Own Kosha's conversational speech stack end to end, including capture, transcription, diarization, structuring, and the latency and accuracy tradeoffs that make voice feel effortless
  • Build with and around modern speech and conversational-AI tooling such as Deepgram, ElevenLabs, Sesame, and Vapi, and decide what to buy, what to wrap, and what to build ourselves
  • Turn messy real-world audio, including noisy stores, accents, jargon, and interruptions, into clean, structured intelligence, especially on mobile applications
  • Work directly with early pilot customers and ship based on what you hear
  • Shape architecture, tooling, and engineering culture from day one
You might be a fit if you:
  • Have built production systems with conversational speech models or voice-agent infrastructure such as Deepgram, ElevenLabs, Sesame, Vapi, Whisper, Retell, LiveKit, or similar
  • Have mobile development skills, with exposure to React Native
  • Understand the real engineering problems of voice, including latency, streaming, turn-taking, and transcription accuracy in hostile audio environments
  • Have shipped independently and can move fast without a spec
  • Want the upside, ownership, and defining influence of a true founding role
The best voice products feel like magic and are brutally hard underneath. If that gap is the kind of problem you want to live in, let's talk. This will be a fun ride - we will learn a lot.
This role is in-person in NYC or remote.