Audio Solutions Architect

Innodata Inc.

$150K - $230K
US-Anywhere (Remote in United States)
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 5-7 years of experience in speech/audio AI or a related field.
  • Expertise in TTS, ASR, and speech-to-speech system training.
  • Experience in solutions engineering or technical presales roles.
  • Strong writing and communication skills for public engagement.
  • Ability to define and articulate complex technical specifications for varied audiences.
  • Familiarity with audio specifications and quality metrics like LUFS, WER, and DER.
  • Experience producing technical content such as white papers and blogs.

Responsibilities

  • Collaborate with customers to identify model objectives and data requirements.
  • Define acoustic specifications, language coverage, and quality assurance targets.
  • Translate customer requirements into actionable execution plans with timelines and pricing.
  • Serve as a technical authority in presales discussions, explaining methodology choices.
  • Create reusable solution assets like scoping frameworks and templates.
  • Stay current on speech AI developments and articulate future data needs.
  • Produce and present go-to-market materials and represent Innodata at industry events.

Benefits

  • Remote position in a hybrid role that splits evenly between presales and thought leadership.
  • Opportunity for thought leadership in a cutting-edge field.
  • Engagement with the speech AI community at conferences and events.
  • Possibility to shape the future of audio AI data solutions.
Full Job Description
Scope of the Role:

Innodata builds the high-quality voice and audio datasets that power the world's leading speech AI - text-to-speech, speech recognition, and the new generation of speech-to-speech and conversational voice models. We're hiring an Audio Solutions Architect to be both the technical partner to our customers in presales and the external technical voice of our audio practice.

This is a hybrid role with two equally weighted halves. In presales, you sit with a frontier lab or enterprise team, understand what they're trying to train, and shape the data collection program that gets them there. In thought leadership, you keep us at the frontier of speech AI - producing go-to-market research and content, speaking at conferences, and establishing Innodata as the most technically credible audio data partner in the market. The two reinforce each other.

What You'll Own:
Presales & solutioning
  • Partner with customers in presales to understand their model objectives, current data gaps, and technical constraints.
  • Shape requirements: define acoustic specs, language/accent coverage, speaker demographics, emotional/paralinguistic range, transcript and metadata schema, and QA targets (WER/DER, LUFS, etc.).
  • Translate requirements into scoped execution plans - volumes, timelines, methodology, pricing inputs - in partnership with delivery.
  • Serve as the credible technical voice in the room: explain tradeoffs (studio vs. real-world vs. telephonic, scripted vs. spontaneous, single vs. multi-speaker) and defend methodology choices.
  • Build reusable solutioning assets: scoping frameworks, spec templates, reference architectures for common audio data use cases.
Thought leadership & GTM
  • Stay at the tip of the spear on speech-AI developments (TTS, ASR, speech-to-speech) and what data the next generation of models will need.
  • Produce go-to-market material: technical blog posts, white papers, benchmark reports, and reference content that demonstrates Innodata's depth.
  • Represent Innodata externally: speak at and work conferences (Interspeech, ICASSP, industry events), engage the speech-AI community, and build our public technical profile.
  • Feed market intelligence back into strategy - advise on emerging data categories and where to invest ahead of demand.

You'll Thrive in This Role If You Have:
  • Deep working knowledge of speech/audio AI: how TTS, ASR, and speech-to-speech systems are trained and evaluated, and what data they require.
  • Experience in a solutions engineering, solutions architect, technical presales, or applied/forward-deployed role - or a technical audio/speech background plus strong commercial instincts.
  • Demonstrated ability (and appetite) to produce public-facing technical content and represent a company externally - writing, speaking, or community engagement.
  • Ability to shape ambiguous requirements into precise specs and communicate them to both researchers and business stakeholders.
  • Strong presence and persuasion; comfortable being the technical authority in a sales conversation and on a conference stage.
  • Familiarity with audio technical specifications (sample rates, LUFS, formats), transcript/metadata schemas, and quality metrics (WER, DER).
  • A public body of work in speech/audio: talks, papers, blog posts, benchmarks.
  • Hands-on experience with speech datasets, annotation, or audio production.
  • Background working with or at a frontier AI lab or voice-AI product company.
  • Multilingual / localization exposure.

The expected salary range for this position is $150,000 - $230,000 USD per year, based on experience, skills, and qualifications.
