Dialpad

AI Engineer

$145K-$172K
Information Technology
Less than 5 years of experience
Job Overview by Ladders

Qualifications

  • 3+ years of experience in Speech Synthesis (TTS) or Voice Design, including work with frameworks like NVIDIA NeMo and major TTS APIs.
  • Strong Python programming skills and proficiency with deep learning frameworks (e.g. PyTorch).
  • Degree in Computational Linguistics, Computer Science, or AI/ML with a strong foundation in phonetics and syntax.
  • Proven experience in prompt engineering for LLMs, including crafting system and few-shot prompts.
  • Experience building production-grade APIs in a cloud environment, with GCP preferred.
  • Knowledge of speech quality metrics and ability to conduct A/B testing for evaluation.

Responsibilities

  • Own the integration and optimization of multiple TTS vendor APIs and prototype new solutions.
  • Apply expertise in phonetics and sociolinguistics for TTS input formatting.
  • Craft context-specific utterances to enhance conversational trust and flow.
  • Manage LLM and TTS prompts to optimize voice agent personalities across industries.
  • Expose adjustable voice attributes to the product UI for customer personalization.
  • Collaborate with ASR and Audio AI engineers for seamless voice quality through the processing pipeline.

Benefits

  • Competitive salary and comprehensive benefits package.
  • Access to cutting-edge AI tools and a robust training program.
  • Vibrant and inclusive work environment promoting collaboration.
  • Culture repeatedly recognized as a Great Place to Work, where employees feel valued and empowered.

Full Job Description
Your Role
As an AI Engineer: Voice Designer, you'll own the back-end implementation and linguistic optimization of the Text-to-Speech (TTS) layer for our next-generation AI voice agents. You'll work squarely within our Speech Team, a high-impact R&D and engineering group focused on speech recognition, enhancement, and synthesis. You will bridge the gap between core speech science and product engineering, ensuring our voice agents sound human, context-aware, and trustworthy. You'll also help create the systems that manage voice personas, tone, and conversational fillers, eventually exposing these as tweakable parameters to our customer-facing UI.

This position reports to our Senior Manager, AI Speech, is based at our Kitchener hub, and operates on a hybrid schedule.

What You'll Do
  • TTS Backend Implementation: Own the integration and optimization of multiple TTS vendor APIs while leading research and prototyping for open-source or in-house TTS architectures.
  • Linguistic Optimization: Apply expertise in phonetics and sociolinguistics to ensure TTS input is formatted for maximum naturalness, including SSML orchestration and pronunciation handling.
  • Conversational Turn Design: Craft context-specific utterances to optimize turn handling and build caller trust during agentic "thought" processes.
  • Prompt & Persona Management: Design and manage LLM and TTS prompts and parameters to define and refine agent personalities across different industry verticals.
  • UI Parameter Exposure: Architect the logic to expose voice attributes (speed, pitch, tone, style) to the product UI, allowing customers to customize their agent's voice profile.
  • Cross-Functional R&D: Partner with ASR and Audio AI engineers to ensure end-to-end voice quality and minimize latency in the ASR → LLM → TTS pipeline.

Skills You'll Bring
  • Technical Foundation: Strong Python programming skills and experience with deep learning frameworks (e.g. PyTorch).
  • Speech Expertise: 3+ years of experience in Speech Synthesis (TTS) or Voice Design, including hands-on work with frameworks like NVIDIA NeMo, ESPnet, or Coqui, and hands-on experience with major TTS APIs such as ElevenLabs, Rime, and Cartesia.
  • Linguistic Background: Degree in Computational Linguistics, Computer Science, or AI/ML with a deep understanding of phonetics, prosody, and syntax.
  • Prompt Engineering: Proven experience crafting and evaluating LLM prompts (system, few-shot) and managing structured prompt templates.
  • Backend Engineering: Experience building production-grade APIs and integrating multi-vendor services in a cloud environment (GCP preferred).
  • Evaluation Mindset: Knowledge of speech quality metrics (MOS, intelligibility, latency) and the ability to design rigorous A/B tests for voice personas.


For exceptional talent based in Ontario, Canada, the target base salary range for this position is posted below. Our salary ranges are determined by role, level, and location. The range displayed on each job posting reflects the target range for new hire salaries for the position. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process. Please note that the compensation details listed in Ontario role postings reflect the base salary only and do not include bonus, equity, or benefits.

Ontario Pay Transparency Range

$145,000-$172,500 CAD

Why Join Dialpad
  • Work at the center of the AI transformation in business communications
  • Build and ship agentic AI products that are redefining how companies operate
  • Join a team where AI amplifies every employee's impact
  • Competitive salary, comprehensive benefits, and real opportunities for growth

We believe in investing in our people. Dialpad offers competitive benefits and perks, cutting-edge AI tools, and a robust training program that help you reach your full potential. We have designed our offices to be inclusive, offering a vibrant environment to cultivate collaboration and connection. Our exceptional culture, repeatedly recognized as a Great Place to Work, ensures that every employee feels valued and empowered to contribute to our collective success.

Don't meet every single requirement? If you're excited about this role and possess the fundamental traits, drive, and strong ambition we seek, but your experience doesn't meet every qualification, we encourage you to apply.

About Dialpad

Dialpad is a cloud-based communication platform that provides voice, video, and messaging services to businesses. The company was founded in 2011 and is headquartered in San Francisco, California. Dialpad's platform integrates with other business applications, such as Salesforce and G Suite, to provide a seamless communication experience for users.
Size: 1,000 employees
Founded: 2011
