Building the Next Generation of Conversational AI

Mar 14, 2025

Ankit Kumar, co-founder and CTO of Sesame AI, discusses the technical hurdles of real-time speech generation and the balance between personality and efficiency in AI interactions. The conversation covers the impact of open-sourcing Sesame's speech model and the significance of full-duplex conversation modeling. Kumar also explores the evolution of natural language as a user interface and what it means for human-computer interaction and user experience.

ANECDOTE

Surprise at Reception

Ankit Kumar was surprised by the positive reception of Sesame's research preview.
He knew it had potential but underestimated its impact, since continuous development and internal testing had made it familiar to him.

INSIGHT

Qualitative Evaluation Challenge

Evaluating qualitative user experiences in machine learning is challenging, especially when aiming for natural human-computer interaction.
While quantitative metrics help, qualitative feedback loops are crucial for iterative improvement.

ADVICE

Prioritization is Key

Startups, especially those with limited resources, must decide which aspects of the product matter most.
Sesame focused on natural voice interaction, trading off advanced reasoning for a more human-like experience.
