

Building the Next Generation of Conversational AI
Mar 14, 2025
Ankit Kumar, Co-founder and CTO of Sesame AI, dives into the cutting-edge world of conversational AI. He discusses the technical hurdles of real-time speech generation and the balance between personality and efficiency in AI interactions. The conversation highlights the impact of open-sourcing their speech model and the significance of full-duplex conversation modeling. Kumar also explores the evolution of natural language as a user interface and its implications for redefining human-computer interaction, offering insights into innovation and user experience.
AI Snips
Surprise at Reception
- Ankit Kumar was surprised by the positive reception of Sesame's research preview.
- He knew it had potential but underestimated its impact, having grown used to it through continuous development and internal testing.
Qualitative Evaluation Challenge
- Evaluating qualitative user experiences in machine learning is challenging, especially when aiming for natural human-computer interaction.
- While quantitative metrics help, qualitative feedback loops are crucial for iterative improvement.
Prioritization is Key
- Startups, especially those with limited resources, must prioritize the product aspects that matter most.
- Sesame focused on natural voice interaction, trading off advanced reasoning for a more human-like experience.