Advancements in Real-Time Dialogue with MOSHI

This chapter focuses on the development of MOSHI, a speech-based model for real-time, full-duplex dialogue, in which both sides can speak and listen at the same time. It traces the historical evolution of speech recognition technologies and the breakthroughs that make effective audio processing possible. The speakers also discuss the implications and applications of this technology, along with the training methodologies and varied datasets used.
