No Priors: Artificial Intelligence | Technology | Startups cover image

State Space Models and Real-time Intelligence with Karan Goel and Albert Gu from Cartesia

No Priors: Artificial Intelligence | Technology | Startups

NOTE

Language Understanding is Key for Multimodal Models

Language understanding is crucial for developing accurate ASR and TTS systems. Achieving perfect TTS or speech-to-speech requires models with a deep understanding of language, as it is not an isolated component. Multimodal models are essential even for excelling in a single modality. By focusing on multimodality, advancements in ASR and TTS can be made, paving the way for incorporating other modalities into the models.

00:00
Transcript
Play full episode

Remember Everything You Learn from Podcasts

Save insights instantly, chat with episodes, and build lasting knowledge - all powered by AI.
App store bannerPlay store banner