More Language, Less Labeling with Kate Saenko - #580

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Exploring the Convergence of Vision and Language in Multimodal Learning

This chapter explores the integration of vision and language in multimodal machine learning, starting from earlier advances such as audio-visual speech recognition. It covers key models such as DALL-E 2 and CLIP, and emphasizes the role of unlabeled data in improving performance, especially in zero-shot learning.
