

Scaling Multi-Modal Generative AI with Luke Zettlemoyer - #650
Oct 9, 2023
In this discussion, Luke Zettlemoyer, a University of Washington professor and research manager at Meta, dives into the realm of multimodal generative AI. He highlights the transformative impact of integrating text and images, illustrating advancements like DALL-E 3. Zettlemoyer explains the significance of open science for AI development and the complex role data plays in improving model performance. Topics also include the role of self-alignment in training and the future of multimodal AI amid rising compute costs and the need for better evaluation methods.
AI Snips
LLMs as Complex Systems
- LLMs have shifted AI research from engineering to complex systems science.
- Researchers now study the emergent behaviors of LLMs empirically, much as natural scientists study phenomena they did not design.
Data Limitations and Multimodal Models
- Text-only LLMs will eventually run out of training data.
- Multimodal models offer a solution by incorporating diverse data like images and videos.
DALL-E 3 and Multimodal Learning
- DALL-E 3's image generation demonstrates a deeper understanding of spatial relationships and compositionality in text.
- This suggests that multimodal models learn information not present in text alone.