Luke Zettlemoyer, a professor at the University of Washington and a research manager at Meta, discusses multimodal generative AI, visual grounding and embodiment in text-based models, the advantages of discrete tokenization in image generation, self-alignment with instruction backtranslation, the generalizability of language models, model performance and evaluation, the importance of open source and open science, and the future direction of multimodal AI.
Quick takeaways
Multimodal generative AI enables models to process multiple modalities, and studying their behavior is crucial to understanding their capabilities.
The shift to large language models demands substantial resources, and the field has come to resemble complex systems science, with the emergent behavior of models still not fully understood.
Deep dives
Luke's background and interest in models
Luke Zettlemoyer is a professor at the University of Washington and a research manager at Meta. He has long been fascinated by the limits of what models can do and is interested in studying their behavior and making them more usable.
The impact of large language models
The popularity of large language models has changed the resources required for research: training these models now demands large teams and supercomputing efforts. The field has shifted from being primarily engineering-driven to resembling complex systems science. While we understand how the individual components of these models work, their emergent behavior is still not fully understood.
Multimodal approach and the importance of data
Luke's research focuses on multimodal models and the effect of data on training and model behavior. Models can now process multiple modalities, such as text and images. Scaling up data, including multilingual and multimodal data, has become central to studying these models and understanding their capabilities.
The significance of open source and open science
Luke emphasizes the importance of open source and open science to the field. Sharing ideas and research publicly fosters collaboration, scrutiny, and progress; it allows more people to participate, democratizes access to models, and advances the field.
Episode notes
Today we’re joined by Luke Zettlemoyer, professor at the University of Washington and a research manager at Meta. In our conversation with Luke, we cover multimodal generative AI, the effect of data on models, and the significance of open source and open science. We explore the grounding problem, the need for visual grounding and embodiment in text-based models, the advantages of discrete tokenization in image generation, and his paper Scaling Laws for Generative Mixed-Modal Language Models, which focuses on training LLMs on multiple modalities simultaneously. We also cover his papers Self-Alignment with Instruction Backtranslation and LIMA: Less Is More for Alignment.
The complete show notes for this episode can be found at twimlai.com/go/650.