Luke Zettlemoyer's work in machine learning spans NLP and self-supervised pretraining of language models.
Discussion of the importance of scalability in machine learning models, and of the relationship between compute resources, data quality, and algorithmic efficiency.
Encouragement for new researchers to pursue unique research paths, avoid excessive comparisons with others, and cultivate their own research trajectories.
Deep dives
Researcher Interview with Luke Zettlemoyer at the University of Washington and Meta on Building Large-Scale Language Models
Luke Zettlemoyer, a professor at the Allen School of Computer Science and Engineering at the University of Washington, discusses his groundbreaking work in machine learning and NLP, focusing on foundational work in large-scale language model pretraining. His PhD thesis on semantic parsing anchors the conversation, tracing the path from thesis work to cutting-edge research on models like ELMo. The discussion covers open-sourcing models, the future of scaling, and the differences between research in academia and industry.
Personal Journey of Luke Zettlemoyer: From Early Research to PhD
Zettlemoyer's early interest in research developed during his undergraduate years at North Carolina State, where he worked on AI-for-education projects. Moving to MIT for his PhD, he explored diverse areas like statistical relational learning before focusing on semantic parsing in NLP, leading to significant contributions to the field. An internal curiosity and a passion for understanding and pushing boundaries have been consistent elements of his research journey.
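To make the semantic parsing task concrete, the sketch below maps questions to lambda-calculus-style logical forms in the spirit of the Geoquery domain. The pattern table and predicate names (state, borders, capital) are illustrative assumptions, not details from the episode; real systems such as the CCG-based parsers in Zettlemoyer's thesis learn a lexicon and parse compositionally rather than matching hard-coded patterns.

```python
import re

# A minimal, illustrative sketch of semantic parsing: mapping a sentence
# to a lambda-calculus-style logical form. The patterns and predicates
# below are hypothetical stand-ins for a learned grammar.
PATTERNS = [
    (re.compile(r"what states border (\w+)"),
     lambda m: f"lambda x. state(x) & borders(x, {m.group(1)})"),
    (re.compile(r"what is the capital of (\w+)"),
     lambda m: f"lambda x. capital({m.group(1)}, x)"),
]

def parse(sentence: str) -> str | None:
    """Return a logical form for the sentence, or None if no pattern fits."""
    for pattern, build in PATTERNS:
        m = pattern.fullmatch(sentence.lower())
        if m:
            return build(m)
    return None

print(parse("What states border Texas"))
# -> lambda x. state(x) & borders(x, texas)
```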
Advancements Beyond PhD: Exploring Emergent Abilities in Large-Scale Models
The discussion expands to emergent abilities in large-scale models, emphasizing the need to understand and leverage these emergent behaviors. Zettlemoyer describes moving from PhD-driven projects to collaborative industry-academia endeavours such as the OPT project, which aims to replicate and release large language models for research purposes, fostering open-science principles within a computational research framework.
Scaling Models and Future Research Directions: Balancing Compute Resources and Data Quality
The conversation delves into the importance of scalability in machine learning models, citing advances like the Chinchilla paper and emphasizing the need for detailed empirical studies to understand scaling effects. As models grow in size, the relationship between compute resources, data quality, and algorithmic efficiency becomes crucial for future research.
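As a concrete anchor for the compute/data trade-off, here is a minimal sketch of the Chinchilla rule of thumb from Hoffmann et al. (2022): training cost is roughly C ≈ 6·N·D FLOPs for N parameters and D training tokens, and the compute-optimal data budget is roughly D ≈ 20·N. The constants are approximations from that paper, not figures discussed in the episode.

```python
import math

def compute_optimal(c_flops: float) -> tuple[float, float]:
    """Roughly compute-optimal (parameters N, tokens D) for budget C.

    Assumes the Chinchilla approximations C ~ 6*N*D and D ~ 20*N,
    which combine to C ~ 120*N^2.
    """
    n = math.sqrt(c_flops / 120)
    return n, 20 * n

# Example: a ~5.8e23 FLOP budget recovers roughly Chinchilla's
# 70B-parameter / 1.4T-token training configuration.
n, d = compute_optimal(5.76e23)
print(f"~{n / 1e9:.0f}B parameters, ~{d / 1e12:.1f}T tokens")
```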
Objective Functions and Personal Advice for Researchers: Emphasizing Curiosity and Individual Pathways
Zettlemoyer reflects on the multifaceted objective functions that drove his research during his PhD, highlighting scientific curiosity and a deep love of the learning experience. He advises new researchers to forge unique research paths, avoid excessive comparisons with others, and focus on cultivating their own research trajectories, ultimately promoting a diverse and impactful research landscape.
Balancing Academia and Industry Perspectives: Navigating Different Research Environments
The conversation navigates the subtle differences between academic and industry perspectives, acknowledging the varying resources and opportunities each setting provides. Zettlemoyer encourages young researchers to explore research independence, embrace curiosity-driven projects, and carve out unique research identities amid the changing landscape of computational research.
Luke Zettlemoyer is a Professor at the University of Washington and Research Scientist at Meta. His work spans machine learning and NLP, including foundational work in large-scale self-supervised pretraining of language models.
Luke's PhD thesis is titled "Learning to Map Sentences to Logical Form", which he completed in 2009 at MIT. We talk about his PhD work, the path to the foundational ELMo paper, and various topics related to large language models.
- Episode notes: www.wellecks.com/thesisreview/episode45.html
- Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter
- Find out more info about the show at www.wellecks.com/thesisreview
- Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview