[39] Burr Settles - Curious Machines: Active Learning with Structured Instances
Feb 2, 2022
Burr Settles, who leads machine learning research at Duolingo, discusses his fascinating journey from studying art and math to focusing on language education through AI. He dives into how active learning methodologies can enhance language acquisition, overcoming motivation barriers with personalized educational experiences. Burr shares innovative strategies in translation, the integration of generative AI like GPT-3, and the evolution of success metrics from academia to meaningful real-world applications in tech.
Motivation is essential for language learning, and Duolingo enhances it through gamification, making the process engaging and enjoyable.
Burr Settles' work on active learning includes information density querying, which prioritizes data points that are both informative to the model and representative of the unlabeled data, so fewer labels are needed to reach good performance.
The application of large language models at Duolingo shows potential for personalized learning, though human oversight remains crucial for quality.
Deep dives
The Challenges of Learning a Language
Staying motivated is one of the most significant challenges in learning a new language, with parallels to developing habits in other areas, such as fitness. The podcast emphasizes that motivation is crucial because language learning is a long-term process where knowledge builds cumulatively. Duolingo addresses this challenge by gamifying the learning experience, thus providing an engaging environment that encourages users to continue their studies. Additionally, understanding the learner's background, including prior exposure to the language, can lead to a more tailored approach that sustains motivation.
The Evolution of Active Learning
Active learning is a machine learning setting in which the model itself chooses which examples to have labeled by an information source, typically a human annotator; it is the subject of Burr Settles' PhD thesis. It differs from traditional passive learning, where the training labels are fixed in advance, because the model identifies the unlabeled data points it expects to be most informative and requests labels for those. Information density querying is one key method: it weights each candidate's informativeness (for example, the model's uncertainty about it) by how similar the candidate is, on average, to the rest of the unlabeled data, so that queries go to representative examples rather than to outliers. The goal of active learning is to minimize the annotation effort needed to reach a given level of model performance.
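As a rough illustration of information density querying, the sketch below scores each unlabeled instance by its uncertainty multiplied by its average similarity to the pool, raised to a power beta. It is a minimal sketch under stated assumptions, not the thesis implementation: entropy as the uncertainty measure, cosine similarity as the similarity function, and every name in it (information_density_query, toy_model, pool) are hypothetical choices made for this example.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted label distribution (assumed uncertainty measure)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def cosine(u, v):
    """Cosine similarity between two feature vectors (assumed similarity function)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def information_density_query(pool, predict_proba, beta=1.0):
    """Return the index of the pool instance that maximizes
    uncertainty * (average similarity to the pool) ** beta."""
    best_index, best_score = None, float("-inf")
    for i, x in enumerate(pool):
        # Average similarity of x to every pool instance, clamped at zero
        # so that a fractional beta never sees a negative base.
        density = max(0.0, sum(cosine(x, other) for other in pool) / len(pool))
        score = entropy(predict_proba(x)) * density ** beta
        if score > best_score:
            best_index, best_score = i, score
    return best_index


# Toy pool: three similar instances and one outlier (the last one).
pool = [(1.0, 0.0), (0.9, 0.1), (0.8, 0.2), (0.0, 1.0)]

# Hypothetical model output: the outlier is the single most uncertain instance.
toy_predictions = {
    (1.0, 0.0): (0.9, 0.1),
    (0.9, 0.1): (0.6, 0.4),
    (0.8, 0.2): (0.7, 0.3),
    (0.0, 1.0): (0.5, 0.5),
}


def toy_model(x):
    return toy_predictions[x]


density_choice = information_density_query(pool, toy_model, beta=1.0)
uncertainty_only_choice = information_density_query(pool, toy_model, beta=0.0)
print(density_choice, uncertainty_only_choice)
```

With beta set to 0 the density term drops out and the query reduces to plain uncertainty sampling, which picks the outlier at index 3. With beta set to 1 the outlier is penalized for being unlike the rest of the pool, and the query moves to index 1, an uncertain instance inside the dense cluster.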
From Academia to Duolingo: Bridging Research and Application
Burr Settles transitioned from academic research on active learning to applying these methods at Duolingo, focusing on personalizing language learning. One notable development was the creation of a computer adaptive placement test, which tailors the language-learning curriculum based on a user's proficiency. This system enhances user engagement by preventing learners from starting at an overly basic level, ensuring that the content remains relevant and appropriately challenging. This innovative adaptation reflects a merging of academic insights with real-world applications, enhancing learning outcomes.
The Role of Large Language Models in Education
Recent advancements in large language models, such as GPT-3, have opened up new possibilities for language education, including the generation of personalized content. Duolingo has experimented with these models for creating reading passages and grading test items, gaining insights into their utility for enhancing learner experience. However, the need for human oversight remains critical to ensure the quality and fairness of generated materials. By combining the strengths of machine-generated content with human expertise, language learning can be augmented effectively.
Reflecting on Research and Real-World Impact
Settles highlights the importance of shifting focus from publication metrics in academia to the tangible impact of research in practical applications, particularly in language learning. During his career at Duolingo, he has discovered that real-world testing can significantly enhance learning experiences, thereby engaging a larger user base. His advice to researchers emphasizes the need to concentrate on solving real-world problems rather than merely developing novel methodologies. This perspective fosters a more impactful approach to research, ultimately benefiting learners and educators alike.
Burr Settles leads the research group at Duolingo, a language-learning website and mobile app whose mission is to make language education free and accessible to everyone.
Burr’s PhD thesis, "Curious Machines: Active Learning with Structured Instances", was completed in 2008 at the University of Wisconsin-Madison. We talk about his thesis work on active learning, then chart the path to Burr’s role at Duolingo. We discuss machine learning for education and language learning, including content, assessment, and the exciting possibilities opened by recent advancements.
- Episode notes: https://cs.nyu.edu/~welleck/episode39.html
- Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter
- Find out more info about the show at https://cs.nyu.edu/~welleck/podcast.html
- Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview