The Thesis Review

Latest episodes

Oct 28, 2024 • 46min

[48] Tianqi Chen - Scalable and Intelligent Learning Systems

Tianqi Chen, an Assistant Professor at Carnegie Mellon University and Chief Technologist at OctoML, shares insights from his impactful career in machine learning. He discusses his groundbreaking work on XGBoost, the gradient boosting library that transformed data science competitions. The conversation delves into the MXNet deep learning framework and the TVM compiler stack, highlighting their role in modern generative AI. Chen also reflects on the synergy between machine learning and systems research, emphasizing the importance of scalability and efficiency in today's complex models.
Oct 15, 2024 • 1h 17min

[47] Niloofar Mireshghallah - Auditing and Mitigating Safety Risks in Large Language Models

Niloofar Mireshghallah is a postdoctoral scholar at the University of Washington. Her research focuses on privacy, natural language processing, and the societal implications of machine learning. Niloofar completed her PhD in 2023 at UC San Diego, where she was advised by Taylor Berg-Kirkpatrick. Her PhD thesis is titled "Auditing and Mitigating Safety Risks in Large Language Models." We discuss her journey into research and her work on privacy and LLMs, including how privacy is defined, common attacks and mitigations, differential privacy, and the balance between memorization and generalization. - Episode notes: www.wellecks.com/thesisreview/episode47.html - Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter - Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview
Aug 12, 2023 • 60min

[46] Yulia Tsvetkov - Linguistic Knowledge in Data-Driven NLP

Yulia Tsvetkov is a Professor in the Allen School of Computer Science & Engineering at the University of Washington. Her research focuses on multilingual NLP, NLP for social good, and language generation. Yulia's PhD thesis is titled "Linguistic Knowledge in Data-Driven Natural Language Processing", which she completed in 2016 at CMU. We discuss getting started in research, then move to Yulia's work in the thesis that combines ideas from linguistics and natural language processing. We discuss low-resource and multilingual NLP, large language models, and great advice about research and beyond. - Episode notes: www.wellecks.com/thesisreview/episode46.html - Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter - Find out more info about the show at www.wellecks.com/thesisreview - Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview
Jul 25, 2023 • 60min

[45] Luke Zettlemoyer - Learning to Map Sentences to Logical Form

Luke Zettlemoyer is a Professor at the University of Washington and Research Scientist at Meta. His work spans machine learning and NLP, including foundational work in large-scale self-supervised pretraining of language models. Luke's PhD thesis is titled "Learning to Map Sentences to Logical Form", which he completed in 2009 at MIT. We talk about his PhD work, the path to the foundational ELMo paper, and various topics related to large language models. - Episode notes: www.wellecks.com/thesisreview/episode45.html - Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter - Find out more info about the show at www.wellecks.com/thesisreview - Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview
Aug 23, 2022 • 1h 6min

[44] Hady Elsahar - NLG from Structured Knowledge Bases (& Controlling LMs)

Hady Elsahar is a Research Scientist at Naver Labs Europe. His research focuses on neural language generation under constrained and controlled conditions. Hady's PhD was on interactions between natural language and structured knowledge bases for data-to-text generation and relation extraction & discovery, which he completed in 2019 at the Université de Lyon. We talk about his PhD work and how it led to his interests in multilingual and low-resource NLP, as well as controlled generation. We dive deeper into controlling language models, including his interesting work on distributional control and energy-based models. - Episode notes: www.wellecks.com/thesisreview/episode44.html - Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter - Find out more info about the show at www.wellecks.com/thesisreview - Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview
Jun 28, 2022 • 1h 6min

[43] Swarat Chaudhuri - Logics and Algorithms for Software Model Checking

Swarat Chaudhuri, an Associate Professor at the University of Texas at Austin, delves into the intersection of programming languages and machine learning. He discusses the evolution of formal verification and the integration of model checking within AI systems. The conversation highlights advancements in neurosymbolic programming and how it can make software more reliable. Swarat also provides insights on developing reusable modules and emphasizes the importance of practical contributions in research, especially in AI safety and real-world applications.
Apr 19, 2022 • 1h 18min

[42] Charles Sutton - Efficient Training Methods for Conditional Random Fields

Charles Sutton is a Research Scientist at Google Brain and an Associate Professor at the University of Edinburgh. His research focuses on deep learning for generating code and helping people write better programs. Charles' PhD thesis is titled "Efficient Training Methods for Conditional Random Fields", which he completed in 2008 at UMass Amherst. We start with his work in the thesis on structured models for text, and compare and contrast them with today's large language models. From there, we discuss machine learning for code & the future of language models in program synthesis. - Episode notes: https://cs.nyu.edu/~welleck/episode42.html - Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter - Find out more info about the show at https://cs.nyu.edu/~welleck/podcast.html - Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview
Mar 30, 2022 • 1h 19min

[41] Talia Ringer - Proof Repair

In this discussion, Talia Ringer, an Assistant Professor at the University of Illinois Urbana-Champaign, shares her insights on formal verification and proof repair. She explains how proof repair automates the maintenance of formal proofs as software evolves. Talia also discusses the intersection of machine learning with proof engineering and highlights her academic journey from mathematics to software engineering. The conversation dives into the complexities of program verification and the role of theory in problem-solving within software development.
Mar 9, 2022 • 47min

[40] Lisa Lee - Learning Embodied Agents with Scalably-Supervised RL

Lisa Lee is a Research Scientist at Google Brain. Her research focuses on building AI agents that can learn and adapt like humans and animals do. Lisa's PhD thesis is titled "Learning Embodied Agents with Scalably-Supervised Reinforcement Learning", which she completed in 2021 at Carnegie Mellon University. We talk about her work in the thesis on reinforcement learning, including exploration, learning with weak supervision, and embodied agents, and cover various topics related to trends in reinforcement learning. - Episode notes: https://cs.nyu.edu/~welleck/episode40.html - Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter - Find out more info about the show at https://cs.nyu.edu/~welleck/podcast.html - Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview
Feb 2, 2022 • 1h 7min

[39] Burr Settles - Curious Machines: Active Learning with Structured Instances

Burr Settles leads the research group at Duolingo, a language-learning website and mobile app whose mission is to make language education free and accessible to everyone. Burr's PhD thesis is titled "Curious Machines: Active Learning with Structured Instances", which he completed in 2008 at the University of Wisconsin-Madison. We talk about his work in the thesis on active learning, then chart the path to Burr's role at Duolingo. We discuss machine learning for education and language learning, including content, assessment, and the exciting possibilities opened by recent advancements. - Episode notes: https://cs.nyu.edu/~welleck/episode39.html - Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter - Find out more info about the show at https://cs.nyu.edu/~welleck/podcast.html - Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview
