
The Thesis Review

Latest episodes

Jul 16, 2021 • 1h 6min

[28] Karen Ullrich - A Coding Perspective on Deep Latent Variable Models

Karen Ullrich, a Research Scientist at FAIR, studies the intersection of information theory and machine learning. She discusses her PhD work, highlighting the minimum description length principle and its impact on neural network compression. The conversation delves into the ties between data compression and cognitive processes, and explores methods for addressing imaging challenges. Ullrich also shares insights on enhancing differentiability in image reconstruction and offers practical advice for new researchers navigating complex data landscapes.
Jul 2, 2021 • 56min

[27] Danqi Chen - Neural Reading Comprehension and Beyond

Danqi Chen is an assistant professor at Princeton University, co-leading the Princeton NLP Group. Her research focuses on fundamental methods for learning representations of language and knowledge, and practical systems including question answering, information extraction and conversational agents. Danqi’s PhD thesis is titled "Neural Reading Comprehension and Beyond", which she completed in 2018 at Stanford University. We discuss her work on parsing, reading comprehension and question answering. Throughout we discuss progress in NLP, fundamental challenges, and what the future holds. Episode notes: https://cs.nyu.edu/~welleck/episode27.html Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter, and find out more info about the show at https://cs.nyu.edu/~welleck/podcast.html Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview
May 29, 2021 • 1h 18min

[26] Kevin Ellis - Algorithms for Learning to Induce Programs

Kevin Ellis is an assistant professor at Cornell and currently a research scientist at Common Sense Machines. His research focuses on artificial intelligence, program synthesis, and neurosymbolic models. Kevin's PhD thesis is titled "Algorithms for Learning to Induce Programs", which he completed in 2020 at MIT. We discuss Kevin’s work at the intersection of machine learning and program induction, including inferring graphics programs from images and drawings, DreamCoder, and more. Episode notes: https://cs.nyu.edu/~welleck/episode26.html Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter, and find out more info about the show at https://cs.nyu.edu/~welleck/podcast.html Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview
May 14, 2021 • 1h 19min

[25] Tomas Mikolov - Statistical Language Models Based on Neural Networks

Tomas Mikolov is a Senior Researcher at the Czech Institute of Informatics, Robotics, and Cybernetics. His research has covered topics in natural language understanding and representation learning, including Word2Vec and complexity. Tomas's PhD thesis is titled "Statistical Language Models Based on Neural Networks", which he completed in 2012 at the Brno University of Technology. We discuss compression and recurrent language models, the backstory behind Word2Vec, and his recent work on complexity & automata. Episode notes: https://cs.nyu.edu/~welleck/episode25.html Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter, and find out more info about the show at https://cs.nyu.edu/~welleck/podcast.html Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview
Apr 30, 2021 • 1h 3min

[24] Martin Arjovsky - Out of Distribution Generalization in Machine Learning

Martin Arjovsky is a postdoctoral researcher at INRIA. His research focuses on generative modeling, generalization, and exploration in RL. Martin's PhD thesis is titled "Out of Distribution Generalization in Machine Learning", which he completed in 2019 at New York University. We discuss his work on the influential Wasserstein GAN early in his PhD, then discuss his thesis work on out-of-distribution generalization which focused on causal invariance and invariant risk minimization. Episode notes: https://cs.nyu.edu/~welleck/episode24.html Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter, and find out more info about the show at https://cs.nyu.edu/~welleck/podcast.html Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview
Apr 16, 2021 • 1h 7min

[23] Simon Du - Gradient Descent for Non-convex Problems in Modern Machine Learning

Simon Du, an Assistant Professor at the University of Washington, delves into the theoretical foundations of deep learning and gradient descent. He discusses the intricacies of addressing non-convex problems, revealing challenges and insights from his research. The conversation highlights the significance of the neural tangent kernel and its implications for optimization and generalization. Simon also shares practical tips for reading research papers, drawing connections between theory and practice, and navigating a successful research career.
Apr 2, 2021 • 1h 3min

[22] Graham Neubig - Unsupervised Learning of Lexical Information

Graham Neubig is an Associate Professor at Carnegie Mellon University. His research focuses on language and its role in human communication, with the goal of breaking down barriers in human-human or human-machine communication through the development of NLP technologies. Graham’s PhD thesis is titled "Unsupervised Learning of Lexical Information for Language Processing Systems", which he completed in 2012 at Kyoto University. We discuss his PhD work related to the fundamental processing units that NLP systems use to process text, including non-parametric Bayesian models, segmentation, and alignment problems, and discuss how his perspective on machine translation has evolved over time. Episode notes: http://cs.nyu.edu/~welleck/episode22.html Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter, and find out more info about the show at http://cs.nyu.edu/~welleck/podcast.html Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview
Mar 19, 2021 • 1h 8min

[21] Michela Paganini - Machine Learning Solutions for High Energy Physics

Michela Paganini is a Research Scientist at DeepMind. Her research focuses on investigating ways to compress and scale up neural networks. Michela's PhD thesis is titled "Machine Learning Solutions for High Energy Physics", which she completed in 2019 at Yale University. We discuss her PhD work on deep learning for high energy physics, including jet tagging and fast simulation for the ATLAS experiment at the Large Hadron Collider, and the intersection of machine learning and physics. Episode notes: https://cs.nyu.edu/~welleck/episode21.html Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter, and find out more info about the show at https://cs.nyu.edu/~welleck/podcast.html Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview
Mar 5, 2021 • 1h 25min

[20] Josef Urban - Deductive and Inductive Reasoning in Large Libraries of Formalized Mathematics

Josef Urban is a Principal Researcher at the Czech Institute of Informatics, Robotics, and Cybernetics. His research focuses on artificial intelligence for large-scale computer-assisted reasoning. Josef's PhD thesis is titled "Exploring and Combining Deductive and Inductive Reasoning in Large Libraries of Formalized Mathematics", which he completed in 2004 at Charles University in Prague. We discuss his PhD work on the Mizar Problems for Theorem Proving, machine learning for premise selection, and how it evolved into his recent research. Episode notes: https://cs.nyu.edu/~welleck/episode20.html Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter, and find out more info about the show at https://cs.nyu.edu/~welleck/podcast.html Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview
Feb 19, 2021 • 1h 20min

[19] Dumitru Erhan - Understanding Deep Architectures and the Effect of Unsupervised Pretraining

Dumitru Erhan, a Research Scientist at Google Brain, dives into the fascinating world of neural networks. He discusses his groundbreaking PhD work on deep architectures and unsupervised pretraining. The conversation touches on the evolution of deep learning, the significance of regularization hypotheses, and the philosophical nuances in AI task conceptualization. Dumitru shares insights into the transition from traditional computer vision to deep neural networks and highlights the importance of unexpected outcomes in enhancing research understanding.
