
Hattie Zhou
PhD student at Université de Montréal and Mila, and part-time researcher at Google Brain. Her research focuses on teaching algorithmic reasoning to large language models.
Top 3 podcasts with Hattie Zhou
Ranked by the Snipd community

50 snips
Oct 14, 2022 • 1h 47min
Episode 20: Hattie Zhou, Mila, on supermasks, iterative learning, and fortuitous forgetting
Hattie Zhou is a Ph.D. student at Mila working with Hugo Larochelle and Aaron Courville. Her research focuses on understanding how and why neural networks work, starting with deconstructing why lottery tickets work and most recently exploring how forgetting may be fundamental to learning. Prior to Mila, she was a data scientist at Uber and did research with Uber AI Labs. In this episode, we chat about supermasks and sparsity, coherent gradients, iterative learning, fortuitous forgetting, and much more.

43 snips
Feb 16, 2023 • 1h 43min
Hattie Zhou: Lottery Tickets and Algorithmic Reasoning in LLMs
In episode 60 of The Gradient Podcast, Daniel Bashir speaks to Hattie Zhou. Hattie is a PhD student at the Université de Montréal and Mila. Her research focuses on understanding how and why neural networks work, based on the belief that the performance of modern neural networks exceeds our understanding and that building more capable and trustworthy models requires bridging this gap. Prior to Mila, she spent time as a data scientist at Uber and did research with Uber AI Labs.
Outline:
* (00:00) Intro
* (01:55) Hattie’s Origin Story, Uber AI Labs, empirical theory and other sorts of research
* (10:00) Intro to the Lottery Ticket Hypothesis & Deconstructing Lottery Tickets
* (14:30) Lottery tickets as lucky initialization
* (17:00) Types of masking and the “masking is training” claim
* (24:00) Type-0 masks and weight evolution over long training trajectories
* (27:00) Can you identify good masks or training trajectories a priori?
* (29:00) The role of signs in neural net initialization
* (35:27) The Supermask
* (41:00) Masks to probe pretrained models and model steerability
* (47:40) Fortuitous Forgetting in Connectionist Networks
* (54:00) Relationships to other work (double descent, grokking, etc.)
* (1:01:00) The iterative training process in fortuitous forgetting, scale and value of exploring alternatives
* (1:03:35) In-Context Learning and Teaching Algorithmic Reasoning
* (1:09:00) Learning + algorithmic reasoning, prompting strategy
* (1:13:50) What’s happening with in-context learning?
* (1:14:00) Induction heads
* (1:17:00) ICL and gradient descent
* (1:22:00) Algorithmic prompting vs discovery
* (1:24:45) Future directions for algorithmic prompting
* (1:26:30) Interesting work from NeurIPS 2022
* (1:28:20) Hattie’s perspective on scientific questions people pay attention to, underrated problems
* (1:34:30) Hattie’s perspective on ML publishing culture
* (1:42:12) Outro
Links:
* Hattie’s homepage and Twitter
* Papers
* Deconstructing Lottery Tickets: Zeros, signs, and the Supermask
* Fortuitous Forgetting in Connectionist Networks
* Teaching Algorithmic Reasoning via In-context Learning

Dec 20, 2022 • 21min
#91 - HATTIE ZHOU - Teaching Algorithmic Reasoning via In-context Learning #NeurIPS
Hattie Zhou, a PhD student at Université de Montréal and Mila, discusses her work at Google Brain on teaching algorithmic reasoning to large language models. She outlines the four stages of this task, including how to combine algorithms and how to use them as tools. Hattie also shares strategies for improving the reasoning capabilities of these models, the computational limits they face, and their potential applications in mathematical conjecturing.