

Episode 20: Hattie Zhou, Mila, on supermasks, iterative learning, and fortuitous forgetting
Oct 14, 2022
Chapters
Introduction
00:00 • 2min
Is the Model Doing Some Kind of Implicit Curriculum Learning While It's Being Trained?
02:09 • 2min
Is There a Trick for Generalization?
04:13 • 3min
How to Get an Extremely Adversarial Initialization
06:45 • 2min
The Lottery Ticket Hypothesis
08:38 • 2min
How to Train a Sparse Network?
10:52 • 2min
The Supermask
13:05 • 5min
Is There a Way to Change Architectures?
18:01 • 4min
How to Find Sub-Networks That Are the Right Shape of the Solution?
21:44 • 4min
Is There a Sparse Architecture in Computer Vision?
25:34 • 2min
What's the Biggest Takeaway From the Lottery Ticket Hypothesis Paper?
27:47 • 2min
Is There a Better Way to Train Supermasks?
29:55 • 2min
Is There a Way to Control the Behavior of Pre-Trained Models?
31:59 • 3min
What Is a Not-Correlational?
34:56 • 3min
The Zero Values Are Relevant Still. Is That Really a Good Idea?
37:28 • 5min
The Story of Coherent Gradients
42:14 • 2min
Tendexure, I Love That!
43:48 • 3min
Increasing Compositionality Through Iterative Learning
46:55 • 5min
Is There a Way to Improve Model Performance?
51:29 • 3min
Is It a Dropout Intuition?
54:38 • 2min
The Later Layers Are Learning of the Tile of All Features
56:10 • 2min
The Fortuitous Forgetting Paper
57:42 • 5min
Is Knowledge Evolution Really Useful in Transfer Learning?
01:02:18 • 5min
Unsupervised Environment Design
01:07:20 • 3min
Are You Using Your Model to Identify Desirable Versus Unwanted Information?
01:09:59 • 4min
Is There a Generalization of Desirable Versus Unwanted?
01:13:45 • 3min
Getting Rid of Specifics for Spurious Features
01:16:28 • 2min
Is There a Difference Between a Cow and a Grass Cow?
01:18:04 • 2min
How Do You Interpret a Scene?
01:19:37 • 2min
Is There a Way to Unlearn?
01:21:55 • 5min
Is There a Difference Between Chris Olah and Grande?
01:26:57 • 3min
Is There a Limit to In-Context Learning?
01:30:00 • 3min
Is Your Model Not Reasoning?
01:33:00 • 3min
Is There a Scalable Compositionality?
01:36:28 • 5min
Compositional and Trans-Horror Red Scale Part 2
01:41:42 • 2min
Is There a Culture That Makes You More Effective?
01:44:11 • 3min