Generally Intelligent

Episode 20: Hattie Zhou, Mila, on supermasks, iterative learning, and fortuitous forgetting

Oct 14, 2022
Chapters
1
Introduction
00:00 • 2min
2
Is the Model Doing Some Kind of Implicit Curriculum Learning While It's Being Trained?
02:09 • 2min
3
Is There a Trick for Generalization?
04:13 • 3min
4
How to Get an Extremely Adversarial Initialization
06:45 • 2min
5
The Lottery Ticket Hypothesis
08:38 • 2min
6
How to Train a Sparse Network?
10:52 • 2min
7
The Supermask
13:05 • 5min
8
Is There a Way to Change Architectures?
18:01 • 4min
9
How to Find Sub-Networks That Are the Right Shape of the Solution?
21:44 • 4min
10
Is There a Sparse Architecture in Computer Vision?
25:34 • 2min
11
What's the Biggest Takeaway From the Lottery Ticket Hypothesis Paper?
27:47 • 2min
12
Is There a Better Way to Train Supermasks?
29:55 • 2min
13
Is There a Way to Control the Behavior of Pre-Trained Models?
31:59 • 3min
14
What Is a Not-Correlational?
34:56 • 3min
15
The Zero Values Are Relevant Still. Is That Really a Good Idea?
37:28 • 5min
16
The Story of Coherent Gradients
42:14 • 2min
17
Tendexure, I Love That!
43:48 • 3min
18
Increasing Compositionality Through Iterative Learning
46:55 • 5min
19
Is There a Way to Improve Model Performance?
51:29 • 3min
20
Is It a Dropout Intuition?
54:38 • 2min
21
The Later Layers Are Learning of the Tile of All Features
56:10 • 2min
22
The Fortuitous Forgetting Paper
57:42 • 5min
23
Is Knowledge Evolution Really Useful in Transfer Learning?
01:02:18 • 5min
24
Unsupervised Environment Design
01:07:20 • 3min
25
Are You Using Your Model to Identify Desirable Versus Unwanted Information?
01:09:59 • 4min
26
Is There a Generalization of Desirable Versus Unwanted?
01:13:45 • 3min
27
Getting Rid of Specifics for Spurious Features
01:16:28 • 2min
28
Is There a Difference Between a Cow and a Grass Cow?
01:18:04 • 2min
29
How Do You Interpret a Scene?
01:19:37 • 2min
30
Is There a Way to Unlearn?
01:21:55 • 5min
31
Is There a Difference Between Chris Olah and Grande?
01:26:57 • 3min
32
Is There a Limit to In-Context Learning?
01:30:00 • 3min
33
Is Your Model Not Reasoning?
01:33:00 • 3min
34
Is There a Scalable Compositionality?
01:36:28 • 5min
35
Compositional and Trans-Horror Red Scale Part 2
01:41:42 • 2min
36
Is There a Culture That Makes You More Effective?
01:44:11 • 3min