Gradient Dissent: Conversations on AI

Neural Network Pruning and Training with Jonathan Frankle at MosaicML

Apr 4, 2023
Chapters
1. Introduction (00:00 • 3min)
2. The Power of Weights in Neural Networks (02:35 • 3min)
3. The Lottery Ticket Hypothesis (05:36 • 2min)
4. The Sub-Network and the Dense Network (07:24 • 2min)
5. The Effects of Dropout on the Network (08:59 • 2min)
6. How Much Can You Prune? (11:00 • 2min)
7. The Unscientific Nature of Neural Networks (13:12 • 2min)
8. Transformers and Attention: A Simple Architecture (15:31 • 2min)
9. How to Speed Up Training With ResNet-50 (17:50 • 4min)
10. MosaicML: The Story Behind the Company (21:52 • 3min)
11. How to Train a Machine Learning Model (25:06 • 2min)
12. How to Engage With the Research Community (27:35 • 3min)
13. The Skeptics of AGI (30:48 • 2min)
14. The Importance of Feed-Forward Networks (32:25 • 2min)
15. The Future of ChatGPT (33:59 • 2min)
16. The Importance of Policy (35:44 • 3min)
17. The Future of Language and Vision Models (39:04 • 2min)
18. The Importance of Adapting to Changes in the Chip World (41:03 • 3min)
19. The Future of Diffusion Models (44:01 • 2min)
20. The Importance of Data in Mosaic (45:56 • 4min)
21. The Importance of Data in Large Language Model Training (49:41 • 2min)
22. The Future of Data Curation and Data Labeling (51:20 • 2min)
23. The Importance of Data Quality in Machine Learning (53:19 • 2min)
24. The Unexpected Bottlenecks in Modeling (55:43 • 7min)