Chapters
Introduction
00:00 • 3min
The Future of B2B SaaS and AI/ML
03:01 • 3min
Is Attention All You Need?
05:32 • 3min
Is Parallelization a Good Idea?
08:24 • 2min
The Intersection of Crypto and AI?
09:58 • 2min
The Weirdest Furthest Out in a Little While
11:36 • 2min
Machine Learning Model Weights - What's Happening There?
13:30 • 2min
The Basic Algorithm of Machine Learning Models
15:07 • 2min
How to Do a Machine Translation Task?
16:49 • 3min
Are We Still in the World of Neural Networks?
19:41 • 2min
Neural Networks - The First AI Winter
22:09 • 3min
The Core Architecture in a Transformer
25:24 • 2min
The Position Encoding in a Recurrent Model
27:33 • 2min
Is Attention a Multi-Head Mechanism?
29:06 • 4min
Is There a Best Explanation for Why It Works?
32:49 • 2min
The Chinchilla Scaling Laws
34:45 • 3min
Is There a Need for New Architectures?
37:23 • 3min
What's Autoregressive?
40:17 • 2min
Is It Autoregressive?
42:01 • 3min
The Multi-Headed Attention Layer in the Encoder
45:22 • 3min
Is GPT-3 Parallelisable?
47:59 • 5min
InstructGPT Is Better at Attending to What You're Asking It to Do
52:56 • 3min
How Do You Train a Model?
56:24 • 2min
Train a Model
58:10 • 4min
GPUs for Machine Learning
01:01:41 • 2min
Is GPT-3 So Expensive to Train That It Cost $6 Million?
01:03:50 • 3min
Is There a Paper That You Should Read on Chinchilla?
01:06:23 • 2min