The podcast highlights the transformative impact of the transformer model in AI: a shift away from recurrent networks that improved both text understanding and how models learn.
A detailed explanation of the transformer architecture is provided, emphasizing the efficiency, simplicity, and scalability that let it power large language models and sequential reasoning tasks.
The importance of the attention mechanism in the transformer architecture is discussed, showing how it reduces to parallel matrix multiplications and how that improves model performance.
Deep dives
Introduction to the Podcast Series and Format
The episode introduces Anton Teaches Packy AI, a series in which Anton explains AI research papers to the host. They discuss the inspiration behind the series, which aims to educate those interested in AI about the latest developments in the field. Anton shares his background in machine learning and robotics, highlighting how AI research has evolved over the years.
Key Innovation in AI Models: Transition from Recurrent Networks to Transformer Architecture
The pivotal shift from recurrent networks to the transformer architecture is highlighted. Before the transformer, recurrent networks dominated text and language modeling. The transformer's ability to parallelize computation and capture long-range dependencies revolutionized text understanding, enabling efficient processing of contextual information and significantly improving model learning.
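To make that contrast concrete, here is a minimal NumPy sketch (not from the episode; all names and dimensions are illustrative) of the two styles of computation: a recurrent loop that must process tokens one after another, versus attention, which connects every position to every other position in a single set of matrix operations.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d = 6, 4                      # toy sequence of 6 tokens, 4-dim states
x = rng.normal(size=(seq_len, d))      # token representations

# Recurrent-style processing: each step depends on the previous hidden state,
# so the loop cannot be parallelized across positions.
W_h, W_x = rng.normal(size=(d, d)), rng.normal(size=(d, d))
h = np.zeros(d)
for t in range(seq_len):
    h = np.tanh(h @ W_h + x[t] @ W_x)  # information from x[0] must survive every step

# Attention-style processing: every position attends to every other position
# in one batch of matrix multiplications, so all pairs (including long-range
# ones) are connected directly and computed in parallel.
scores = x @ x.T / np.sqrt(d)                                      # (seq_len, seq_len)
weights = np.exp(scores) / np.exp(scores).sum(-1, keepdims=True)   # softmax per row
context = weights @ x                                              # each row mixes the whole sequence

print(h.shape, context.shape)          # (4,) vs (6, 4)
```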
Model Architecture: Components and Advancements in Transformers
A detailed breakdown of the transformer model architecture is provided: the encoder-decoder structure, input embeddings, positional encoding, the multi-head attention mechanism, and feed-forward layers. The transformer's simplicity and scalability help explain its dominance among large language models, and its ability to predict the next token is what lets it handle sequential reasoning tasks.
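As one concrete piece of that architecture, the sinusoidal positional encoding defined in "Attention Is All You Need" can be sketched in a few lines of NumPy; the sequence length and embedding size below are arbitrary toy values, not anything discussed in the episode.

```python
import numpy as np

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal positional encoding from "Attention Is All You Need".

    PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
    PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
    """
    positions = np.arange(seq_len)[:, None]                   # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]                  # (1, d_model/2)
    angles = positions / np.power(10000.0, dims / d_model)    # (seq_len, d_model/2)

    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even indices
    pe[:, 1::2] = np.cos(angles)   # odd indices
    return pe

# Token embeddings carry "what" each token is; the positional encoding added
# to them carries "where" it is, since attention itself is order-agnostic.
emb = np.random.default_rng(0).normal(size=(10, 16))  # hypothetical embeddings
model_input = emb + positional_encoding(10, 16)
print(model_input.shape)  # (10, 16)
```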
Attention Mechanism and Training Process of Transformer Architecture
The podcast discusses how the attention mechanism in the transformer works and why it matters, highlighting that it reduces to matrix multiplications that GPUs execute efficiently in parallel. The training process updates model weights based on the difference between predicted and desired outputs, with humans selecting the data and the architecture. Because transformers can absorb far more data than previous models, their performance improves accordingly.
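The core operation here is the paper's scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch (with toy shapes, not the episode's code) shows that it is nothing but matrix multiplications, which is why it maps so well onto GPUs.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    x = x - x.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, as in the paper.

    Everything here is plain matrix multiplication, exactly the kind of work
    GPUs execute in parallel very efficiently.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # similarity of every query with every key
    weights = softmax(scores)         # how much each position attends to each other one
    return weights @ V                # weighted mix of the value vectors

rng = np.random.default_rng(0)
seq_len, d_k = 5, 8
Q = rng.normal(size=(seq_len, d_k))   # hypothetical query/key/value projections
K = rng.normal(size=(seq_len, d_k))
V = rng.normal(size=(seq_len, d_k))
print(scaled_dot_product_attention(Q, K, V).shape)  # (5, 8)
```

In the full architecture, Q, K, and V come from learned linear projections of the same input, and several such attention "heads" run in parallel before being concatenated, which is the multi-head attention mentioned above.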
Significance of Explain Paper Tool and AI Training
The podcast mentions the Explain Paper tool, which turns complex research papers into understandable explanations for researchers and readers. It also covers how AI models are trained: humans curate the data and monitor model performance while GPUs handle the enormous computations. The conversation touches on data saturation as a limit and on the importance of fine-tuning models for specific tasks.
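The "predicted versus desired output" loop described above can be sketched with a toy linear model and plain gradient descent. This is an illustrative simplification, not how a production transformer is trained or fine-tuned, but the update pattern is the same: predict, measure the error, nudge the weights.

```python
import numpy as np

# Toy training loop: compare the model's prediction with the desired output,
# compute a loss, and adjust the weights to reduce it. Real training (and
# fine-tuning) of a transformer follows the same pattern, just with far more
# parameters and GPU-sized batches of curated data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                   # curated training inputs (toy data)
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=100)    # desired outputs

w = np.zeros(3)                          # model weights to be learned
lr = 0.1                                 # learning rate
for step in range(200):
    pred = X @ w                         # predicted output
    error = pred - y                     # predicted vs. desired
    loss = np.mean(error ** 2)           # how wrong the model currently is
    grad = 2 * X.T @ error / len(y)      # gradient of the loss w.r.t. the weights
    w -= lr * grad                       # weight update
print(np.round(w, 2), round(float(loss), 4))   # w should approach true_w
```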
Anton Teaches Packy AI is Not Boring's attempt at making AI more accessible to our audience. It's become increasingly obvious that we're in the Golden Age of AI, so we think it's important to demystify what's going on and how it all actually works.
Anton Troynikov is the founder of Chroma and a former Meta Reality Labs research engineer and roboticist.
Packy McCormick is the author of the popular tech and business strategy newsletter, Not Boring.
Anton Teaches Packy AI is exactly what it sounds like -- in each video, Anton breaks down AI to a level that Packy (your average above average smart person) can understand. In Episode 1, Anton and Packy discuss the groundbreaking "Attention is All You Need" research paper, which kicked off the entire Transformer generative AI wave.