Chapters
Introduction
00:00 • 4min
The Most Relevant Heads Are the Most Confident Heads
04:14 • 2min
The Importance of Heads
06:40 • 3min
How to Prune a Model?
09:36 • 2min
The BLEU Score Is a Model Trained on WMT and Ation
12:00 • 2min
The Importance of Attention in the Model
13:54 • 2min
Train Model Heads From Scratch?
15:32 • 2min
A Study on Token Representation in Transformers
17:08 • 5min
The Difference Between Mamelem and Naxtal Characters
22:05 • 3min
The Effects of Different Views on the Same Data
25:11 • 4min
Machine Translation - Is There a Difference Between the Two Stages?
29:39 • 2min
How to Analyze the Representations of Tokens in Space
31:28 • 4min
What Are the Implications of the Results?
35:03 • 2min