

36 - Attention Is All You Need, with Ashish Vaswani and Jakob Uszkoreit
Oct 23, 2017
Ashish Vaswani and Jakob Uszkoreit, co-authors of the "Attention Is All You Need" paper, discuss the motivation for replacing RNNs and CNNs with self-attention in the Transformer model. They cover the positional encoding mechanism, multi-headed attention, reusing the Transformer encoder in other models, and what self-attention actually learns, highlighting that lower layers tend to capture n-gram-like patterns while higher layers pick up coreference.
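As a rough illustration of the mechanisms discussed in the episode (a minimal sketch, not the authors' implementation), the code below computes the paper's sinusoidal positional encoding and single-head scaled dot-product self-attention in NumPy; the toy dimensions, random inputs, and weight matrices are placeholders.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding from the paper:
    PE(pos, 2i)   = sin(pos / 10000^(2i/d_model))
    PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model))
    """
    pos = np.arange(seq_len)[:, None]               # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]            # (1, d_model/2)
    angles = pos / np.power(10000, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                    # even dimensions
    pe[:, 1::2] = np.cos(angles)                    # odd dimensions
    return pe

def self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention:
    Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                 # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v

# Toy example: 6 tokens, model width 8 (placeholder sizes).
seq_len, d_model = 6, 8
rng = np.random.default_rng(0)
x = rng.normal(size=(seq_len, d_model)) + positional_encoding(seq_len, d_model)
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = self_attention(x, Wq, Wk, Wv)
print(out.shape)  # (6, 8)
```

Multi-headed attention, also discussed in the episode, runs several such attention functions in parallel on lower-dimensional projections of the input and concatenates the results.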
Chapters
Introduction
00:00 • 6min
Self-Attention Mechanism for Parallelization and Dependency Connections
05:45 • 18min
How the Model Works and the Importance of Sinusoids in Embeddings
23:15 • 2min
Hypotheses, Substitutions, and Dependencies in the Learning Process
25:08 • 2min
Exploring a New Encoder for Various Tasks
27:20 • 3min
Using Attention Mechanisms in Natural Language Processing
30:50 • 10min