

759: Full Encoder-Decoder Transformers Fully Explained, with Kirill Eremenko
Feb 20, 2024
In this episode, Kirill Eremenko, founder of SuperDataScience, joins Jon Krohn to discuss full encoder-decoder transformers. They cover how cross-attention works, why masking matters during self-attention, and the collaboration dynamics in transformer research. The episode provides a detailed explanation of encoder-decoder transformers, language models, and the use of transformers in natural language processing.
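For listeners who want to connect the discussion of masked self-attention and cross-attention to something concrete, here is a minimal sketch in NumPy. It is not code from the episode; the function names, shapes, and the omission of learned projection matrices are illustrative assumptions.

```python
# Minimal sketch (not from the episode) of the two attention variants discussed:
# masked self-attention in the decoder and cross-attention over encoder outputs.
# Learned Q/K/V projection matrices are omitted for brevity.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values, mask=None):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = queries.shape[-1]
    scores = queries @ keys.swapaxes(-1, -2) / np.sqrt(d_k)
    if mask is not None:
        scores = np.where(mask, scores, -1e9)  # blocked positions get ~zero weight
    return softmax(scores) @ values

rng = np.random.default_rng(0)
d_model, src_len, tgt_len = 8, 5, 3
encoder_out = rng.normal(size=(src_len, d_model))  # encoder's final hidden states
decoder_in = rng.normal(size=(tgt_len, d_model))   # decoder-side token representations

# Masked self-attention: each target position attends only to itself and earlier positions.
causal_mask = np.tril(np.ones((tgt_len, tgt_len), dtype=bool))
self_attended = attention(decoder_in, decoder_in, decoder_in, mask=causal_mask)

# Cross-attention: queries come from the decoder, keys/values from the encoder output,
# so every target position can look at the whole source sentence.
cross_attended = attention(self_attended, encoder_out, encoder_out)
print(cross_attended.shape)  # (3, 8)
```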
Chapters
Intro
00:00 • 3min
Recording Location and Listener Engagement
03:29 • 7min
Recap and Discussion on Transformers from a Foundational Research Paper
10:35 • 2min
Collaboration Dynamics and Sequential Functioning in Transformers
12:51 • 19min
Exploration of Transformers, Language Models, and Outputting Pixels vs. Words
31:40 • 2min
Understanding Full Encoder-Decoder Transformers for Translation Tasks
33:57 • 16min
Exploring RAG and Fine-Tuning Embedding Models for Chatbots with HPE and Intel
49:41 • 3min
Understanding Transformer Models in Natural Language Processing
52:14 • 28min
Comparison of Encoder-only Models and Decoder-only Models in Transformers
01:20:28 • 15min
Discussion on Fine-Tuning Transformers and Encoder-Decoder Transformers
01:35:06 • 3min
Exploring the Growth of a Vibrant Learning Ecosystem and a Book Recommendation on Achieving Success through Giving
01:38:05 • 2min
Detailed Explanation of Encoder-Decoder Transformers
01:40:14 • 3min