759: Full Encoder-Decoder Transformers Fully Explained, with Kirill Eremenko
Feb 20, 2024
In this podcast, Kirill Eremenko, founder of SuperDataScience, discusses full encoder-decoder transformers with Jon Krohn. They cover how cross-attention works, the importance of masking during self-attention, and how encoders and decoders work together. The episode provides a detailed explanation of encoder-decoder transformers, language models, and the use of transformers in natural language processing.
Decoder-only transformers are pivotal in generative language models.
Cross-attention mechanism enhances translation accuracy by merging language contexts.
Encoder-only architectures excel at understanding natural language via semantic representations.
Masking in transformers ensures genuine learning during training and consistency in inference.
The start-of-sequence (SOS) token initializes text generation in the decoder.
In the full transformer architecture, the encoder's output feeds into every layer of the decoder via cross-attention.
Deep dives
Summary of Episode Main Ideas
The episode delves into the technical intricacies of the transformer module within large language models like GPT, starting from the decoder-only architecture used in generative language models and building up to the full encoder-decoder transformer architecture. It emphasizes the importance of context-rich vectors in understanding and generating text, and walks step by step through a translation example in which the cross-attention mechanism merges English and Spanish context-rich vectors to improve accuracy, underlining the significance of context and attention mechanisms in neural networks.
Computational Efficiency and Context Enrichment
The discussion underscores the importance of computational efficiency in transformer models: the encoder's context-rich vectors are computed once and then reused at every decoding step. The recap revisits how English and Spanish context are blended within the cross-attention mechanism to improve translation accuracy, and examines the efficiency gains from this one-time processing of context vectors and its impact on accurate language generation.
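As a rough illustration of that one-time-processing point, here is a minimal Python sketch, with made-up shapes and placeholder encode/decode functions standing in for the real model, showing the encoder's context-rich vectors being computed once and then reused at every decoding step:

    import numpy as np

    def encode(source_tokens):
        # Placeholder for the encoder stack: returns one context-rich
        # vector per source token (random values, for illustration only).
        return np.random.randn(len(source_tokens), 512)

    def decode_step(generated_ids, encoder_states):
        # Placeholder for one decoder pass: in a real model this attends
        # over the cached encoder states and predicts the next token id.
        return np.random.randint(0, 30000)

    source = ["the", "cat", "sat"]
    encoder_states = encode(source)          # computed exactly once

    generated = [1]                          # assume id 1 is the SOS token
    for _ in range(10):                      # generate up to 10 tokens
        next_id = decode_step(generated, encoder_states)   # reuses the cache
        generated.append(next_id)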
Elevator Analogy and Contextual Significance
An elevator analogy is used to explain the flow of English and Spanish context-rich vectors, clarifying why the encoder's processing only needs to happen once and how its output remains contextually relevant throughout a translation. The episode highlights the pivotal role of cross-attention in blending language contexts, enabling precise and contextually accurate language generation, and offers insight into how the query, key, and value (QKV) vectors efficiently capture context for accurate text generation.
Focus on Neural Network Mechanisms
In discussing transformer model efficiency, the episode emphasizes the blend of English and Spanish contexts via cross-attention, enhancing translation accuracy and contextual relevance. The analysis delves into the mechanics of query, key, and value (QKV) vector creation and processing within the decoder architecture, elucidating their role in capturing and applying contextual information, and underscores how these mechanisms allow the network to process and generate language accurately from contextual inputs.
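To make the QKV mechanics concrete, here is a minimal NumPy sketch of cross-attention in the standard scaled-dot-product formulation: queries come from the decoder's (Spanish-side) states, while keys and values come from the encoder's (English-side) states. All weights, dimensions, and inputs are invented for illustration; in a trained model the projection matrices are learned.

    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    d_model, d_k = 512, 64
    rng = np.random.default_rng(0)

    decoder_states = rng.normal(size=(4, d_model))   # 4 target (Spanish) positions
    encoder_states = rng.normal(size=(6, d_model))   # 6 source (English) positions

    W_q = rng.normal(size=(d_model, d_k))            # learned in practice
    W_k = rng.normal(size=(d_model, d_k))
    W_v = rng.normal(size=(d_model, d_k))

    Q = decoder_states @ W_q                         # queries from the decoder
    K = encoder_states @ W_k                         # keys from the encoder
    V = encoder_states @ W_v                         # values from the encoder

    scores = Q @ K.T / np.sqrt(d_k)                  # (4, 6) relevance scores
    weights = softmax(scores, axis=-1)               # attention over source tokens
    context = weights @ V                            # blended source context per target position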
Encoder-Only Architecture like BERT for Natural Language Understanding
Encoder-only architectures like BERT excel at natural language understanding by providing a numeric representation of the semantic meaning of words. They are used in tasks like classification and ranking, where the encoder converts text into context-rich vectors.
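As a hedged sketch of that idea, the snippet below uses the Hugging Face transformers library to turn a sentence into contextual vectors with a BERT model and mean-pools them into a single sentence vector that a downstream classifier could consume. The model name and pooling choice are just one common option, not necessarily what was discussed in the episode.

    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    text = "The delivery was late and the package arrived damaged."
    inputs = tokenizer(text, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)

    # One context-rich vector per token; mean-pool into a single
    # sentence vector for a downstream classifier.
    token_vectors = outputs.last_hidden_state        # shape (1, seq_len, 768)
    sentence_vector = token_vectors.mean(dim=1)      # shape (1, 768)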
Use of Encoder-Only Architecture for Job Candidate Matching
Companies like Nebula use encoder-only architectures like BERT to match job candidates to job descriptions. By encoding a job description into a vector and comparing it against precomputed vectors for job seekers, they can find the best fit, illustrating the power of encoding in matching tasks.
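Here is a minimal sketch of that matching step, assuming the job description and candidate profiles have already been encoded into fixed-length vectors (the encoder itself is not shown, the vectors are random placeholders, and cosine similarity is one reasonable choice of comparison):

    import numpy as np

    def cosine_similarity(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    rng = np.random.default_rng(0)

    # Pretend these came from an encoder such as BERT (random placeholders here).
    job_vector = rng.normal(size=768)
    candidate_vectors = {
        "candidate_a": rng.normal(size=768),
        "candidate_b": rng.normal(size=768),
        "candidate_c": rng.normal(size=768),
    }

    # Rank the precomputed candidate vectors against the job description vector.
    ranked = sorted(
        candidate_vectors.items(),
        key=lambda kv: cosine_similarity(job_vector, kv[1]),
        reverse=True,
    )
    print(ranked[0][0])  # best-matching candidate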
Significance of Masking During Training and Inference in Transformers
Masking in transformers ensures that during training, predictions are made without looking ahead at future tokens, fostering genuine learning. During inference, masking is still needed in every layer except the top one, where only the final position's output is used, so that the intermediate representations stay consistent with the training architecture, specifically in encoder-decoder interactions.
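The sketch below shows the usual way such a look-ahead (causal) mask is applied to self-attention scores, so that each position can only attend to itself and earlier positions. Shapes and scores are illustrative only.

    import numpy as np

    seq_len = 4
    scores = np.random.randn(seq_len, seq_len)       # raw attention scores

    # Upper-triangular positions (future tokens) are blocked with -inf,
    # so the softmax assigns them zero attention weight.
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    masked_scores = np.where(mask, -np.inf, scores)

    weights = np.exp(masked_scores - masked_scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row i attends only to positions <= i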
Importance of SOS Token in Decoder for Generating First Word
During training, the decoder's target sequence is shifted right and prefixed with a start-of-sequence (SOS) token. The SOS token serves as a placeholder input for generating the first word and ensures the model is trained properly for producing each subsequent word.
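A toy illustration of that shift-right setup (all token ids, including the SOS and EOS ids, are invented): the decoder input is the target sentence prefixed with SOS and truncated by one, while the labels are the unshifted target, so the SOS position stands in for "predict the first word".

    SOS = 1          # hypothetical start-of-sequence token id
    EOS = 2          # hypothetical end-of-sequence token id

    target = [57, 802, 34, EOS]          # ids for the target sentence (invented)

    decoder_input = [SOS] + target[:-1]  # shifted right: [SOS, 57, 802, 34]
    labels = target                      # predict:       [57, 802, 34, EOS]

    # At position 0 the decoder sees only SOS and must predict the first
    # real word (57); at position 1 it sees [SOS, 57] and predicts 802, etc.
    for inp, lab in zip(decoder_input, labels):
        print(f"input ends with {inp} -> predict {lab}")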
Layer Stacking in Encoder and Decoder of Transformer
In a full transformer architecture, layers in the encoder and the decoder are stacked, with data flowing through them one layer at a time. The encoder's final output is fed into every layer of the decoder, highlighting the flow of information and the reuse of context across the transformer model.
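A structural sketch of that wiring using PyTorch's built-in transformer modules (hyperparameters and random inputs are arbitrary): the encoder's final output, often called the memory, is handed to the decoder, and each stacked decoder layer uses that same memory as the keys and values for its cross-attention.

    import torch
    import torch.nn as nn

    d_model = 512
    encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8)
    decoder_layer = nn.TransformerDecoderLayer(d_model=d_model, nhead=8)

    encoder = nn.TransformerEncoder(encoder_layer, num_layers=6)
    decoder = nn.TransformerDecoder(decoder_layer, num_layers=6)

    src = torch.randn(10, 1, d_model)   # 10 source tokens, batch of 1
    tgt = torch.randn(7, 1, d_model)    # 7 target tokens, batch of 1

    memory = encoder(src)               # final encoder output, computed once
    # The same memory tensor is passed into every stacked decoder layer,
    # where it supplies the keys and values for cross-attention.
    out = decoder(tgt, memory)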
Insight into Transformers Architecture and Training Considerations
The discussion touched on the efficiency of the architecture's implementation, considerations like masking for authentic training, and the strategic use of tokens in the encoding and decoding processes. Additionally, the transformative impact of encoder-only, decoder-only, and full transformer architectures was explored.
Impact of Encoding and Decoding Processes in Transformer Models
The podcast provided detailed insights into the roles and significance of encoding and decoding processes in transformer models for language tasks. Key points covered the handling of context, training methodologies using masking, and the interdependence of encoder and decoder layers for accurate model functioning.
Encoder Bidirectional Abilities and Deeper Understanding of Transformer Dynamics
The bidirectional nature of encoder representations in models like BERT allows for comprehensive context evaluation. Understanding the intricacies of transformers, including masking, token usage, and layer interactions, sheds light on their operational dynamics and effectiveness in various natural language tasks.
Strategies for Natural Language Processing with Transformers
Insights into operational strategies for natural language processing with transformer models were shared, emphasizing the roles of encoders, decoders, and layer interactions. Concepts like masking, token usage, and architectural efficiency were explored to deepen understanding of transformer dynamics.
Encoders, cross attention and masking for LLMs: SuperDataScience Founder Kirill Eremenko returns to the SuperDataScience podcast, where he speaks with Jon Krohn about transformer architectures and why they are a new frontier for generative AI. If you’re interested in applying LLMs to your business portfolio, you’ll want to pay close attention to this episode!
In this episode you will learn:
• How decoder-only transformers work [15:51]
• How cross-attention works in transformers [41:05]
• How encoders and decoders work together (an example) [52:46]
• How encoder-only architectures excel at understanding natural language [1:20:34]
• The importance of masking during self-attention [1:27:08]