
759: Full Encoder-Decoder Transformers Fully Explained, with Kirill Eremenko
Super Data Science: ML & AI Podcast with Jon Krohn
00:00
Detailed Explanation of Encoder-Decoder Transformers
This chapter covers the encoder-decoder transformer architecture, focusing on how cross-attention lets the decoder draw on the encoder's output. It contrasts the capabilities of encoder-only and decoder-only architectures, and explains why self-attention in the decoder is masked: without the mask, the model could look ahead at future tokens and cheat during generation tasks.
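The two mechanisms named in this summary, masked self-attention and cross-attention, can be sketched in a few lines of plain Python. This is a minimal illustration and not code from the episode: the function names (`softmax`, `attention_weights`, `attend`) and the toy vectors are hypothetical, and the learned query, key, and value projections that a real transformer applies are left out, so raw state vectors stand in for them.

```python
import math


def softmax(scores):
    # Numerically stable softmax; a score of -inf ends up with weight 0.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys, causal=False):
    # Scaled dot-product attention weights, one row per query.
    # Assumption: learned projections are omitted to keep the sketch short.
    dim = len(keys[0])
    weights = []
    for i, query in enumerate(queries):
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in keys
        ]
        if causal:
            # Causal mask: position i may only attend to positions 0..i,
            # so scores for later positions are set to -inf.
            scores = [s if j <= i else -math.inf for j, s in enumerate(scores)]
        weights.append(softmax(scores))
    return weights


def attend(queries, keys, values, causal=False):
    # Weighted sum of the value vectors for each query.
    weights = attention_weights(queries, keys, causal)
    width = len(values[0])
    return [
        [sum(w * value[c] for w, value in zip(row, values)) for c in range(width)]
        for row in weights
    ]


# Toy data (hypothetical): three decoder positions, two encoder positions.
decoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
encoder_states = [[0.5, 0.5], [1.0, -1.0]]

# Masked self-attention: queries, keys, and values all come from the decoder.
masked = attention_weights(decoder_states, decoder_states, causal=True)

# Cross-attention: queries come from the decoder, keys and values from the
# encoder. No mask is applied, because the whole input is already known.
cross = attention_weights(decoder_states, encoder_states)
cross_output = attend(decoder_states, encoder_states, encoder_states)

for row in masked:
    print([round(w, 3) for w in row])
```

In the printed matrix, every entry above the diagonal is zero: the first decoder position attends only to itself, and no position receives weight from a later one. The cross-attention weights have one row per decoder position and one column per encoder position, which is how the decoder reads from the encoder's output.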