
747: Technical Intro to Transformers and LLMs, with Kirill Eremenko
Super Data Science: ML & AI Podcast with Jon Krohn
00:00
Exploring the Technical Aspects of Transformers and Large Language Models
This chapter dives deep into the technical aspects of transformers and large language models, discussing their ability to parallelize computation, handle variable-length inputs, and train on all available data. It elaborates on how transformers process data efficiently through matrix operations, attention heads, and stacked layers that increase learning capacity. The conversation also touches on the limitations of models like GPT, which excel at tasks like generation and translation yet lack true reasoning abilities.
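The matrix operations and attention mechanism mentioned in the chapter can be illustrated with a minimal sketch. The following is not code from the episode, just a hedged toy example of scaled dot-product self-attention in NumPy, showing how every position in a sequence is processed in parallel through a few matrix multiplications and why the same code works for any input length:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: all positions are handled at once
    # via matrix multiplications, which is what makes transformers
    # highly parallelizable on modern hardware.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (seq, seq) similarity scores
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted mix of value vectors

rng = np.random.default_rng(0)
seq_len, d = 4, 8                       # any sequence length works
X = rng.normal(size=(seq_len, d))
out = attention(X, X, X)                # self-attention on one sequence
print(out.shape)                        # (4, 8)
```

A multi-head version, as discussed in the episode, would simply run several such attention computations in parallel on learned projections of the input and concatenate the results.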