Machine Learning Street Talk (MLST)

Transformers Need Glasses! - Federico Barbero

Mar 8, 2025
Federico Barbero, a lead author affiliated with DeepMind and Oxford, discusses the quirks of transformers and why large language models falter at tasks like counting. He describes architectural bottlenecks that limit their performance and, drawing parallels with graph neural networks, explains how the softmax function limits how sharply a model can focus on individual tokens. He also shares 'glasses' for improving transformer performance, including input tweaks and structural modifications.
01:00:54

Podcast summary created with Snipd AI

Quick takeaways

  • Transformers struggle to maintain information fidelity over long sequences: representational collapse leads to inaccuracies, especially at the last tokens.
  • Token order strongly affects transformer performance: models are biased toward earlier tokens, which leaves knowledge gaps later in the sequence.
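
The softmax limitation mentioned in the episode description can be illustrated with a small numerical sketch. This is not the analysis presented in the episode; it is a toy example under a stated assumption: one token's attention logit exceeds every other logit by a fixed margin (`gap`, a hypothetical parameter chosen for illustration). Under that assumption, the largest attention weight shrinks as the sequence gets longer, so attention cannot stay sharply focused on a single token. The function names are invented for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def peak_attention(n, gap=5.0):
    """Largest softmax weight when one logit beats the other n - 1 by `gap`.

    Assumption: logits are bounded, so the margin `gap` does not grow with n.
    """
    logits = [gap] + [0.0] * (n - 1)
    return max(softmax(logits))


# The peak weight decays toward zero as the sequence length grows.
for n in (10, 100, 1000, 10000):
    print(n, round(peak_attention(n), 4))
```

In closed form the peak weight here is e^gap / (e^gap + n - 1), which tends to zero as n grows for any fixed gap.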

Deep dives

Understanding Heuristics in AI Models

The episode discusses the limitations of current AI models, particularly in how they approach reasoning tasks. It argues that these models do not truly implement algorithms but instead rely on heuristics that perform adequately on the training data. Their performance deteriorates significantly on unseen data, which shows that they generalize poorly. The result is a fundamental problem: models may produce results that appear intelligent while failing to capture the underlying complexity of the reasoning involved.
