LessWrong (Curated & Popular)

[Linkpost] “Gemini Diffusion: watch this space” by Yair Halberstadt

May 22, 2025
Google DeepMind's Gemini Diffusion generates text by iterative denoising rather than one-token-at-a-time prediction. The author highlights its speed, roughly 1000 tokens per second, and recounts it solving his go-to Google interview question almost instantly. The approach could extend beyond what current autoregressive language models can do; it isn't flawless, but its performance suggests a lot of headroom for this line of work.
INSIGHT

Diffusion vs Token Prediction

  • Diffusion models iteratively denoise the entire output in parallel until a coherent result emerges, unlike autoregressive LLMs that predict one token at a time (a toy contrast is sketched below).
  • The approach is borrowed from image diffusion models and opens up generation patterns that token-by-token decoding can't easily match.
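To make the contrast concrete, here is a minimal toy sketch (my own illustration, not Gemini Diffusion's actual algorithm or API) of the two decoding loops: an autoregressive loop that extends a prefix one token at a time, versus a diffusion-style loop that starts from a fully noised sequence and refines every position in parallel. `predict_next` and `denoise_step` are dummy stand-ins for a real model.

```python
# Toy sketch: autoregressive decoding vs. diffusion-style iterative denoising.
# Not Gemini Diffusion's actual algorithm; model functions are placeholders.
import random

VOCAB = ["the", "cat", "sat", "on", "mat"]
NOISE = "<noise>"

def autoregressive_decode(predict_next, prompt, n_tokens):
    """Standard LLM decoding: one token at a time, left to right."""
    out = list(prompt)
    for _ in range(n_tokens):
        out.append(predict_next(out))      # each step only sees the prefix so far
    return out

def diffusion_decode(denoise_step, n_tokens, n_steps):
    """Diffusion-style decoding: start from pure noise over the whole
    sequence and refine every position in parallel at each step."""
    tokens = [NOISE] * n_tokens            # fully noised sequence
    for _ in range(n_steps):
        tokens = denoise_step(tokens)      # refines all positions at once
    return tokens

# Dummy stand-ins so the sketch runs end to end.
def predict_next(prefix):
    return random.choice(VOCAB)

def denoise_step(tokens):
    # Replace noise, and occasionally revise an already-filled position.
    return [random.choice(VOCAB) if t == NOISE or random.random() < 0.3 else t
            for t in tokens]

if __name__ == "__main__":
    print(autoregressive_decode(predict_next, ["the"], 4))
    print(diffusion_decode(denoise_step, n_tokens=5, n_steps=8))
```

The speed claim in the episode follows from this structure: each denoising pass touches all positions at once, so the cost per generated token can be far lower than running one forward pass per token.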
ANECDOTE

Gemini Diffusion Interview Demo

  • Google DeepMind's Gemini Diffusion answered the author's Google interview question perfectly in about 2 seconds.
  • It struggled somewhat on follow-up questions, but still significantly outperforms ChatGPT 3.
INSIGHT

Diffusion Enables Native Editing

  • Diffusion models allow native editing in the middle of an output, unlike left-to-right autoregressive LLMs (sketched below).
  • Because the whole output is refined jointly, earlier parts can be revised in light of later parts, reducing the internal contradictions common in LLM outputs.
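A hedged sketch of what such editing could look like (hypothetical illustration, not Gemini Diffusion's actual interface): re-noise only the span you want to change and denoise it again while the surrounding tokens stay fixed. A left-to-right decoder would have to regenerate everything after the edit point instead.

```python
# Toy sketch of mid-sequence editing under a diffusion-style scheme.
# Hypothetical illustration; the denoiser below is a dummy stand-in.
import random

VOCAB = ["the", "cat", "sat", "on", "a", "soft", "mat"]
NOISE = "<noise>"

def edit_span(tokens, start, end, denoise_step, n_steps=6):
    """Re-noise tokens[start:end] and iteratively denoise just that span,
    conditioning on the untouched tokens on either side."""
    draft = list(tokens)
    draft[start:end] = [NOISE] * (end - start)   # mask only the span to edit
    frozen = set(range(0, start)) | set(range(end, len(tokens)))
    for _ in range(n_steps):
        draft = denoise_step(draft, frozen)      # frozen positions never change
    return draft

# Dummy denoiser stand-in so the sketch runs.
def denoise_step(tokens, frozen):
    return [t if i in frozen else random.choice(VOCAB)
            for i, t in enumerate(tokens)]

if __name__ == "__main__":
    sentence = ["the", "cat", "sat", "on", "the", "mat"]
    # Rewrite positions 3-4 while everything around them stays fixed.
    print(edit_span(sentence, start=3, end=5, denoise_step=denoise_step))
```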