

[Linkpost] “Gemini Diffusion: watch this space” by Yair Halberstadt
May 22, 2025
Google DeepMind's Gemini Diffusion generates text by iterative denoising rather than by predicting one token at a time. The speaker highlights its speed, nearly 1000 tokens per second, and recalls a personal test in which it answered a Google interview question perfectly in 2 seconds. The model is not flawless, and it struggled with follow-up questions, but the speaker argues that the approach has potential well beyond current language models and is worth watching.
AI Snips
Diffusion vs Token Prediction
- Diffusion models iteratively denoise all output tokens in parallel until a coherent result emerges, unlike autoregressive LLMs, which predict one token at a time.
- The approach is inspired by image diffusion models; because the whole output is refined at once, generation can be much faster.
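The contrast above can be made concrete with a toy sketch. This is not how Gemini Diffusion is implemented: the "denoiser" here is a hypothetical stand-in that simply reveals tokens from a known target, and the names `toy_denoise_step`, `diffusion_style_generate`, and `autoregressive_generate` are invented for illustration. A real diffusion model would predict tokens with a neural network and could also revise earlier guesses. The sketch only shows the difference in control flow: a fixed number of parallel refinement passes over the whole output versus one pass per token.

```python
import random

MASK = "_"  # placeholder for a position that is still "noise"


def toy_denoise_step(tokens, target, k, rng):
    """Hypothetical stand-in for a denoising model.

    Fills in up to k masked positions in a single pass. A real model would
    predict these tokens; here they are copied from a known target.
    """
    masked = [i for i, t in enumerate(tokens) if t == MASK]
    for i in rng.sample(masked, min(k, len(masked))):
        tokens[i] = target[i]
    return tokens


def diffusion_style_generate(target, steps, seed=0):
    """Start fully masked and refine every position over a fixed number of passes."""
    rng = random.Random(seed)
    tokens = [MASK] * len(target)
    k = -(-len(target) // steps)  # ceiling division: positions revealed per pass
    history = [list(tokens)]
    for _ in range(steps):
        toy_denoise_step(tokens, target, k, rng)
        history.append(list(tokens))
    return history


def autoregressive_generate(target):
    """Emit one token per step, left to right, never revisiting earlier tokens."""
    tokens, history = [], []
    for tok in target:
        tokens.append(tok)
        history.append(list(tokens))
    return history


if __name__ == "__main__":
    target = "the quick brown fox jumps over the lazy dog".split()
    for state in diffusion_style_generate(target, steps=3):
        print(" ".join(state))
    print(len(autoregressive_generate(target)), "sequential steps for the same output")
```

In this toy, nine tokens take three parallel passes in the diffusion-style loop and nine sequential steps in the autoregressive loop. The number of passes is a free parameter of the sketch and says nothing about the step count Gemini Diffusion actually uses.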
Gemini Diffusion Interview Demo
- Google DeepMind's Gemini Diffusion answered my Google interview question perfectly in 2 seconds.
- It struggled somewhat with follow-up questions but still significantly outperforms ChatGPT 3.
New AI Species with Diffusion
- Gemini Diffusion represents a new 'species' of AI, different from both humans and autoregressive large language models.
- Diffusion models produce entire output blocks at once, avoiding the contradictions that can arise when autoregressive LLMs commit to output one token at a time.