Don't Worry About the Vase Podcast

Gemini 2.5 Pro: From 0506 to 0605

Jun 18, 2025
Explore the updates to Google's Gemini 2.5 Pro, which show enhanced coding and reasoning skills. Compare the performance of various AI language models using tools like EmojiBench. Delve into the advancements and challenges of Gemini's latest features, particularly in safety evaluations and content processing. Uncover the model's personality quirks, including its sycophancy, and hear personal experiences with AI interactions. Plus, discover the hidden messages within the contributors' names!
INSIGHT

Problems with Frequent Model Updates

  • Google's frequent model version updates cause instability and developer frustration.
  • Automatically redirecting queries to new versions without a transparent explanation introduces risk.
INSIGHT

Benchmark Success vs User Experience

  • Gemini 2.5 shows strong benchmark performance but tends toward sycophancy.
  • Optimizing for benchmarks can reduce real-world user experience quality.
INSIGHT

Shifting Strengths in Gemini Updates

  • Successive updates to Gemini 2.5 Pro shift gains back and forth between coding and other capabilities.
  • Newer benchmarks introduce harder tests, complicating direct comparison.