The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Google I/O 2025 Special Edition - #733

May 28, 2025
Logan Kilpatrick and Shrestha Basu Mallick from Google DeepMind discuss the announcements from Google I/O 2025. They cover Gemini API features such as thinking budgets and thought summaries, as well as native audio output, which makes voice AI more expressive. They also describe the challenges of building real-time voice applications, including latency and voice detection. They close with a playful wish list for next year's event, including broader language support to foster global inclusivity.
INSIGHT

One Unified Gemini Model

  • Google focuses on creating one comprehensive Gemini model with multiple capabilities integrated.
  • This approach avoids splintering into many separate models, enabling powerful combined functionalities.
INSIGHT

Control with Thinking Budgets

  • Thinking budgets and thought summaries give developers control over reasoning in Gemini 2.5 Pro.
  • These features aim to balance reasoning needs and transparency for developers.
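
As a rough illustration of what this control might look like in practice, the sketch below builds a request body that sets a thinking budget and asks for thought summaries. It makes no network call. The field names (`thinkingConfig`, `thinkingBudget`, `includeThoughts`) are assumptions based on the public Gemini REST API rather than anything stated in the episode, and the helper `build_request` and the budget value are hypothetical.

```python
import json


def build_request(prompt: str, thinking_budget: int, include_thoughts: bool = True) -> dict:
    """Build a JSON-serializable request body; nothing is sent over the network.

    Field names are assumed to match the Gemini REST API and should be
    checked against the current API reference before use.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {
                # Assumed: cap on the number of tokens the model may spend reasoning.
                "thinkingBudget": thinking_budget,
                # Assumed: request summaries of the model's reasoning in the response.
                "includeThoughts": include_thoughts,
            }
        },
    }


body = build_request("Summarize the trade-offs of caching.", thinking_budget=1024)
print(json.dumps(body, indent=2))
```

A smaller budget would trade reasoning depth for lower latency and cost, which is the balance the snip describes.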
INSIGHT

Native Audio and URL Context

  • Native audio output enables expressive, multilingual voice AI that switches between languages realistically.
  • The URL Context tool retrieves in-depth information from webpages for research agents.