The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Google I/O 2025 Special Edition - #733

May 28, 2025

Guests

Logan Kilpatrick

Shrestha Basu Mallick

Logan Kilpatrick and Shrestha Basu Mallick from Google DeepMind walk through the announcements from Google I/O 2025. They discuss Gemini API features such as thinking budgets and thought summaries, and how native audio output makes voice AI more expressive. They also share the challenges of building real-time voice applications, including latency and voice detection. They close with a playful wish list for next year's event, hoping for stronger language capabilities to foster global inclusivity.

INSIGHT

One Unified Gemini Model

Google is focused on building one comprehensive Gemini model that integrates multiple capabilities.
This approach avoids splintering into many separate models and lets those capabilities be combined.

INSIGHT

Control with Thinking Budgets

Thinking budgets give developers control over how much reasoning Gemini 2.5 Pro does, and thought summaries give them visibility into that reasoning.
Together, these features aim to balance reasoning needs with transparency for developers.

INSIGHT

Native Audio and URL Context

Native audio output enables expressive, multilingual voice AI with realistic switching between languages.
The URL Context tool adds in-depth retrieval of webpage content for research agents.
