This Day in AI Podcast

gpt-realtime, nano banana & workspace computer v2 | EP99.15-realtime

Aug 29, 2025
The podcast dives into the exciting world of AI innovations, featuring real-time capabilities of GPT technology alongside the launch of Gemini 2.5. The discussion humorously critiques the pricing of advanced models, while emphasizing their transformative potential in various industries. Listeners get insights into the evolving cloud-based workspaces and how tools like SimLink can enhance productivity. Plus, there's exploration of the creative possibilities with PixVerse V5, showcasing impressive video transitions that merge creativity with cutting-edge technology.
INSIGHT

Real-Time API Enables Delegating Voice Agents

  • gpt-realtime adds image inputs, SIP voice calling, and remote MCP server support, which together enable richer voice agents.
  • Asynchronous tool calls let a lightweight voice model coordinate powerful background assistants for complex tasks.
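The asynchronous tool-call idea above can be sketched in plain Python with `asyncio`. This is a minimal illustration, not the Realtime API itself: `background_assistant` and `voice_session` are hypothetical names, and the sleep stands in for a slow specialist model. The point is only that the voice loop starts the tool call without waiting on it.

```python
import asyncio


async def background_assistant(task: str) -> str:
    """Hypothetical slow specialist; the sleep stands in for real work."""
    await asyncio.sleep(0.05)
    return f"done: {task}"


async def voice_session(request: str) -> list[str]:
    """Return the events in the order the voice front-end would emit them."""
    events = []
    # Start the tool call without awaiting it, so the session is not blocked.
    pending = asyncio.create_task(background_assistant(request))
    events.append("voice: working on it, anything else?")
    # Yield control once; the voice loop could handle further turns here.
    await asyncio.sleep(0)
    events.append("voice: (still chatting)")
    # Collect the result only when the conversation needs it.
    result = await pending
    events.append(f"voice: {result}")
    return events


if __name__ == "__main__":
    for line in asyncio.run(voice_session("book a flight")):
        print(line)
```

The voice front-end speaks twice before the background result arrives, which is the behavior the snip describes.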
INSIGHT

Orchestrate Assistants To Reduce Cost And Latency

  • Use a real-time voice front-end that delegates heavy work asynchronously to specialist assistants.
  • Return only concise summaries to the voice model to keep latency, cost and hallucinations low.
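One way to picture the "concise summaries" advice is the sketch below. It assumes two hypothetical specialists (`research` and `calendar`) and an arbitrary 80-character budget, and it uses naive truncation as a stand-in for a real summarization step. None of these names or numbers come from the episode.

```python
import asyncio

MAX_SUMMARY_CHARS = 80  # assumed budget for what the voice model receives


async def specialist(name: str, query: str) -> str:
    """Hypothetical specialist that returns a long, verbose report."""
    await asyncio.sleep(0.01)
    return f"{name} report on {query}: " + "detail " * 50


def summarize(report: str, limit: int = MAX_SUMMARY_CHARS) -> str:
    """Naive truncation used here in place of a real summarizer."""
    text = " ".join(report.split())
    if len(text) <= limit:
        return text
    return text[: limit - 3].rstrip() + "..."


async def delegate(query: str) -> dict[str, str]:
    """Run the specialists concurrently and return only short summaries."""
    names = ["research", "calendar"]
    reports = await asyncio.gather(*(specialist(n, query) for n in names))
    return {n: summarize(r) for n, r in zip(names, reports)}


if __name__ == "__main__":
    for name, summary in asyncio.run(delegate("Q3 sales")).items():
        print(name, "->", summary)
```

Capping what flows back to the voice model keeps its context small, which is where the latency and cost savings in the snip would come from.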
ANECDOTE

Live Demo: Multilingual, Accent-Switching Voice

  • In a short clip, Michael and Chris demoed the Marin voice switching seamlessly between languages and accents.
  • The model handled Spanish, Chinese and an Australian accent during rapid back-and-forth testing.