Google’s Gemini 3: AI agents, reasoning and search mode

Nov 21, 2025

Guest

Marina Danilevsky

This discussion features Gabe Goodhart, an AI architect focused on cybersecurity, Merve Unuvar, a specialist in agent middleware, and Marina Danilevsky, a research scientist who analyzes AI model behavior. They dive into Google’s Gemini 3 model, exploring its strong benchmark performance alongside its persistent hallucination problems. The dialogue then shifts to AI's impact on the economy as measured by OpenAI’s GDPval benchmark, and the panel debates the balance between specialized and generalist models. They also tackle the implications of a recent AI-automated cyberattack, stressing the need for robust enterprise defenses.

INSIGHT

Benchmarks Don’t Erase Hallucinations

Gemini 3 shows major benchmark gains but still hallucinates and prefers giving answers over admitting uncertainty.
Marina Danilevsky observed that the model remains prone to mistakes despite its strong benchmark performance.

INSIGHT

Agentic IDE As Differentiator

Google aims to differentiate Gemini 3 through new agent features, notably Antigravity, its agentic IDE.
Gabe Goodhart noted this could enable managing fleets of worker agents that handle delegated tasks in parallel.

ANECDOTE

Quick UI Build, Awkward Personalization Error

Merve Unuvar built a workout dashboard with Gemini, which generated the UI in minutes but made an error in its personalized advice about post-workout growth.
She used the model to create a Streamlit UI quickly, but the model misapplied age-based advice.