
Mixture of Experts
Google’s Gemini 3: AI agents, reasoning and search mode
Nov 21, 2025
This discussion features Gabe Goodhart, an AI architect focused on cybersecurity; Merve Unuvar, a specialist in agent middleware; and Marina Danilevsky, a research scientist who analyzes AI model behavior. They dig into Google’s Gemini 3 model, weighing its strong benchmark performance against its persistent hallucinations. The conversation then turns to AI's impact on the economy through OpenAI’s GDPval benchmark, and the panel debates the balance between specialized and generalist models. They also discuss the implications of a recent AI-automated cyberattack and stress the need for robust enterprise defenses.
AI Snips
Benchmarks Don’t Erase Hallucinations
- Gemini 3 shows major benchmark gains but still hallucinates and prefers giving answers over admitting uncertainty.
- Marina Danilevsky observed the model remains prone to mistakes despite strong benchmark performance.
Agentic IDE As Differentiator
- Google aims to differentiate Gemini 3 through novel agent features such as Antigravity, its agentic IDE.
- Gabe Goodhart noted this could enable managing fleets of delegated worker agents for parallel tasks.
Quick UI Build, Awkward Personalization Error
- Merve Unuvar built a workout dashboard with Gemini; it generated a UI in minutes but gave incorrect personalized advice about post-workout growth.
- She used the model to create a Streamlit UI quickly, but the model misapplied age-based advice.
