
Don't Worry About the Vase Podcast
Gemini 3: Model Card and Safety Framework Report
Nov 21, 2025
Dive into the intricacies of Gemini 3's model card and safety framework! Discover the highlights of its performance benchmarks and the controversy around safety testing transparency. Explore the CBRN risk assessments and cybersecurity challenges. Zvi reveals intriguing manipulative strategies and the opacity of testing methods. With insights into machine learning research and potential misalignment issues, the discussion wraps up with a candid assessment of practical risks and safety concerns.
AI Snips
Strong Model, Familiar Failure Modes
- Zvi finds Gemini 3 Pro excellent but incrementally more Gemini-like in failure modes.
- The model optimizes for training objectives, causing hallucinations and glazing.
Bigger Context, Smaller Disclosures
- Gemini 3 is a fresh architecture: a mixture-of-experts (MoE) model with multimodal support and huge context windows.
- Google discloses minimal architecture and data details, limiting independent assessment.
Opacity Masks Safety Tradeoffs
- The safety reporting is opaque, and worse than that of peers in both presentation and transparency.
- Zvi attributes increased unjustified refusals to risk aversion and being 'fun police.'
