Don't Worry About the Vase Podcast

Claude 4 You: The Quest for Mundane Utility

May 26, 2025
Explore the strengths and weaknesses of Claude Opus 4 and Sonnet 4 as they navigate the competitive AI landscape. Discover their mixed performance in coding and visual comprehension, and see how the models showcase innovative resource allocation techniques. Get insights into AI subscription models and the disconnect between industry trends and functionality. Plus, enjoy a humorous take on the clichés found in AI product announcements, which leads to a surprising bingo fail!
AI Snips
INSIGHT

Claude 4 Model Quality

  • Claude Opus 4 is currently the best model if price and speed are no concerns. Sonnet 4 is strong but less of a leap over prior versions.
INSIGHT

Sonnet 4 Long Context Strength

  • Sonnet 4 excels at very long contexts, maintaining high performance even at 60 to 98K tokens. Image tasks remain a weakness for Claude models.
INSIGHT

Visual Physics Weakness & Regression

  • Claude models underperform on visual physics comprehension tests compared with competitors such as OpenAI's models. Gemini 2.5 Pro's regression on non-coding tasks is also notable.