

Claude 4 You: The Quest for Mundane Utility
May 26, 2025
Explore the strengths and weaknesses of Claude Opus 4 and Sonnet 4 as they navigate the competitive AI landscape. Discover their mixed performance in coding and visual comprehension, along with innovative resource allocation techniques. Get insights into AI subscription models and the disconnect between industry trends and functionality. Plus, enjoy a humorous take on the clichés found in AI product announcements that lead to a surprising bingo fail!
AI Snips
Claude 4 Model Quality
- Claude Opus 4 is currently the best model if price and speed are no concerns. Sonnet 4 is strong but less of a leap over prior versions.
Sonnet 4 Long Context Strength
- Sonnet 4 excels at very long contexts, maintaining high performance even at 60 to 98K tokens. Image tasks remain a weakness for Claude models.
Visual Physics Weakness & Regression
- Claude models underperform on visual physics comprehension tests compared with competitors such as OpenAI's models. Gemini 2.5 Pro's regression on non-coding tasks is also notable.