AI Snips
Opus 4.5 Raises Performance Bars
- Opus 4.5 sets new performance highs, including the first SWE-bench score above 80%.
- Anthropic's model also excels at tool use, such as spreadsheet and terminal tasks, showing broad gains.
Validate Models With Real Products
- Try the demonstrated products to judge model quality rather than trusting benchmarks alone.
- Expect paid tiers to get early access and broader rollout later as capacity grows.
Selective Memory Beats Pure Context Size
- Opus 4.5 improves long-context quality, but the emphasis is on selective memory rather than window size alone.
- Remembering the right details and when to access them is as important as raw context length.


