Everyday AI Podcast – An AI and ChatGPT Podcast

Ep 662: Opus 4.5: New king of the AI hill or just a niche model for coders?

Nov 26, 2025

The latest AI showdown has arrived with the debut of Claude Opus 4.5, which Anthropic claims is the best model for coding and agentic tasks. Is this the new go-to for developers or just a niche player? A deep dive into its benchmarks shows mixed results against Gemini 3 Pro. Features like document consistency and a Chrome extension make waves, while live demos reveal both strengths and limitations. Join the discussion about whether Opus 4.5 will reign supreme or serve a specialized audience!

INSIGHT

Opus 4.5 Is A Vertical Power Play

Anthropic's Opus 4.5 targets coding, agents, and computer use as its core strengths.
Jordan highlights it as a focused vertical play rather than a general-purpose model.

INSIGHT

Benchmarks Are Nuanced Not Absolute

Benchmarks show Opus 4.5 leads in agentic and software-engineering tasks but not uniformly across all third-party aggregates.
Jordan notes Gemini 3 Pro and GPT-5 variants still outperform Opus on several aggregated coding indexes.

INSIGHT

Anthropic's Strategy Is Vertical Specialization

Anthropic appears to focus its model development on vertical specialties like engineering and finance rather than broad creative tasks.
Jordan sees this as a deliberate strategy away from general-purpose creativity toward domain strength.

"... best model in the world..." 🤔

Wait, again?

Days after Gemini 3 Pro splashed onto the scene, Anthropic snuck in a low-key drop of its own: Claude Opus 4.5.

And Anthropic pulled no punches, calling its new model the "best model in the world for coding, agents and computer use."

So, should you be hot-swapping your Gemini or ChatGPT use for the new Opus 4.5? Or is this model more of a niche tool for software devs?

Tune in, as we put AI to Work on Wednesday!

P.S.... we're out for Thanksgiving. So after this show, we'll see ya Monday!

Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion: Thoughts on this? Join the convo and connect with other AI leaders on LinkedIn.

Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Topics Covered in This Episode:

Claude Opus 4.5 Release & Overview
Anthropic's Coding & Agentic Task Benchmarks
Opus 4.5 vs Gemini 3 Pro Comparison
API Price Cut & Cost Efficiency
Agentic Tool Search and Context Compaction
Multimodal Vision Features & Zoom Tool
Claude for Excel & Enterprise Data Workflows
Chrome Extension and Desktop App Updates

Timestamps:
00:00 "AI Throne: Gemini vs. Claude"

05:38 "Trends Dashboard with Claude Tools"

09:25 "AI Model Benchmark Showdown"

11:17 "Benchmark Comparison: Coding Models"

13:49 "AI Models for Software Engineering"

18:24 "Claude API Pricing Slashed"

21:09 "Multi-Agent Models & Vision Tools"

25:48 "Claude Chrome Extension Access Update"

28:36 "Claude for Excel Launches"

30:50 "Chat Prompt Context Limitations"

35:45 "Improving AI Chain of Thought"

38:50 "Analyzing Podcast Trends with AI"

40:41 "AI Tools for Building Apps"

Keywords:
Claude Opus 4.5, Anthropic, benchmark leader, AI hill, coding model, software engineering, agentic research, data analysis, Opus 4.5 API price cut, SWE-bench Verified, coding capabilities, tool orchestration, context compaction, infinite chat, effort parameter, agentic tasks, API pricing reduction, front end AI chatbot, Chrome extension, Claude for Chrome, browser prompt injection, Claude desktop app, Claude Code, Excel integration, Claude for Excel, programmatic tool calling, multimodal benchmark, vision capabilities, Zoom tool, document creation, file creation feature, PowerPoint generation, SaaS dashboard, chain of thought reasoning, token limits, context window issues

Send Everyday AI and Jordan a text message. (We can't reply unless you leave contact info.)

Ready for ROI on GenAI? Go to youreverydayai.com/partner