Better Offline

Exclusive: How GPT-5 Actually Works

Aug 22, 2025
Discover the intricate architecture and functionality of GPT-5, showcasing its advancements over previous versions. Dive into the critical conversation about the hype versus reality, as mixed public reactions unfold. Explore the complexities of its new routing system and the implications for user experience. Unpack token dynamics and address the challenges faced during the launch. Finally, critique the speculative nature of AI investments while considering the model's potential impact on everyday life.
ANECDOTE

Theo Browne's Demo Then Retraction

  • Theo Browne tried GPT-5 in a demo at OpenAI's offices and initially praised its performance, calling it an "oh, fuck" moment.
  • He later said the public model didn't match the in-office experience and publicly corrected his take.
INSIGHT

In-Office Demos May Use Heavier Compute

  • Ed Zitron suspects OpenAI demoed heavier compute versions at their offices, producing better outputs than the public release.
  • He argues that such differences signal inconsistent model performance and possible compute throttling in public deployment.
INSIGHT

Router-First Architecture Raises Costs

  • GPT-5 in ChatGPT sends each user prompt through a router before applying model-specific instructions, reversing the traditional order in which the static prompt comes first.
  • This prevents caching of the core static prompt and increases per-query compute and token burn.
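
The caching claim can be illustrated with a toy prefix cache. This is only a sketch under stated assumptions, not a description of OpenAI's actual system: the prompts, the route names, the whitespace "tokenizer", and every function name below are hypothetical. It assumes a cache that can only reuse the leading tokens a request shares with an earlier request, so anything that varies per request and sits in front of the static prompt stops that prompt from being reused.

```python
# Toy model of prefix caching. All prompts, route names, and helpers here are
# hypothetical; this does not reproduce any real provider's implementation.

STATIC = "You are a helpful assistant . Follow the safety policy .".split()
ROUTE_INSTRUCTIONS = {
    "fast": "Answer briefly .".split(),
    "thinking": "Reason step by step before answering .".split(),
}


def shared_prefix_len(a, b):
    """Number of leading tokens two prompts have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def cached_tokens(prompts):
    """Total tokens that could be reused from earlier prompts' prefixes."""
    seen, hits = [], 0
    for p in prompts:
        hits += max((shared_prefix_len(p, q) for q in seen), default=0)
        seen.append(p)
    return hits


def static_first(route, user):
    # Static prompt leads, so it is a shared prefix for every route.
    return STATIC + ROUTE_INSTRUCTIONS[route] + user


def router_first(route, user):
    # Route-specific text leads, so the static prompt is no longer a shared prefix.
    return ROUTE_INSTRUCTIONS[route] + STATIC + user


requests = [("fast", ["hi"]), ("thinking", ["prove", "it"]), ("fast", ["thanks"])]
static_hits = cached_tokens([static_first(r, u) for r, u in requests])
router_hits = cached_tokens([router_first(r, u) for r, u in requests])
print(static_hits, router_hits)
```

In this toy setup the static-first ordering reuses more tokens than the router-first ordering, because a request routed to a different model shares no prefix at all with the one before it. Real token counts and cache behavior would depend on details the episode summary does not give.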