

Exclusive: How GPT-5 Actually Works
Aug 22, 2025
Discover the intricate architecture and functionality of GPT-5, showcasing its advancements over previous versions. Dive into the critical conversation about the hype versus reality, as mixed public reactions unfold. Explore the complexities of its new routing system and the implications for user experience. Unpack token dynamics and address the challenges faced during the launch. Finally, critique the speculative nature of AI investments while considering the model's potential impact on everyday life.
AI Snips
Theo Browne's Demo Then Retraction
- Theo Browne demoed GPT-5 at OpenAI's offices and initially praised its performance, calling it an "oh, fuck" moment.
- He later said the public model didn't match the in-office experience and publicly corrected his take.
In-Office Demos May Use Heavier Compute
- Ed Zitron suspects OpenAI demoed heavier compute versions at their offices, producing better outputs than the public release.
- He argues that such differences signal inconsistent model performance and possible compute throttling in public deployment.
Router-First Architecture Raises Costs
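- ChatGPT-5 sends each user prompt through a router first and only then applies the instructions for the chosen model, reversing the traditional order in which a static prompt comes first.
- Because the text in front of the static prompt now changes with every request, the core static prompt can't be cached, which increases the compute and tokens spent per query.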
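The caching argument above can be illustrated with a toy prefix cache. This is a minimal sketch under stated assumptions, not a description of OpenAI's actual system: `PrefixCache`, `tokens_computed`, the 1000-token stand-in prompt, and the whitespace "tokenizer" are all hypothetical names and simplifications invented for illustration. The only idea it models is that a prefix cache can reuse work when the unchanging text comes first, and cannot when per-request text precedes it.

```python
from dataclasses import dataclass, field


@dataclass
class PrefixCache:
    """Toy prompt cache (hypothetical): remembers prompt prefixes it has processed."""
    seen: set = field(default_factory=set)
    processed_tokens: int = 0  # tokens that had to be computed (cache misses)

    def run(self, segments):
        """Process a prompt given as an ordered list of text segments.

        A segment is a cache hit only if the whole prefix up to and
        including that segment has been processed before.
        """
        prefix = ()
        for seg in segments:
            prefix = prefix + (seg,)
            if prefix not in self.seen:
                self.seen.add(prefix)
                # Crude assumption: one whitespace-separated word = one token.
                self.processed_tokens += len(seg.split())


# Stand-in for a long static prompt (assumed to be 1000 tokens).
STATIC_PROMPT = " ".join(["rule"] * 1000)
QUERIES = ["what is 2+2", "summarize this email", "write a haiku"]


def tokens_computed(build_prompt, queries):
    """Total tokens computed across all queries for a given prompt ordering."""
    cache = PrefixCache()
    for q in queries:
        cache.run(build_prompt(q))
    return cache.processed_tokens


# Static prompt first: the shared prefix is computed once, then reused.
static_first = tokens_computed(lambda q: [STATIC_PROMPT, q], QUERIES)

# Per-request text first: every prefix is new, so nothing is reused.
router_first = tokens_computed(lambda q: [q, STATIC_PROMPT], QUERIES)

print(static_first, router_first)
```

In this toy setup the static-first ordering computes 1009 tokens across the three queries, while the reversed ordering computes 3009, because the 1000-token prompt is reprocessed every time. Real serving stacks are far more complex, so the numbers only show the direction of the effect, not its size.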