Exclusive: How GPT-5 Actually Works

Aug 22, 2025

Discover the intricate architecture and functionality of GPT-5, showcasing its advancements over previous versions. Dive into the critical conversation about the hype versus reality, as mixed public reactions unfold. Explore the complexities of its new routing system and the implications for user experience. Unpack token dynamics and address the challenges faced during the launch. Finally, critique the speculative nature of AI investments while considering the model's potential impact on everyday life.

AI Snips

ANECDOTE

Theo Browne's Demo Then Retraction

Theo Browne demoed GPT-5 at OpenAI and initially praised its performance, calling it an "oh, fuck" moment.
He later said the public model did not match what he saw in the office and publicly corrected his take.

INSIGHT

In-Office Demos May Use Heavier Compute

Ed Zitron suspects OpenAI demoed heavier compute versions at their offices, producing better outputs than the public release.
He argues that such differences signal inconsistent model performance and possible compute throttling in public deployment.

INSIGHT

Router-First Architecture Raises Costs

In ChatGPT, GPT-5 sends each user prompt through a router before model-specific instructions are applied, reversing the traditional order in which a static prompt comes first.
According to the episode, this prevents caching of the core static prompt and increases per-query compute and token burn.
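
The caching argument above can be illustrated with a toy prefix cache. This is a minimal sketch under stated assumptions, not a description of OpenAI's actual system: it assumes a cache that can only reuse work for an identical leading run of tokens, and the names `PrefixCache`, `STATIC_PROMPT`, `static_first`, and `router_first` are hypothetical. It shows that when the static prompt comes first, a second request reuses it, while a varying router decision placed ahead of it leaves nothing to reuse.

```python
from typing import List, Set, Tuple


class PrefixCache:
    """Toy cache (hypothetical): remembers every token prefix it has processed."""

    def __init__(self) -> None:
        self._seen: Set[Tuple[str, ...]] = set()

    def process(self, tokens: List[str]) -> int:
        """Return how many leading tokens were already cached, then store this request."""
        hit = 0
        # Find the longest previously seen prefix of this request.
        for n in range(len(tokens), 0, -1):
            if tuple(tokens[:n]) in self._seen:
                hit = n
                break
        # Record every prefix of this request for future lookups.
        for n in range(1, len(tokens) + 1):
            self._seen.add(tuple(tokens[:n]))
        return hit


# Assumed stand-in for the core static prompt (6 tokens).
STATIC_PROMPT = ["You", "are", "a", "helpful", "assistant", "."]


def static_first(user_tokens: List[str]) -> List[str]:
    """Traditional order: static prompt, then the user's text."""
    return STATIC_PROMPT + user_tokens


def router_first(route_tokens: List[str], user_tokens: List[str]) -> List[str]:
    """Assumed router-first order: the routing decision precedes the static prompt."""
    return route_tokens + STATIC_PROMPT + user_tokens


def cached_tokens(requests: List[List[str]]) -> List[int]:
    """Run requests through a fresh cache and report cached tokens per request."""
    cache = PrefixCache()
    return [cache.process(r) for r in requests]


if __name__ == "__main__":
    print(cached_tokens([static_first(["hi"]), static_first(["bye"])]))
    print(cached_tokens([
        router_first(["route:fast"], ["hi"]),
        router_first(["route:thinking"], ["bye"]),
    ]))
```

In this sketch the second static-first request reuses all six static tokens, while the second router-first request reuses none because its first token differs. If the router happened to choose the same route twice, the shared prefix would still be reusable, so the cost described in the episode depends on how much the leading content varies between requests.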
