
Don't Worry About the Vase Podcast ChatGPT 5.1 Codex Max
Nov 25, 2025
Zvi Mowshowitz hosts a compelling discussion with two insightful contributors who dive deep into the capabilities of Codex Max. They analyze the system card's findings, highlighting its strengths and weaknesses, particularly the surprising mental-health benchmark. The conversation also covers sandboxing risks, various cybersecurity evaluations, and significant advancements in self-improvement metrics for AI. With fascinating insights on biological threats and the future of software engineering, listeners gain a comprehensive view of this evolving technology.
Codex Max Is A Notable Capability Jump
- GPT-5.1-Codex-Max is presented as a faster, more capable coding model with better persistence on long tasks.
- It sets a new high on capability charts but remains below the most aggressive forecast trajectories.
Engineered For Long, Agentic Coding Tasks
- OpenAI trained Codex Max on agentic software engineering tasks and on 'compaction' to work across millions of tokens.
- The model is explicitly aimed at automated software engineering and long-context coherence.
Limit Network Access In Agent Runs
- Keep network access disabled by default and carefully review outputs before enabling external sites.
- Limit the model to trusted domains and safe HTTP methods to reduce prompt injection and credential leakage risks.
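The advice above can be sketched as a small request gate. This is a minimal illustration, not how Codex or any OpenAI tool implements its sandbox: the function name `is_request_allowed`, the domain list, and the HTTPS requirement are all assumptions made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical policy values, chosen for illustration only.
TRUSTED_DOMAINS = {"pypi.org", "files.pythonhosted.org"}
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}  # read-only HTTP methods


def is_request_allowed(method: str, url: str, network_enabled: bool = False) -> bool:
    """Return True only if the request passes every check in the policy."""
    # Network access is off unless the caller explicitly enables it.
    if not network_enabled:
        return False
    # Reject methods that can change state or carry a request body out.
    if method.upper() not in SAFE_METHODS:
        return False
    parts = urlsplit(url)
    # Assumption beyond the episode notes: also require HTTPS.
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    # Allow exact matches and subdomains of a trusted domain.
    return any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS)
```

With this policy, `is_request_allowed("GET", "https://pypi.org/simple/")` is rejected because the network is disabled by default, and a `POST` to the same URL is rejected even with `network_enabled=True`. Matching on the parsed hostname rather than the raw URL string keeps lookalike hosts such as `pypi.org.evil.example` from passing.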
