Don't Worry About the Vase Podcast

ChatGPT 5.1 Codex Max

Nov 25, 2025
Zvi Mowshowitz hosts a discussion with two contributors on the capabilities of Codex Max. They analyze the system card's findings, highlighting the model's strengths and weaknesses, particularly the surprising mental-health benchmark. The conversation also covers sandboxing risks, various cybersecurity evaluations, and significant advancements in self-improvement metrics for AI. It closes with discussion of biological threats and the future of software engineering.
INSIGHT

Codex Max Is A Notable Capability Jump

  • GPT 5.1 Codex Max is presented as a faster, more capable coding model with better persistence on long tasks.
  • It sets a new high on capability charts but remains below the most aggressive forecast trajectories.
INSIGHT

Engineered For Long, Agentic Coding Tasks

  • OpenAI trained Codex Max on agentic software engineering tasks and on 'compaction' to work across millions of tokens.
  • The model is explicitly aimed at automated software engineering and long-context coherence.
ADVICE

Limit Network Access In Agent Runs

  • Keep network access disabled by default and carefully review outputs before enabling external sites.
  • Limit the model to trusted domains and safe HTTP methods to reduce prompt injection and credential leakage risks.
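
One way to picture the advice above is a small policy check that an agent harness could run before any outbound request. This is a minimal sketch, not how Codex or any OpenAI sandbox is actually configured: the function name `is_request_allowed`, the example domains, and the choice of GET, HEAD, and OPTIONS as the "safe" methods are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical policy values, chosen for illustration only.
TRUSTED_DOMAINS = {"pypi.org", "files.pythonhosted.org"}
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}  # read-only methods


def is_request_allowed(method: str, url: str, network_enabled: bool = False) -> bool:
    """Return True only if the request passes every check in the policy."""
    # Network access stays off unless it is explicitly enabled.
    if not network_enabled:
        return False
    # Reject methods that can send data out or change remote state.
    if method.upper() not in SAFE_METHODS:
        return False
    try:
        host = (urlsplit(url).hostname or "").lower()
    except ValueError:
        # Malformed URLs are rejected rather than guessed at.
        return False
    # Allow a trusted domain itself or any of its subdomains.
    return any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS)
```

With these assumed values, `is_request_allowed("GET", "https://pypi.org/simple/")` is rejected because network access defaults to off, and a `POST` to the same host is rejected even when `network_enabled=True`. Matching on the parsed hostname, not on a substring of the URL, keeps a lookalike such as `pypi.org.evil.example` from passing.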