ChatGPT 5.1 Codex Max

Nov 25, 2025

Zvi Mowshowitz hosts a compelling discussion with two insightful contributors who dive deep into the capabilities of Codex Max. They analyze the system card's findings, highlighting its strengths and weaknesses, particularly the surprising mental-health benchmark. The conversation also covers sandboxing risks, various cybersecurity evaluations, and significant advancements in self-improvement metrics for AI. With fascinating insights on biological threats and the future of software engineering, listeners gain a comprehensive view of this evolving technology.

INSIGHT

Codex Max Is A Notable Capability Jump

GPT 5.1 Codex Max is presented as a faster, more capable coding model with better persistence on long tasks.
It sets a new high on capability charts but remains below the most aggressive forecast trajectories.

INSIGHT

Engineered For Long, Agentic Coding Tasks

OpenAI trained Codex Max on agentic software engineering tasks and on 'compaction' to work across millions of tokens.
The model is explicitly aimed at automated software engineering and long-context coherence.

ADVICE

Limit Network Access In Agent Runs

Keep network access disabled by default and carefully review outputs before enabling external sites.
Limit the model to trusted domains and safe HTTP methods to reduce prompt injection and credential leakage risks.
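
As a rough illustration of the advice above, here is a minimal Python sketch of a deny-by-default request gate. Everything in it is hypothetical: the domain list, the `is_request_allowed` helper, and the HTTPS-only rule are assumptions made for the example, not part of Codex or any OpenAI configuration.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the domains you actually trust.
TRUSTED_DOMAINS = {"pypi.org", "files.pythonhosted.org"}

# Read-only HTTP methods; anything that can send a body or change state is refused.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def is_request_allowed(method: str, url: str, network_enabled: bool = False) -> bool:
    """Return True only if the agent may make this outbound request."""
    # Network access is off unless the operator explicitly turns it on.
    if not network_enabled:
        return False
    if method.upper() not in SAFE_METHODS:
        return False
    parts = urlsplit(url)
    # Assumption for this sketch: only HTTPS is permitted.
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    # Accept an exact match or a subdomain of a trusted domain, so that
    # look-alikes such as "pypi.org.evil.example" are rejected.
    return any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS)
```

A wrapper around the agent's HTTP client could call this check before every request and refuse anything that fails, which keeps write-capable methods and unknown hosts out of reach even if a prompt injection asks for them.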
