Latent Space: The AI Engineer Podcast

Scaling Test Time Compute to Multi-Agent Civilizations — Noam Brown, OpenAI

Jun 19, 2025
Noam Brown, who leads the multi-agent team at OpenAI, shares insights from his groundbreaking work in AI, especially in competitive strategy games like poker and Diplomacy. He discusses the impact of AI on human gameplay and critiques the constraints of the System 1/2 thinking model in AI reasoning. The conversation also touches on the limits of test-time compute, multi-agent intelligence, and applications of AI tools like Codex and Windsurf, while pondering the future of AI civilizations.

Noam’s Diplomacy Journey

  • Noam Brown improved his Diplomacy gameplay by deeply studying the game and learning from his AI bot Cicero's unique moves.
  • This process helped him win the 2025 World Diplomacy Championship several years after releasing Cicero.

LLM Advances Enable Realistic Bots

  • Early Diplomacy bots struggled with language quality, which caused hallucinations and inconsistencies.
  • Modern large language models now pass the Turing test, making bots much harder to distinguish from humans.

System 2 Needs Strong System 1

  • The benefits of System 2 (reasoning) thinking only emerge once a model's System 1 capabilities are strong enough.
  • Early small models showed little lift from chain-of-thought prompting compared to bigger models.
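A rough, hypothetical sketch of the comparison this snip describes: the same question prompted with and without a chain-of-thought instruction, across a smaller and a larger model. The model names and the call_llm stub are illustrative placeholders, not any real API or anything shown in the episode.

    # Hypothetical sketch: the same question, prompted with and without
    # chain-of-thought, for a smaller and a larger model. Model names and
    # call_llm are placeholders, not a real API.
    QUESTION = (
        "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than "
        "the ball. How much does the ball cost?"
    )
    DIRECT_PROMPT = QUESTION + "\nAnswer:"
    COT_PROMPT = QUESTION + "\nLet's think step by step, then give the final answer."

    def call_llm(model: str, prompt: str) -> str:
        """Placeholder for whatever LLM client you actually use."""
        return f"[{model} response to: {prompt[:40]}...]"

    for model in ("small-base-model", "large-base-model"):  # hypothetical names
        # The snip's claim: the chain-of-thought prompt only helps once the
        # underlying (System 1) model is capable enough to produce useful steps.
        print(model, "direct:", call_llm(model, DIRECT_PROMPT))
        print(model, "cot:   ", call_llm(model, COT_PROMPT))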