
Within Reason #131 Toby Ord - Will AI Destroy Humanity?
97 snips
Nov 27, 2025 Toby Ord, a philosopher and author focused on existential risks, dives deep into the potential dangers of AI. He discusses how AI systems can manipulate and even deceive us, highlighting alarming instances like Microsoft's Bing 'Sydney.' Toby reveals four distinct pathways to catastrophic harm involving AI, such as misuse by humans and military applications. He emphasizes the need for proactive policies and international cooperation to mitigate these risks, urging us to prioritize awareness and targeted research to safeguard humanity's future.
AI Snips
Chapters
Books
Transcript
Episode notes
AI Risk Is Now Credible
- AI risk from advanced systems is less speculative now and must be taken seriously alongside other global threats.
- Toby Ord cautions modesty about probabilities but says catastrophic outcomes remain plausible.
Human Data Pulls LLMs Toward Us
- Large language models train on human text and so are pulled toward human-like performance rather than superhuman intelligence.
- That training regime both accelerated progress and creates ceilings around human-level behavior.
Agentic Behavior Emerges With RL
- Reinforcement learning agents act to maximize reward and can develop unexpected policies in environments.
- Early LLMs lacked such agentic aims, but RL from human feedback injected more goal-directed behavior.




