

#434 - Can We Survive AI?
Sep 16, 2025
Sam Harris chats with Eliezer Yudkowsky, a leading voice in AI alignment, and Nate Soares, president of the Machine Intelligence Research Institute. They delve into their urgent concerns about superintelligent AI and its potential existential threats. The conversation ranges from the alignment problem and the unpredictability of AI behaviors to the myth of controlling advanced systems. They also contemplate the chilling analogy of an uncontrollable tiger cub and stress the need for responsible AI development and regulatory measures. A thought-provoking discussion on our future with AI!
AI Snips
What Alignment Really Means
- Alignment means steering a powerful AI toward the outcomes its creators intended rather than merely getting desired outputs in tests.
- Passing observed tests can hide misaligned internal motivations that would drive different behavior when the system is unobserved.
The Moravec Reversal
- Modern LLMs surprised many by excelling at language tasks that are easy for humans but were long considered hard for machines.
- This upended expectations, shaped by Moravec's paradox, about which capabilities AI would achieve first.
Genie In A Box Didn’t Happen
- Eliezer expected airtight 'genie in a box' testing, but in real development models were connected to the internet during training.
- In practice, companies put useful models online quickly, undermining sandbox assumptions.