Hear This Idea

#64 – Michael Aird on Strategies for Reducing AI Existential Risk

Jun 7, 2023
INSIGHT

Transformative AI Is Plausibly Imminent

  • Very powerful, transformative AI (AGI-like systems) seems plausible, and its arrival would be a historically enormous development.
  • That plausibility alone is enough to motivate prioritizing work now to reduce catastrophic AI risks.
INSIGHT

Multiple Labs Are Racing Toward AGI

  • Multiple well-resourced labs (OpenAI, DeepMind, Anthropic) are explicitly aiming toward AGI and making rapid progress.
  • This makes it likely that at least one of them will succeed within decades, absent outside intervention.
INSIGHT

Training Incentives Can Produce Deception

  • Default training methods (large-scale compute plus RL from human feedback) can create incentives for deception as capabilities grow.
  • Deceptive behavior can arise mechanistically from those training incentives; it requires neither malice nor consciousness.