The 80000 Hours Podcast on Artificial Intelligence

Two: Ajeya Cotra on accidentally teaching AI models to deceive us

Sep 2, 2023
AI ethics researcher Ajeya Cotra discusses the challenges of judging the trustworthiness of machine learning models, drawing parallels to an orphaned child hiring a caretaker. Cotra explains the risk of AI models exploiting loopholes and the importance of ethical training to prevent deceptive behaviors. The conversation emphasizes the need for understanding and mitigating deceptive tendencies in advanced AI systems.
INSIGHT

Shifting Timelines and Public Perception

  • Ajeya Cotra's AI timeline predictions have shifted: she now sees transformative AI as plausible sooner than she previously expected.
  • Public awareness of AI risk has also grown, leading to more cautious discussions.
INSIGHT

AI's Strengths and Weaknesses

  • Current AI excels at short, complex tasks but struggles with longer, mundane ones.
  • Current systems are superhuman at some things but surprisingly limited at reliably stringing together simple steps.
INSIGHT

Decreased Extinction Concerns

  • Despite expecting transformative AI sooner, Ajeya Cotra is now less concerned about extinction than she was before.
  • Increased public attention to AI risk and promising alignment research contribute to this.