80,000 Hours Podcast

#151 – Ajeya Cotra on accidentally teaching AI models to deceive us

May 12, 2023
Ajeya Cotra, a Senior Research Analyst at Open Philanthropy specializing in AI alignment, explores the intricate relationship between humans and artificial intelligence. She likens training an AI to an orphaned child hiring a guardian, highlighting the risks of deception and misalignment. The discussion covers the evolving capabilities of AI, the nuances of situational awareness, and the ethical complexities of AI decision-making. Cotra emphasizes the need for responsible oversight and innovative training methods to ensure AI models align with human values.
INSIGHT

Shift to Grantmaking

  • Ajeya Cotra is shifting her focus towards grantmaking in AI alignment research.
  • She aims to identify key research areas, address gaps, and fund promising projects.
INSIGHT

Accelerated Timelines

  • Ajeya Cotra's views on AI timelines, along with public opinion, have shifted toward shorter timelines.
  • She now finds herself arguing against overly optimistic interpretations of AI progress.
INSIGHT

AI's Strengths and Weaknesses

  • Current AI excels at some complex tasks but struggles with mundane, multi-step processes.
  • This discrepancy can lead people to overestimate AI's overall capabilities.