80,000 Hours Podcast

#151 – Ajeya Cotra on accidentally teaching AI models to deceive us

May 12, 2023
Ajeya Cotra, a Senior Research Analyst at Open Philanthropy with expertise in AI alignment, explores the intricate relationship between humans and artificial intelligence. She likens training AI to an orphaned child hiring a guardian, pointing out the risks of deception and misalignment. The discussion includes the evolving capabilities of AI, the nuances of situational awareness, and the ethical complexities in AI's decision-making. Cotra emphasizes the need for responsible oversight and innovative training to ensure AI models align with human values.
02:49:40
