80,000 Hours Podcast

#159 – Jan Leike on OpenAI's massive push to make superintelligence safe in 4 years or less

Aug 7, 2023
Jan Leike, Head of Alignment at OpenAI and leader of the Superalignment project, discusses the ambitious goal of safely developing superintelligent AI within four years. He addresses the challenges of aligning AI with human values and the importance of Reinforcement Learning from Human Feedback (RLHF). Leike expresses guarded optimism about finding solutions to steer AI safely, emphasizing collaboration and innovative approaches in tackling these complex issues. The conversation also highlights recruitment efforts to build a team for this critical initiative.
02:51:20

Podcast summary created with Snipd AI

Quick takeaways

  • The Superalignment project aims to make superintelligent AI systems aligned and safe to use within four years, addressing potential risks and preventing the disempowerment of humanity.
  • The project focuses on two research directions: scalable oversight, such as training AI systems to find and evaluate bugs in code and other hard-to-check tasks, and generalization, ensuring models follow human intent even in complex situations that are difficult for humans to supervise directly.

Deep dives

Automating Alignment Research

The goal of the Superalignment project is to automate alignment research for AI systems. While aligning superintelligent AI directly may be a very difficult problem, the project aims to align the next generation of AI systems, which are closer to human-level capabilities. Once these more tractable systems are aligned, they can be used to help solve the alignment problem for even more advanced systems in the future.
