Four: Rohin Shah on DeepMind and trying to fairly hear out both AI doomers and doubters

The 80000 Hours Podcast on Artificial Intelligence

CHAPTER

Mechanistic Interpretability at DeepMind

This chapter covers why mechanistic interpretability matters for understanding how AI systems produce their outputs, why it is hard to apply to large language models, and how even intermediate progress could help identify failure modes. It also touches on DeepMind's work in this area.

