Dan Hendrycks on Catastrophic AI Risks

Future of Life Institute Podcast

CHAPTER

Representation Engineering versus Mechanistic Interpretability

This chapter compares representation engineering and mechanistic interpretability as approaches to understanding AI systems. It discusses the advantages of the top-down approach taken by representation engineering and emphasizes the need to understand high-level emergent behavior in models.
