Neel Nanda on Avoiding an AI Catastrophe with Mechanistic Interpretability

Future of Life Institute Podcast

Could AIs Out-Compete Systems That Translate to Humans?

If we train systems to interpret more complex systems, those interpreters might be too slow and inefficient. Could you see a world in which a system that gives us very little information, just a thumbs up or a thumbs down, outcompetes them? "I don't think that the question is not if you had two competing auditing companies," he says.
