
Neel Nanda on Avoiding an AI Catastrophe with Mechanistic Interpretability
Future of Life Institute Podcast
Could AIs Out-Compete Systems That Translate to Humans?
If we train systems to interpret more complex systems, those interpreters might be too slow and inefficient. Could you see a world in which a system that gives us very little information, just a thumbs up or a thumbs down, out-competes them? The reply turns to the case of two competing auditing companies.