
Neel Nanda on Avoiding an AI Catastrophe with Mechanistic Interpretability
Future of Life Institute Podcast
Toy Models of Superposition From Anthropic
A paper from Anthropic called Toy Models of Superposition is pretty exciting. They found that a toy model they created not only learned to use superposition, but also learned these beautiful geometric configurations. And the final work I want to highlight is from OpenAI, called Multimodal Neurons in Artificial Neural Networks. For example, for a drawing of Spider-Man, the name Peter Parker, and a picture of Spider-Man, the same neuron lights up. So it suggests there's some real abstraction going on.