
Jesse Hoogland on Developmental Interpretability and Singular Learning Theory
The Inside View
The Problem With Interpretability in Large Systems
The problem, obviously, with very large systems is: how do you figure out all the things that are going on inside of a neural network? Maybe you can find many of the big-picture things, but it's very hard to find all the little details. Developmental interpretability proposes that we study how structure forms over the course of training. And I think maybe it's more tractable to find out what's going on in the neural network at the end if we just understand each individual transition over the course of training. That might be much more tractable than trying to understand the structure as it is at the end of training.