
How Language Models Actually Think
The Data Exchange with Ben Lorica
Finding and Naming Concepts Automatically (from 16:47)
Emmanuel explains automated methods for discovering groups of neurons that correspond to concepts and for validating those groups by intervention.
Transcript


