
"(My understanding of) What Everyone in Technical Alignment is Doing and Why" by Thomas Larsen & Eli Lifland
LessWrong (Curated & Popular)
A Paradigm for Interpretability?
As model size scales to AGI, things will become ever less interpretable. If there were an example of networks becoming more interpretable as they got bigger, this would update me. I would love a new paradigm for interpretability, and this team seems like probably the best positioned to find such a paradigm.


