[HUMAN VOICE] "Towards Monosemanticity: Decomposing Language Models With Dictionary Learning" by Zac Hatfield-Dodds

LessWrong (Curated & Popular)

00:00

Introduction

Explores why artificial neural networks are difficult to comprehend, and why recording neuron activations and testing their responses matters for understanding network behavior.
