9 - Finite Factored Sets with Scott Garrabrant

AXRP - the AI X-risk Research Podcast

Inferring Which Neural Network Concepts Are Good

There's some work about trying to figure out if concepts in neural networks are on the neurons, or whether they are linear combinations of neurons. Yet it strikes me that, like, inferring which concepts are good is a related but different problem to inferring which concepts a system is using. So I guess I'm concretely pointing at the picture of, like, factorization into neurons in the result of a learned system might be similar to [unclear]. People have definitely thought about this problem, but the work on it seems kind of hacky to me.
