The AI Safety Interpretability Community

I think the AI safety interpretability community is young and small, and some of its shortcomings may be inevitable given its size. The relative lack of focus on intrinsic interpretability tools is one of these. I also think the community is a little too eager to start things up and sometimes to rename them, which could isolate researchers who are working on the same thing under different names.

