21 - Interpretability for Engineers with Stephen Casper

AXRP - the AI X-risk Research Podcast

The AI Safety Interpretability Community

I think the AI safety interpretability community is young and small, and some of its shortcomings may simply be inevitable given its size. One of these is the relative lack of focus on intrinsic interpretability tools. I also think its members are a little too eager to start new things, and sometimes to rename existing ones. This could isolate researchers who are working on the same thing under different names.
