Joe Carlsmith on How We Change Our Minds About AI Risk

Future of Life Institute Podcast

The Dark Side of Interpretability Research

Is it perhaps dangerous to experiment with trying to get empirical data on these behaviors in AI systems? I'm thinking somewhat analogously to perhaps gain-of-function research on viruses. We do nevertheless need to be finding ways to understand these behaviors, elicit them in safe ways, and kind of learn how they can be addressed.

