Joe Carlsmith on How We Change Our Minds About AI Risk

Future of Life Institute Podcast

CHAPTER

The Dark Side of Interpretability Research

Is it perhaps dangerous to experiment with trying to get empirical data on these behaviors in AI systems? I'm thinking somewhat analogously to perhaps gain-of-function research in viruses. We do nevertheless need to be finding ways to understand these behaviors, elicit them in safe ways and kind of learn how they can be addressed.

