21 - Interpretability for Engineers with Stephen Casper

AXRP - the AI X-risk Research Podcast

The Future of Interpretability Research

Robinson: Lots of interpretability research has been done at relatively small scales with humans in the loop. But there is a gap between this and what we would really need to fix AI, he says. "I think it's probably going to be very, very important and central to highly relevant forms of interpretability."
