Ep. 129: Applying the 'security mindset' to AI and x-risk | Jeffrey Ladish

FUTURATI PODCAST

The Future of Alignment Research

I get the sense that Eliezer just isn't super on board with any of them, and they all have a bunch of kind of obvious failure modes. I don't feel like anyone has proposed something that's like, yes, this approach could really work. Work I'm excited about: one of the areas is just interpretability, or like mechanistic interpretability, and also anomaly detection. Paul Christiano is doing a bunch of stuff at ARC that I think is pretty interesting. Yeah, we're ready for sure.
