The Future of Alignment Research
I get the sense that Eliezer just isn't super on board with any of them, and they all have a bunch of kind of obvious failure modes. I don't feel like anyone has proposed something that's like, yes, this approach could really work. Work I'm excited about: one of the areas is just interpretability, or like mechanistic interpretability, and also anomaly detection. Paul Christiano is doing a bunch of stuff at ARC that I think is pretty interesting. Yeah, we're ready for sure.