
23 - Mechanistic Anomaly Detection with Mark Xu
AXRP - the AI X-risk Research Podcast
The Importance of Mechanistic Anomaly Detection
I think often during this process I want to make very unrealistic assumptions, so maybe we can suppose that you know how to do this division, or that the reasoning is in some sense transparent to you. Okay, so in that setup, you have this AI, and it's got this goal/planning distinction, where part of its head is figuring out "what are things I could do?" and part of its head is asking "well, if I did that thing, how good would it be?" Sure. Yeah, I mean, I'm sympathetic to that broad move, but in this case, if you care about detecting anomalies in the goal evaluation part but not in the plan generation