
23 - Mechanistic Anomaly Detection with Mark Xu
AXRP - the AI X-risk Research Podcast
Heuristic Arguments for Mechanistic Anomalies
The hope is that this is the kind of object we'll need in order to do mechanistic anomaly detection. Yeah. Or the hope is to just use the heuristic estimator plus a heuristic argument in all of the schemes that I was describing previously. And then you ask whether there exists any heuristic argument that explains 99.9% of the variance of your model but does not explain this particular off-distribution data point. If there is one, that's kind of weird. No good. Right.
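The check described here can be illustrated with a toy sketch. This is only a loose analogy, not the heuristic estimator or heuristic arguments discussed in the episode: the model, the finite list of candidate "explanations", the helper names, and the tolerance are all hypothetical, and "explains" is simplified to "predicts the model's output". The 99.9% figure is the only number taken from the conversation.

```python
from statistics import pvariance

def model(x):
    """Toy stand-in for the model being explained (an assumption)."""
    a, b = x
    return a + b

# Hypothetical candidate explanations: each is a simpler account of
# how the model's output arises from its input.
EXPLANATIONS = {
    "uses_both_inputs": lambda x: x[0] + x[1],
    "only_first_input_matters": lambda x: 2 * x[0],
}

def fraction_of_variance_explained(explain, data):
    """1 - (mean squared residual) / (variance of the model's outputs)."""
    outputs = [model(x) for x in data]
    mean_sq_residual = sum((model(x) - explain(x)) ** 2 for x in data) / len(data)
    return 1.0 - mean_sq_residual / pvariance(outputs)

def is_mechanistic_anomaly(x, train, min_fve=0.999, tol=1e-6):
    """Flag x if SOME explanation accounts for at least min_fve of the
    variance on the training data yet fails to account for x."""
    for name, explain in EXPLANATIONS.items():
        explains_training = fraction_of_variance_explained(explain, train) >= min_fve
        explains_point = abs(model(x) - explain(x)) <= tol
        if explains_training and not explains_point:
            return True, name
    return False, None

# On the training distribution both inputs are always equal, so both
# explanations fit the model's behavior perfectly there.
train = [(float(i), float(i)) for i in range(1, 101)]

in_distribution = is_mechanistic_anomaly((5.0, 5.0), train)
off_distribution = is_mechanistic_anomaly((5.0, -3.0), train)
```

In this toy setup the point `(5.0, -3.0)` is flagged because one explanation that fits the training data fails on it, even though another explanation still covers it. That mirrors the "there exists any heuristic argument" phrasing in the excerpt.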