23 - Mechanistic Anomaly Detection with Mark Xu

AXRP - the AI X-risk Research Podcast

CHAPTER

Heuristic Arguments for Mechanistic Anomalies

The hope is that this is the kind of object we'll need in order to do mechanistic anomaly detection. Or rather, the hope is to use the heuristic estimator plus heuristic arguments in all of the schemes I was describing previously. You then ask whether there exists any heuristic argument that explains 99.9% of the variance of your model but does not explain this particular off-distribution data point. If one exists, that's kind of weird. No good.
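To make the shape of that check concrete, here is a toy sketch in Python. It is not the heuristic estimator discussed in the episode, which has no standard implementation. It assumes a linear model over independent, zero-mean features, and treats each subset of features as a stand-in "argument". The names `explained_fraction`, `partial_output`, and `is_mechanistic_anomaly`, as well as the `tol` parameter, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def explained_fraction(weights, variances, subset):
    """Fraction of output variance captured by the features in `subset`.

    Assumes independent, zero-mean features, so the output variance of a
    linear model decomposes as a sum of w_i^2 * var_i terms.
    """
    total = sum(w * w * v for w, v in zip(weights, variances))
    kept = sum(weights[i] ** 2 * variances[i] for i in subset)
    return kept / total


def partial_output(weights, x, subset):
    """Model output predicted using only the features in `subset`."""
    return sum(weights[i] * x[i] for i in subset)


def is_mechanistic_anomaly(weights, variances, x, threshold=0.999, tol=0.05):
    """Return True if some toy 'argument' (feature subset) explains at least
    `threshold` of the training variance yet fails to account for the
    model's output on `x` to within `tol` (an assumed tolerance)."""
    n = len(weights)
    full = partial_output(weights, x, range(n))
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            if explained_fraction(weights, variances, subset) < threshold:
                continue  # this subset is not a good enough explanation
            if abs(full - partial_output(weights, x, subset)) > tol:
                return True  # explains the distribution, but not this point
    return False


# Toy model: the third feature barely varies on the training distribution.
weights = [1.0, 1.0, 1.0]
variances = [1.0, 1.0, 1e-4]

typical_point = [0.5, -1.2, 0.01]
unusual_point = [0.5, -1.2, 3.0]  # third feature far outside its usual range

print(is_mechanistic_anomaly(weights, variances, typical_point))
print(is_mechanistic_anomaly(weights, variances, unusual_point))
```

In this toy, the subset containing only the first two features explains more than 99.9% of the variance, so it counts as a sufficient explanation of typical behavior. The unusual point is flagged because its output depends heavily on the third feature, which that explanation ignores.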
