
23 - Mechanistic Anomaly Detection with Mark Xu
AXRP - the AI X-risk Research Podcast
Redwood's Experimental Work on Mechanistic Anomalies
I'm wondering if there's been any experimental work on trying out mechanistic anomaly detection things. I think Redwood is currently working on what they're calling ELK benchmarks, where they're trying to do this sort of mechanism distinction on toy problems like function evaluation. But probably you don't want to call that experimental work, because we're just checking how accurate heuristic estimators for permanents of matrices are, or whatever.