23 - Mechanistic Anomaly Detection with Mark Xu

AXRP - the AI X-risk Research Podcast

The Mechanisms of Manipulation of Noise

The thing I think I'm not getting is this idea of manipulating the noise, which seems like a model-dependent thing. Just because your AI wanted there to be a diamond does not imply that the particular action it took will in fact result in there being a diamond. And so you still have to talk about the particular mechanism of action for the particular action your AI decided on.

