17 - Training for Very High Reliability with Daniel Ziegler

AXRP - the AI X-risk Research Podcast

Is There a Gradient Barrier to Learning?

It seems like either you've got to be adding more data of, like, human labels or something. When things like this, it's actually going to be dangerous. The question is, could your normal training procedure exploit the same information? N: We sort of had the web interface that I described earlier, where contractors can write some things to try to fool a classifier. But we augmented them in a few ways to make their jobs easier.
