17 - Training for Very High Reliability with Daniel Ziegler

AXRP - the AI X-risk Research Podcast

CHAPTER

Scalable Oversight

I think it's mainly a sort of high-stakes problem. If you solve scalable oversight and you have a perfect training signal, you can still absolutely get this problem. But conversely, you do need the oversight signal to be good. And I guess, by construction, once you have both scalable oversight and a low enough chance of making catastrophic mistakes, then you're training your system on the right thing, and it never makes a mistake. Does that sound right? I think that's the hope.
