
17 - Training for Very High Reliability with Daniel Ziegler
AXRP - the AI X-risk Research Podcast
Scalable Oversight
I think it's mainly a sort of high-stakesness problem. If you solve scalable oversight and you have a perfect training signal, you can still absolutely get this problem. But conversely, if you only sort of, you know... you do need the oversight signal to be good. And I guess, by construction, once you have both scalable oversight and a low enough chance of making catastrophic mistakes, then you're training your system on the right thing, and it never makes a mistake. Does that sound right? I think that's the hope.