
17 - Training for Very High Reliability with Daniel Ziegler

AXRP - the AI X-risk Research Podcast


Scalable Oversight

I think it's mainly a sort of high-stakes problem. If you solve scalable oversight and you have a perfect training signal, you can still absolutely get this problem. But conversely, you do need the oversight signal to be good. And I guess, by construction, once you have both scalable oversight and a low enough chance of making catastrophic mistakes, then you're training your system on the right thing, and it never makes a mistake. Does that sound right? I think that's the hope.

