Is There Any Research on Scalable Oversight?

The team at Redwood is working on a set of tasks that can be defined by simple algorithmic predicates. They hope to figure out which kinds of adversarial attacks and training techniques work well in that setting, and then to scale back up to more sophisticated tasks.
