17 - Training for Very High Reliability with Daniel Ziegler

AXRP - the AI X-risk Research Podcast

Is There Any Research on Scalable Oversight?

The team at Redwood is working on a set of tasks that can be defined by simple algorithmic predicates. They hope to figure out which kinds of adversarial attacks and training techniques work well in that setting, and then to scale back up to more sophisticated tasks.
