SERI 2022: AI alignment and Redwood Research | Buck Shlegeris (CTO)

EA Talks

The Long-Term Deployment-Only Failures Problem

Adversarial training is trying to solve the second problem, where the system does bad things in deployment but not during training. Redwood is currently working on building tools for adversarial training of current systems. They hope their work can be generalized so we don't need such a tool in two years' time.
