The Pros and Cons of Negative Feedback From AI Systems

I would actually not recommend that anyone plow ahead assuming that, by default, human feedback will produce systems that at least don't do extremely horrible things. I think there's an underrated case, though, that if you just keep selecting systems for not behaving violently, then they just won't. And so doing anything like a term is only worth it when I have the capability to do something that maybe just requires, like, a wild amount of power.
