Controlling the Agent's Behavior With Extrinsic Rewards
If you let the agent make up its own reward function, then you're in sort of unknown territory, right? How do you get the agent to learn to do what you want? That's where you get into issues of safety. That doesn't mean there aren't safety concerns with extrinsic rewards, as many people have reminded us, but the safety concerns if the agent is making up its own value function, trying to satisfy something like "let's just take control," are much more severe.
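To make the distinction concrete, here is a minimal sketch, not from the episode itself: a tabular Q-learning update where the only difference between the two regimes is where the scalar reward comes from. All the names (GOAL_STATE, the novelty bonus) are illustrative assumptions, not anything the speakers specified.

```python
from collections import defaultdict

ACTIONS = (0, 1)
GOAL_STATE = 5  # chosen by the designer: the extrinsic objective

Q = defaultdict(float)

def q_update(state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Standard Q-learning update; the agent optimizes whatever reward it is fed."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

# Extrinsic reward: written down by us, so we know exactly what the agent
# is being asked to do and can inspect or change it.
def extrinsic_reward(next_state):
    return 1.0 if next_state == GOAL_STATE else 0.0

# Intrinsic reward: manufactured by the agent itself (here a toy novelty
# bonus that pays more for rarely visited states). The objective is no
# longer something we specified, which is the "unknown territory" above.
visit_counts = defaultdict(int)

def intrinsic_reward(next_state):
    visit_counts[next_state] += 1
    return 1.0 / visit_counts[next_state]
```

In the extrinsic case the learned behavior stays tied to an objective a human wrote and can audit; in the intrinsic case the same update rule chases a signal the agent generates for itself, which is why the safety question changes character.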