The Importance of Reinforcement Learning in Reward Hacking

I'm still trying to drill down on how much I disagree with you. How much are your examples meant to show the end of the world versus just a severe problem? RL algorithms have shown in the past that they can engage in reward hacking. There's that famous video game example, though I don't know which game it was, where the agent spun around in a circle in some way to maximize points. That is a very different problem from doing the same thing and then waking up to find that this system has taken over the world.
