The Fear of Stopping Machine Learning Systems

I'm not in search of the magical medical safe reward function I agree to almost everybody else with everybody else that yeah you will not find it humans are fadable they will not know what they want. The only approach you can take is to specify one which is reasonably safe then when you see a mistake stop the computer and adjust it. When you see another thing that you don't like stop the computer again and adjust it as part of your AI safety critters. Even a simple queue learner, even the simplest possible reinforcement learner will actually consider that it might not obey you if you change it.

Transcript

Play full episode

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app