How Does the Robot Work?

Every few seconds, this particular robot makes an attempt at getting the ball into the cup. Every now and then, by chance, the ball lands in the cup, and the robot is rewarded with a positive score. The resets between training episodes, where it untangles itself or flips the ball out of the cup, are prescriptive. But when it actually tries to accomplish the task, it is following a policy that it has taught itself through experience, over time, from the scores it receives.
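
To make that split concrete, here is a minimal toy sketch in Python. It is not the robot's actual training code. It assumes the swing can be reduced to ten discrete strengths, that exactly one of them lands the ball, and that a simple epsilon-greedy learner stands in for whatever method the real system uses. Every name in it (`scripted_reset`, `attempt`, `train`, `TARGET_ACTION`) is hypothetical.

```python
import random

N_ACTIONS = 10      # assumption: swing strength discretised into 10 choices
TARGET_ACTION = 7   # assumption: the one swing that lands the ball in the cup


def scripted_reset(state):
    """Hard-coded reset between episodes; nothing here is learned."""
    state["ball_in_cup"] = False      # flip the ball out of the cup
    state["string_tangled"] = False   # untangle the string
    return state


def attempt(state, action):
    """One attempt at the task; the score is positive only on success."""
    state["ball_in_cup"] = (action == TARGET_ACTION)
    return 1.0 if state["ball_in_cup"] else 0.0


def train(episodes=2000, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    values = [0.0] * N_ACTIONS   # running average score per swing
    counts = [0] * N_ACTIONS
    state = {"ball_in_cup": False, "string_tangled": False}
    for _ in range(episodes):
        state = scripted_reset(state)
        if rng.random() < epsilon:
            # Explore: this is how the ball first lands in the cup by chance.
            action = rng.randrange(N_ACTIONS)
        else:
            # Exploit: use the swing with the best average score so far.
            action = max(range(N_ACTIONS), key=lambda a: values[a])
        reward = attempt(state, action)
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
    return values


values = train()
best = max(range(N_ACTIONS), key=lambda a: values[a])
print("learned swing:", best)
```

The reset is an ordinary function that always does the same thing, while the choice of swing changes over time purely in response to the scores received. A real robot would work with continuous motor commands and a far richer policy, but the division of labor is the same.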
