How Do You Train the AI?

In our Chocolate Bar application, the feedback channel was very easy to describe. The goal is to place the product on the outlet conveyor belt within a certain position range so that the plastic bag packaging machine can work in fixed time intervals. Placing the chocolate bar at the correct position on the last conveyor belt therefore yields the highest possible reward for our reinforcement learning algorithm. In turn, the farther from that position the chocolate bar is placed, the lower our reward signal becomes.
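To make the reward idea concrete, here is a minimal sketch of a distance-based reward function. It is not the speaker's implementation: the function name `placement_reward`, the millimetre units, the tolerance window, and the linear falloff are all assumptions chosen for illustration. The episode only states that the correct position gives the highest reward and that the reward drops as the bar lands farther away.

```python
def placement_reward(
    position_mm: float,
    target_mm: float = 0.0,      # assumed target position on the outlet belt
    tolerance_mm: float = 5.0,   # assumed window that counts as "correct"
    falloff_mm: float = 50.0,    # assumed distance over which reward decays to 0
) -> float:
    """Hypothetical reward: 1.0 inside the target window, decaying with distance."""
    error = abs(position_mm - target_mm)
    if error <= tolerance_mm:
        # Bar landed inside the window: highest possible reward.
        return 1.0
    # Outside the window the reward shrinks linearly and is floored at 0.
    return max(0.0, 1.0 - (error - tolerance_mm) / falloff_mm)


if __name__ == "__main__":
    for pos in (0.0, 4.0, 30.0, 100.0):
        print(pos, placement_reward(pos))
```

A linear falloff is only one option. Any function that peaks at the target and decreases with distance would match the description; which shape trains best depends on the actual plant and algorithm.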
