#3: Bandits and Simulators for Recommenders with Olivier Jeunen

Recsperts - Recommender Systems Experts

Reward Engineering: An Open Problem for the RecSys Space

In reinforcement learning, you have a certain plan to get to a reward. In bandit learning, the reward is going to be somewhat instantaneous. We don't want to just optimize for short-term results and increase the clickbait effect. For long-term rewards, I would say that reinforcement learning is the main tool that you need to use.
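
To make the contrast concrete, here is a minimal sketch, not something taken from the episode. It assumes two hypothetical recommendation strategies with made-up per-step rewards and an arbitrary discount factor `gamma`. A bandit-style view scores only the instantaneous reward of the first decision, while a reinforcement-learning-style view scores the discounted sum of rewards over the whole interaction.

```python
def immediate_reward(rewards):
    """Bandit-style view: only the instantaneous reward of the decision counts."""
    return rewards[0]


def discounted_return(rewards, gamma=0.9):
    """RL-style view: discounted sum of rewards along the whole trajectory."""
    total = 0.0
    # Accumulate from the last step backwards: G_t = r_t + gamma * G_{t+1}
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical, made-up reward sequences (1.0 = user engaged at that step).
clickbait = [1.0, 0.0, 0.0, 0.0]  # immediate click, then the user disengages
quality = [0.0, 1.0, 1.0, 1.0]    # no immediate click, sustained engagement later

print("immediate:", immediate_reward(clickbait), immediate_reward(quality))
print("long-term:", discounted_return(clickbait), discounted_return(quality))
```

Under these assumed numbers, the instantaneous view prefers the clickbait strategy, while the discounted return prefers the strategy that keeps the user engaged over time.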
