#3: Bandits and Simulators for Recommenders with Olivier Jeunen

Recsperts - Recommender Systems Experts

CHAPTER

Reward Engineering: An Open Problem for the RecSys Space

In reinforcement learning, you follow a plan over several steps to get to a reward. In bandit learning, the reward is more or less instantaneous. We don't want to optimize only for short-term results and amplify the clickbait effect. For long-term rewards, I would say that reinforcement learning is the main tool you need to use.
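
To make the contrast concrete, here is a minimal sketch in Python of the two objectives. The reward sequences, the discount factor of 0.9, and the function names are illustrative assumptions, not something taken from the episode.

```python
def immediate_reward(rewards):
    """Bandit-style objective: only the reward observed right after the action."""
    return rewards[0]


def discounted_return(rewards, gamma=0.9):
    """RL-style objective: sum of gamma**t * r_t over the whole trajectory."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total


# Hypothetical per-step rewards (assumed numbers, e.g. clicks or engagement).
# A clickbait item wins the first click but the user disengages afterwards;
# a quality item earns less at first but keeps the user coming back.
clickbait = [1.0, 0.0, 0.0, 0.0]
quality = [0.5, 0.5, 0.5, 0.5]

print(immediate_reward(clickbait), immediate_reward(quality))
print(discounted_return(clickbait), discounted_return(quality))
```

Under these assumed numbers, the immediate reward favors the clickbait item, while the discounted return favors the quality item. That is the short-term versus long-term tension the chapter describes.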
