#3: Bandits and Simulators for Recommenders with Olivier Jeunen

Recsperts - Recommender Systems Experts

CHAPTER

How to Model Reinforcement Learning

The main thing that I have been focusing on in my papers is what we will call a click for now. It doesn't really matter whether it's actually a click or a different signal, but I think it's very important to think about what you define your reward to be. So if we put, let's say, a tentative tick mark behind the simulator, even though we may need to investigate a bit more which assumptions one needs with regard to the simulator, let's maybe shift our focus to the second component: the reward, or rather the reward-generating process. How are you modeling this and approaching the problem of granting proper reward to the mechanism that tries to learn
