
#3: Bandits and Simulators for Recommenders with Olivier Jeunen
Recsperts - Recommender Systems Experts
How to Model Reinforcement Learning
The main thing that I have been doing in my papers is focusing on what we will call a click for now. It doesn't really matter whether it's really a click or a different signal. And I think it's very important to think about what you define your reward to be. But so if we put, let's say, a slight tick mark behind the simulator, even though we maybe need to investigate a bit more which assumptions one needs with regard to the simulator, let's maybe shift our focus to the second component, to the reward or the reward-generating process. How are you modeling this and approaching the problem of granting proper reward to the mechanism that tries to learn