3min chapter

#3: Bandits and Simulators for Recommenders with Olivier Jeunen

Recsperts - Recommender Systems Experts

CHAPTER

How to Model Reinforcement Learning

The main thing that I have been focusing on in my papers is what we will call a click for now. It doesn't really matter whether it's really a click or a different signal. And I think it's very important to think about what you define your reward to be.

But so if we put, let's say, a slight tick mark behind the simulator, even though we maybe need to investigate a bit more which assumptions one needs with regard to the simulator, let's maybe shift our focus to the second component, the reward, or the reward-generating process. How are you modeling this and approaching the problem of granting proper reward to the mechanism that tries to learn?

