Estimating online performance offline requires logged data that records which recommendations were shown and which were clicked. By replaying these logs, one can gauge how well offline estimates mimic real online outcomes. Bandit learning, in turn, is easiest to apply when the action space is modest, so that rewards can be maximized effectively. Simulation environments play a pivotal role here: they reveal how changes to the setting affect learning algorithms and provide valuable insight into how the methods actually work.
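The offline-estimation idea above can be illustrated with a minimal inverse propensity scoring (IPS) sketch. This is an illustrative toy, not the method discussed in the episode: it assumes a log of interactions where each entry records the logging policy's propensity for the shown recommendation, whether it was clicked, and the probability a candidate policy would assign to the same recommendation.

```python
import numpy as np

def ips_estimate(logged_propensities, clicks, target_probs):
    """Estimate the click reward a candidate policy would earn online,
    using only data logged under a different (logging) policy."""
    # Importance weight: how much more (or less) likely the candidate
    # policy is to show the logged recommendation than the logger was.
    weights = target_probs / logged_propensities
    # The reweighted average click reward approximates online performance.
    return np.mean(weights * clicks)

# Toy log of 4 interactions (all numbers hypothetical):
props = np.array([0.5, 0.25, 0.25, 0.5])    # logging-policy propensities
clicks = np.array([1, 0, 1, 0])             # observed click rewards
target = np.array([0.8, 0.1, 0.6, 0.2])     # candidate-policy probabilities
print(ips_estimate(props, clicks, target))
```

The candidate policy up-weights the interactions that led to clicks, so its estimated reward exceeds the logging policy's raw click rate — exactly the kind of counterfactual comparison an offline estimator is meant to support.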
What differentiates recommendation from advertising is the payment structure: personalized services such as Spotify playlists are paid for by the users they serve, which keeps them aligned with user engagement. Advertisements also compete for user attention, but they are funded by advertisers. While recommendations aim to enrich the user experience, advertisements prioritize visibility and engagement on behalf of sellers, leading to distinct approaches despite occasional overlaps.
Evaluating recommender systems is hard because it requires reward metrics that align with genuine user satisfaction rather than superficial engagement signals such as clicks. Engineering rewards for long-term user delight remains an open dilemma, since user preferences are complex to measure. Here too, simulation environments play a vital role: they make it possible to test different models and understand their effectiveness, bridging the gap between theoretical assumptions and practical implementation in the recommender systems landscape.
The podcast also delves into a challenge in which participants had to build an agent for a RecoGym setting, learning from logs of recommendations to define a policy for interacting with users. Teams submitted their code to compete in a simulated A/B test against the other methods. Two critical strategies stood out: variance penalization to handle the bias-variance trade-off on minimal training data, and continuing to learn during the A/B test itself to improve the policy.
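The variance-penalization strategy can be sketched as a counterfactual-risk-minimization-style objective: rather than scoring a policy by its plain importance-weighted reward estimate, subtract a penalty on that estimate's sampling uncertainty. This is a hedged toy illustration, not the challenge-winning code; the penalty weight `lam` and all data are hypothetical.

```python
import numpy as np

def variance_penalized_value(weights, rewards, lam=1.0):
    """Mean importance-weighted reward minus a penalty on its standard
    error. With little training data, a policy whose estimate is slightly
    lower but far less variable is preferred over a high-variance one."""
    samples = weights * rewards
    n = len(samples)
    penalty = lam * np.sqrt(samples.var(ddof=1) / n)
    return samples.mean() - penalty

# Two candidate policies evaluated on the same toy log of 4 interactions:
rewards = np.array([1.0, 0.0, 1.0, 0.0])
stable = np.array([1.0, 1.0, 1.0, 1.0])   # low-variance importance weights
risky = np.array([3.0, 0.1, 0.1, 0.1])    # high-variance importance weights
print(variance_penalized_value(stable, rewards))
print(variance_penalized_value(risky, rewards))
```

The risky policy has the higher raw mean estimate, but its penalized value falls below the stable policy's — the kind of conservative choice that helps when the training log is small.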
The discussion highlights the distinction between fully organic user interactions and bandit feedback, where the system influences user choices. Bandit feedback makes it possible to optimize actions for reward, but it requires substantial data; fully organic interactions provide unbiased data, but with weaker signals. Recognizing and leveraging both feedback types remains a key challenge for recommender systems, balancing the exploitation of user preferences against the reduction of system biases.
In episode three I am joined by Olivier Jeunen, a postdoctoral scientist at Amazon. Olivier obtained his PhD from the University of Antwerp with his thesis "Offline Approaches to Recommendation with Online Success". His work concentrates on bandits, reinforcement learning, and causal inference for recommender systems.
We talk about methods for evaluating the online performance of recommender systems in an offline fashion, based on rich logging data. These methods stem from fields like bandit theory and reinforcement learning, and they rely heavily on simulators, whose benefits, requirements, and limitations we discuss in greater detail. We further discuss the differences between organic and bandit feedback, as well as what sets recommenders apart from advertising. We also talk about the right target for optimization, and receive some advice on continuing lifelong learning as a researcher, be it in academia or industry.
Olivier has published multiple papers at RecSys, NeurIPS, WSDM, UMAP, and WWW. He also won the RecoGym challenge with his team from the University of Antwerp. With research internships at Criteo, Facebook, and Spotify Research, he brings significant experience to the table.
Enjoy this enriching episode of RECSPERTS - Recommender Systems Experts.
Links from this Episode:
Thesis and Papers:
General Links: