TalkRL: The Reinforcement Learning Podcast

Pierluca D'Oro and Martin Klissarov

Nov 13, 2023
Pierluca D'Oro and Martin Klissarov discuss their recent work, 'Motif: Intrinsic Motivation from AI Feedback', and its application to NetHack. They also explore the similarities between RL and learning from preferences, the challenges of training an RL agent for NetHack, the gap between RL and language models, and the difference between return landscapes and loss landscapes in RL.