
Pierluca D'Oro
PhD student at Mila and visiting researcher at Meta.
Best podcasts with Pierluca D'Oro

Nov 13, 2023 • 57min
Pierluca D'Oro and Martin Klissarov
Pierluca D'Oro and Martin Klissarov discuss their recent work, 'Motif: Intrinsic Motivation from AI Feedback', and its application to NetHack. They also explore the similarities between reinforcement learning (RL) and learning from preferences, the challenges of training an RL agent for NetHack, the gap between RL and language models, and the difference between return landscapes and loss landscapes in RL.