

TalkRL: The Reinforcement Learning Podcast
Robin Ranjit Singh Chauhan
TalkRL podcast is All Reinforcement Learning, All the Time.
In-depth interviews with brilliant people at the forefront of RL research and practice.
Guests from places like MILA, OpenAI, MIT, DeepMind, Berkeley, Amii, Oxford, Google Research, Brown, Waymo, Caltech, and Vector Institute.
Hosted by Robin Ranjit Singh Chauhan.
Episodes

Apr 8, 2024 • 40min
Vincent Moens on TorchRL
Vincent Moens, Applied ML Research Scientist at Meta and author of TorchRL, discusses the design philosophy behind TorchRL and the challenges of creating a versatile reinforcement learning library. He also shares his research journey from medicine to ML, describes how perceptions of RL have evolved in the AI community, and encourages active engagement in the open-source community.

Mar 25, 2024 • 34min
Arash Ahmadian on Rethinking RLHF
Arash Ahmadian discusses preference training in language models, exploring methods like PPO. The episode covers the REINFORCE Leave-One-Out (RLOO) method, REINFORCE versus vanilla policy gradient in deep RL, and token-level actions. Reward structures and optimization techniques in RLHF are also explored, with emphasis on the importance of curated reward signals.

Mar 11, 2024 • 22min
Glen Berseth on RL Conference
Glen Berseth is an assistant professor at the Université de Montréal, a core academic member of Mila - Quebec AI Institute, a Canada CIFAR AI Chair, a member of l'Institut Courtois, and co-director of the Robotics and Embodied AI Lab (REAL).
Featured Links
Reinforcement Learning Conference
Closing the Gap between TD Learning and Supervised Learning -- A Generalisation Point of View. Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach

Mar 7, 2024 • 1h 8min
Ian Osband
Ian Osband, a research scientist at OpenAI, discusses information theory and RL, joint predictions, and Epistemic Neural Networks. The conversation explores challenges in reinforcement learning, handling uncertainty, and balancing exploration and exploitation. It also covers the importance of joint predictive distributions, approximations to Thompson sampling, and uncertainty frameworks in Large Language Models (LLMs).

Feb 12, 2024 • 41min
Sharath Chandra Raparthy
Sharath Chandra Raparthy, an AI Resident at FAIR at Meta, discusses in-context learning for sequential decision tasks, training models to adapt to unseen tasks and randomized environments, the properties of data that enable in-context learning, burstiness and trajectories in transformers, and the use of GFlowNets for sampling from complex distributions.

Nov 13, 2023 • 57min
Pierluca D'Oro and Martin Klissarov
Pierluca D'Oro and Martin Klissarov discuss their recent work, 'Motif: Intrinsic Motivation from Artificial Intelligence Feedback', and its application in NetHack. They also explore the similarities between RL and learning from preferences, the challenges of training an RL agent for NetHack, the gap between RL and language models, and the difference between return and loss landscapes in RL.

Aug 22, 2023 • 1h 14min
Martin Riedmiller
Martin Riedmiller, a research scientist and team lead at DeepMind, discusses using reinforcement learning to control the magnetic field in a fusion reactor. The episode explores challenges in the tokamak project, reward design, designing actor and critic networks, the DQN and NFQ algorithms, the importance of explainability in RL systems, and the horde architecture for collecting experience.

Aug 8, 2023 • 1h 10min
Max Schwarzer
Max Schwarzer is a PhD student at Mila, with Aaron Courville and Marc Bellemare, interested in RL scaling, representation learning for RL, and RL for science. Max spent the last 1.5 years at Google Brain/DeepMind, and is now at Apple Machine Learning Research.
Featured References
Bigger, Better, Faster: Human-level Atari with human-level efficiency. Max Schwarzer, Johan Obando-Ceron, Aaron Courville, Marc Bellemare, Rishabh Agarwal, Pablo Samuel Castro
Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier. Pierluca D'Oro, Max Schwarzer, Evgenii Nikishin, Pierre-Luc Bacon, Marc G Bellemare, Aaron Courville
The Primacy Bias in Deep Reinforcement Learning. Evgenii Nikishin, Max Schwarzer, Pierluca D'Oro, Pierre-Luc Bacon, Aaron Courville
Additional References
Rainbow: Combining Improvements in Deep Reinforcement Learning, Hessel et al 2017
When to use parametric models in reinforcement learning? Hasselt et al 2019
Data-Efficient Reinforcement Learning with Self-Predictive Representations, Schwarzer et al 2020
Pretraining Representations for Data-Efficient Reinforcement Learning, Schwarzer et al 2021

Jul 25, 2023 • 40min
Julian Togelius
Julian Togelius is an Associate Professor of Computer Science and Engineering at NYU, and cofounder and research director at modl.ai.
Featured References
Choose Your Weapon: Survival Strategies for Depressed AI Academics. Julian Togelius, Georgios N. Yannakakis
Learning Controllable 3D Level Generators. Zehua Jiang, Sam Earle, Michael Cerny Green, Julian Togelius
PCGRL: Procedural Content Generation via Reinforcement Learning. Ahmed Khalifa, Philip Bontrager, Sam Earle, Julian Togelius
Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation. Niels Justesen, Ruben Rodriguez Torrado, Philip Bontrager, Ahmed Khalifa, Julian Togelius, Sebastian Risi

May 8, 2023 • 1h 4min
Jakob Foerster
Jakob Foerster on multi-agent learning, cooperation vs. competition, emergent communication, zero-shot coordination, opponent shaping, agents for Hanabi and Prisoner's Dilemma, and more. Jakob Foerster is an Associate Professor at the University of Oxford.
Featured References
Learning with Opponent-Learning Awareness. Jakob N. Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch
Model-Free Opponent Shaping. Chris Lu, Timon Willi, Christian Schroeder de Witt, Jakob Foerster
Off-Belief Learning. Hengyuan Hu, Adam Lerer, Brandon Cui, David Wu, Luis Pineda, Noam Brown, Jakob Foerster
Learning to Communicate with Deep Multi-Agent Reinforcement Learning. Jakob N. Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson
Adversarial Cheap Talk. Chris Lu, Timon Willi, Alistair Letcher, Jakob Foerster
Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning. Yat Long Lo, Christian Schroeder de Witt, Samuel Sokota, Jakob Nicolaus Foerster, Shimon Whiteson
Additional References
Lectures by Jakob on YouTube