
TalkRL: The Reinforcement Learning Podcast
TalkRL podcast is All Reinforcement Learning, All the Time.
In-depth interviews with brilliant people at the forefront of RL research and practice.
Guests from places like MILA, OpenAI, MIT, DeepMind, Berkeley, Amii, Oxford, Google Research, Brown, Waymo, Caltech, and Vector Institute.
Hosted by Robin Ranjit Singh Chauhan.
Latest episodes

Oct 10, 2019 • 57min
Pablo Samuel Castro
Dr. Pablo Samuel Castro is a Staff Research Software Engineer at Google Brain. He is the main author of the Dopamine RL framework.

Featured References
A Comparative Analysis of Expected and Distributional Reinforcement Learning (Clare Lyle, Pablo Samuel Castro, Marc G. Bellemare)
A Geometric Perspective on Optimal Representations for Reinforcement Learning (Marc G. Bellemare, Will Dabney, Robert Dadashi, Adrien Ali Taiga, Pablo Samuel Castro, Nicolas Le Roux, Dale Schuurmans, Tor Lattimore, Clare Lyle)
Dopamine: A Research Framework for Deep Reinforcement Learning (Pablo Samuel Castro, Subhodeep Moitra, Carles Gelada, Saurabh Kumar, Marc G. Bellemare)
Dopamine RL framework on github
Tensorflow Agents on github

Additional References
Using Linear Programming for Bayesian Exploration in Markov Decision Processes (Pablo Samuel Castro, Doina Precup)
Using bisimulation for policy transfer in MDPs (Pablo Samuel Castro, Doina Precup)
Rainbow: Combining Improvements in Deep Reinforcement Learning (Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver)
Implicit Quantile Networks for Distributional Reinforcement Learning (Will Dabney, Georg Ostrovski, David Silver, Rémi Munos)
A Distributional Perspective on Reinforcement Learning (Marc G. Bellemare, Will Dabney, Rémi Munos)
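Not from the episode itself, but for listeners who want to try Dopamine after hearing this conversation, here is a minimal sketch of launching a stock DQN experiment with the framework's runner API. The base directory and gin config path are illustrative assumptions; check the Dopamine repository for the current layout.

```python
# Minimal Dopamine sketch: build and run a DQN training/eval loop from a
# stock gin configuration. Paths are assumptions; see the repo for details.
from dopamine.discrete_domains import run_experiment

BASE_DIR = '/tmp/dopamine_dqn'  # where logs and checkpoints will be written (assumed)
GIN_FILES = ['dopamine/agents/dqn/configs/dqn.gin']  # stock DQN config (assumed path)

# Load the gin config that defines the agent and environment, then
# construct the runner and launch the experiment.
run_experiment.load_gin_configs(GIN_FILES, gin_bindings=[])
runner = run_experiment.create_runner(BASE_DIR)
runner.run_experiment()
```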

Sep 21, 2019 • 1h 26min
Kamyar Azizzadenesheli
Dr. Kamyar Azizzadenesheli is a postdoctoral scholar at Caltech. His research interests span machine learning from theory to practice, with a main focus on reinforcement learning. He will be joining Purdue University as an Assistant Professor of Computer Science in Fall 2020.

Featured References
Efficient Exploration through Bayesian Deep Q-Networks (Kamyar Azizzadenesheli, Animashree Anandkumar)
Surprising Negative Results for Generative Adversarial Tree Search (Kamyar Azizzadenesheli, Brandon Yang, Weitang Liu, Zachary C. Lipton, Animashree Anandkumar)
Maybe a few considerations in Reinforcement Learning Research? (Kamyar Azizzadenesheli)

Additional References
Model-Based Reinforcement Learning for Atari (Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski)
Near-optimal Regret Bounds for Reinforcement Learning (Thomas Jaksch, Ronald Ortner, Peter Auer)
Curious Model-Building Control Systems (Jürgen Schmidhuber)
Rainbow: Combining Improvements in Deep Reinforcement Learning (Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver)
Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics (Ken Kansky, Tom Silver, David A. Mély, Mohamed Eldawy, Miguel Lázaro-Gredilla, Xinghua Lou, Nimrod Dorfman, Szymon Sidor, Scott Phoenix, Dileep George)
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm (David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis)

Sep 5, 2019 • 35min
Antonin Raffin and Ashley Hill
Antonin Raffin is a researcher at the German Aerospace Center (DLR) in Munich, working in the Institute of Robotics and Mechatronics. His research is on using machine learning to control real robots (because simulation is not enough), with a particular interest in reinforcement learning.

Ashley Hill is doing his thesis on improving control algorithms using machine learning for real-time gain tuning. He works mainly with neuroevolution, genetic algorithms, and of course reinforcement learning, applied to mobile robots. He holds a master's degree in machine learning and a bachelor's in computer science from the Université Paris-Saclay.

Featured References
stable-baselines on github (Ashley Hill, Antonin Raffin, primary authors)
S-RL Toolbox (Antonin Raffin, Ashley Hill, René Traoré, Timothée Lesort, Natalia Díaz-Rodríguez, David Filliat)
Decoupling feature extraction from policy learning: assessing benefits of state representation learning in goal based robotics (Antonin Raffin, Ashley Hill, René Traoré, Timothée Lesort, Natalia Díaz-Rodríguez, David Filliat)

Additional References
Learning to Drive Smoothly in Minutes (Antonin Raffin)
Multimodal SRL (best paper at ICRA): Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks (Michelle A. Lee, Yuke Zhu, Krishnan Srinivasan, Parth Shah, Silvio Savarese, Li Fei-Fei, Animesh Garg, Jeannette Bohg)
Benchmarking Model-Based Reinforcement Learning (Tingwu Wang, Xuchan Bao, Ignasi Clavera, Jerrick Hoang, Yeming Wen, Eric Langlois, Shunshi Zhang, Guodong Zhang, Pieter Abbeel, Jimmy Ba)
TossingBot: Learning to Throw Arbitrary Objects with Residual Physics (Andy Zeng, Shuran Song, Johnny Lee, Alberto Rodriguez, Thomas Funkhouser)
Stable Baselines roadmap
OpenAI baselines
stable-baselines github pull request
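As a companion to the stable-baselines discussion, here is a minimal training sketch using the library's high-level API (the TensorFlow-era stable-baselines 2.x that the guests maintain). The environment, timestep budget, and save path are arbitrary choices for illustration, not anything from the episode.

```python
# Minimal stable-baselines sketch: train PPO on CartPole, then roll out
# the learned policy for one episode. Hyperparameters are illustrative.
import gym
from stable_baselines import PPO2

# Passing an env id as a string lets the library create the env for us.
model = PPO2('MlpPolicy', 'CartPole-v1', verbose=1)
model.learn(total_timesteps=10000)
model.save('ppo2_cartpole')  # assumed save path, for illustration

# Evaluate the trained policy with a plain gym rollout.
env = gym.make('CartPole-v1')
obs, done, total_reward = env.reset(), False, 0.0
while not done:
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, done, info = env.step(action)
    total_reward += reward
print('episode return:', total_reward)
```

The one-line model construction plus `learn()` call is the design the library is known for: a consistent interface across algorithms so swapping PPO2 for, say, SAC is a one-word change.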

Aug 23, 2019 • 1h 12min
Michael Littman
Michael L. Littman is a professor of Computer Science at Brown University. He was elected ACM Fellow in 2018 "For contributions to the design and analysis of sequential decision making algorithms in artificial intelligence".

Featured References
Convergent Actor Critic by Humans (James MacGlashan, Michael L. Littman, David L. Roberts, Robert Tyler Loftin, Bei Peng, Matthew E. Taylor)
People teach with rewards and punishments as communication, not reinforcements (Mark Ho, Fiery Cushman, Michael L. Littman, Joseph Austerweil)
Theory of Minds: Understanding Behavior in Groups Through Inverse Planning (Michael Shum, Max Kleiman-Weiner, Michael L. Littman, Joshua B. Tenenbaum)
Personalized education at scale (Saarinen, Cater, Littman)

Additional References
Michael Littman papers on Google Scholar and Semantic Scholar
Reinforcement Learning on Udacity (Charles Isbell, Michael Littman, Chris Pryby)
Machine Learning on Udacity (Michael Littman, Charles Isbell, Pushkar Kolhe)
Temporal Difference Learning and TD-Gammon (Gerald Tesauro)
Playing Atari with Deep Reinforcement Learning (Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller)
Ask Me Anything about MOOCs (D. Fisher, C. Isbell, M. L. Littman, M. Wollowski, et al.)
Reinforcement Learning and Decision Making (RLDM) Conference
Algorithms for Sequential Decision Making (Michael Littman's thesis)
Machine Learning A Cappella - Overfitting Thriller! (Michael Littman and Charles Isbell feat. Infinite Harmony)
Turbotax Ad 2016: Genius Anna/Michael Littman

Aug 9, 2019 • 50min
Natasha Jaques
Natasha Jaques is a PhD candidate at MIT working on affective and social intelligence. She has interned with DeepMind and Google Brain, and was an OpenAI Scholars mentor. Her paper "Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning" received an honourable mention for best paper at ICML 2019.

Featured References
Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning (Natasha Jaques, Angeliki Lazaridou, Edward Hughes, Caglar Gulcehre, Pedro A. Ortega, DJ Strouse, Joel Z. Leibo, Nando de Freitas)
Tackling climate change with Machine Learning (David Rolnick, Priya L. Donti, Lynn H. Kaack, Kelly Kochanski, Alexandre Lacoste, Kris Sankaran, Andrew Slavin Ross, Nikola Milojevic-Dupont, Natasha Jaques, Anna Waldman-Brown, Alexandra Luccioni, Tegan Maharaj, Evan D. Sherwin, S. Karthik Mukkavilli, Konrad P. Kording, Carla Gomes, Andrew Y. Ng, Demis Hassabis, John C. Platt, Felix Creutzig, Jennifer Chayes, Yoshua Bengio)

Additional References
MIT Media Lab Flight Offsets (Caroline Jaffe, Juliana Cherston, Natasha Jaques)
Modeling Others using Oneself in Multi-Agent Reinforcement Learning (Roberta Raileanu, Emily Denton, Arthur Szlam, Rob Fergus)
Inequity aversion improves cooperation in intertemporal social dilemmas (Edward Hughes, Joel Z. Leibo, Matthew G. Phillips, Karl Tuyls, Edgar A. Duéñez-Guzmán, Antonio García Castañeda, Iain Dunning, Tina Zhu, Kevin R. McKee, Raphael Koster, Heather Roff, Thore Graepel)
Sequential Social Dilemma Games on github (Eugene Vinitsky, Natasha Jaques)
AI Alignment newsletter (Rohin Shah)
Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions (Rui Wang, Joel Lehman, Jeff Clune, Kenneth O. Stanley)
The social function of intellect (Nicholas Humphrey)
Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research (Joel Z. Leibo, Edward Hughes, Marc Lanctot, Thore Graepel)
A Recipe for Training Neural Networks (Andrej Karpathy)
Emotionally Adaptive Intelligent Tutoring Systems using POMDPs (Natasha Jaques)
Sapiens (Yuval Noah Harari)

Aug 1, 2019 • 2min
About TalkRL Podcast: All Reinforcement Learning, All the Time
August 2, 2019

Transcript

The idea with TalkRL Podcast is to hear from brilliant folks from across the world of Reinforcement Learning, both research and applications. As much as possible, I want to hear from them in their own language. I try to get to know as much as I can about their work beforehand. And I'm not here to convert anyone; I want to reach people who are already into RL. So we won't stop to explain what a value function is, for example, though we also won't assume everyone has read the very latest papers.

Why am I doing this? Because it's a great way to learn from the most inspiring people in the field! There's so much happening in the universe of RL, and there are tons of interesting angles and so many fascinating minds to learn from. Now, I know there is no shortage of books, papers, and lectures, but so much goes unsaid. I mean, I guess if you work at MILA or Amii or Vector Institute, you might be having these conversations over coffee all the time, but I live in a little village in the woods in BC, so for me these remote interviews are a great way to have these conversations, and I hope sharing them with the community makes it more worthwhile for everyone.

In terms of format, the first two episodes were longer-form interviews, around an hour long. Going forward, some may be a lot shorter; it depends on the guest.

If you want to be a guest or suggest a guest, go to talkrl.com/about, where you will find a link to a suggestion form. Thanks for listening!