TalkRL: The Reinforcement Learning Podcast

Robin Ranjit Singh Chauhan
Oct 18, 2022 • 44min

John Schulman

John Schulman is a cofounder of OpenAI, and currently a researcher and engineer at OpenAI.

Featured References

WebGPT: Browser-assisted question-answering with human feedback
Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John Schulman

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, John Schulman, Jacob Hilton, Fraser Kelton, Luke Miller, Maddie Simens, Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan Lowe

Additional References

Our approach to alignment research, OpenAI 2022
Training Verifiers to Solve Math Word Problems, Cobbe et al 2021
UC Berkeley Deep RL Bootcamp Lecture 6: Nuts and Bolts of Deep RL Experimentation, John Schulman 2017
Proximal Policy Optimization Algorithms, Schulman 2017
Optimizing Expectations: From Deep Reinforcement Learning to Stochastic Computation Graphs, Schulman 2016
Aug 19, 2022 • 35min

Sven Mika

Sven Mika is the Reinforcement Learning Team Lead at Anyscale, and lead committer of RLlib. He holds a PhD in biomathematics, bioinformatics, and computational biology from Witten/Herdecke University.

Featured References

RLlib Documentation: RLlib: Industry-Grade Reinforcement Learning
Ray: Documentation

RLlib: Abstractions for Distributed Reinforcement Learning
Eric Liang, Richard Liaw, Philipp Moritz, Robert Nishihara, Roy Fox, Ken Goldberg, Joseph E. Gonzalez, Michael I. Jordan, Ion Stoica

Episode sponsor: Anyscale

Ray Summit 2022 is coming to San Francisco on August 23-24. Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib. Register at raysummit.org and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.
Aug 16, 2022 • 1h 3min

Karol Hausman and Fei Xia

Karol Hausman is a Senior Research Scientist at Google Brain and an Adjunct Professor at Stanford working on robotics and machine learning. Karol is interested in enabling robots to acquire general-purpose skills with minimal supervision in real-world environments.

Fei Xia is a Research Scientist with Google Research. Fei Xia is mostly interested in robot learning in complex and unstructured environments. Previously he has been approaching this problem by learning in realistic and scalable simulation environments (GibsonEnv, iGibson). Most recently, he has been exploring using foundation models for those challenges.

Featured References

Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
Michael Ahn, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Daniel Ho, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Eric Jang, Rosario Jauregui Ruano, Kyle Jeffrey, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Kuang-Huei Lee, Sergey Levine, Yao Lu, Linda Luu, Carolina Parada, Peter Pastor, Jornell Quiambao, Kanishka Rao, Jarek Rettinghouse, Diego Reyes, Pierre Sermanet, Nicolas Sievers, Clayton Tan, Alexander Toshev, Vincent Vanhoucke, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Mengyuan Yan

Inner Monologue: Embodied Reasoning through Planning with Language Models
Wenlong Huang, Fei Xia, Ted Xiao, Harris Chan, Jacky Liang, Pete Florence, Andy Zeng, Jonathan Tompson, Igor Mordatch, Yevgen Chebotar, Pierre Sermanet, Noah Brown, Tomas Jackson, Linda Luu, Sergey Levine, Karol Hausman, Brian Ichter

Additional References

Large-scale simulation for embodied perception and robot learning, Xia 2021
QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation, Kalashnikov et al 2018
MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale, Kalashnikov et al 2021
ReLMoGen: Leveraging Motion Generation in Reinforcement Learning for Mobile Manipulation, Xia et al 2020
Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills, Chebotar et al 2021
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language, Zeng et al 2022

Episode sponsor: Anyscale

Ray Summit 2022 is coming to San Francisco on August 23-24. Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib. Register at raysummit.org and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.
Aug 1, 2022 • 1h 8min

Sai Krishna Gottipati

Sai Krishna Gottipati is an RL Researcher at AI Redefined, working on RL, MARL, and human-in-the-loop learning.

Featured References

Cogment: Open Source Framework For Distributed Multi-actor Training, Deployment & Operations
AI Redefined, Sai Krishna Gottipati, Sagar Kurandwad, Clodéric Mars, Gregory Szriftgiser, François Chabot

Do As You Teach: A Multi-Teacher Approach to Self-Play in Deep Reinforcement Learning
Currently under review

Learning to navigate the synthetically accessible chemical space using reinforcement learning
Sai Krishna Gottipati, Boris Sattarov, Sufeng Niu, Yashaswi Pathak, Haoran Wei, Shengchao Liu, Karam J. Thomas, Simon Blackburn, Connor W. Coley, Jian Tang, Sarath Chandar, Yoshua Bengio

Additional References

Asymmetric self-play for automatic goal discovery in robotic manipulation, OpenAI et al 2021
Continuous Coordination As a Realistic Scenario for Lifelong Learning, Nekoei et al 2021

Episode sponsor: Anyscale

Ray Summit 2022 is coming to San Francisco on August 23-24. Hear how teams at Dow, Verizon, Riot Games, and more are solving their RL challenges with Ray's RLlib. Register at raysummit.org and use code RAYSUMMIT22RL for a further 25% off the already reduced prices.
May 9, 2022 • 59min

Aravind Srinivas 2

Aravind Srinivas is back! He is now a Research Scientist at OpenAI.

Featured References

Decision Transformer: Reinforcement Learning via Sequence Modeling
Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch

VideoGPT: Video Generation using VQ-VAE and Transformers
Wilson Yan, Yunzhi Zhang, Pieter Abbeel, Aravind Srinivas
Apr 12, 2022 • 1h 37min

Rohin Shah

Dr. Rohin Shah is a Research Scientist at DeepMind, and the editor and main contributor of the Alignment Newsletter.

Featured References

The MineRL BASALT Competition on Learning from Human Feedback
Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell, Anca Dragan

Preferences Implicit in the State of the World
Rohin Shah, Dmitrii Krasheninnikov, Jordan Alexander, Pieter Abbeel, Anca Dragan

Benefits of Assistance over Reward Learning
Rohin Shah, Pedro Freire, Neel Alex, Rachel Freedman, Dmitrii Krasheninnikov, Lawrence Chan, Michael D Dennis, Pieter Abbeel, Anca Dragan, Stuart Russell

On the Utility of Learning about Humans for Human-AI Coordination
Micah Carroll, Rohin Shah, Mark K. Ho, Thomas L. Griffiths, Sanjit A. Seshia, Pieter Abbeel, Anca Dragan

Evaluating the Robustness of Collaborative Agents
Paul Knott, Micah Carroll, Sam Devlin, Kamil Ciosek, Katja Hofmann, A. D. Dragan, Rohin Shah

Additional References

AGI Safety Fundamentals, EA Cambridge
Feb 22, 2022 • 1h 4min

Jordan Terry

Jordan Terry is a PhD candidate at University of Maryland, the maintainer of Gym, the maintainer and creator of PettingZoo and the founder of Swarm Labs.

Featured References

PettingZoo: Gym for Multi-Agent Reinforcement Learning
J. K. Terry, Benjamin Black, Nathaniel Grammel, Mario Jayakumar, Ananth Hari, Ryan Sullivan, Luis Santos, Rodrigo Perez, Caroline Horsch, Clemens Dieffendahl, Niall L. Williams, Yashas Lokesh, Praveen Ravi

PettingZoo on GitHub
gym on GitHub

Additional References

Time Limits in Reinforcement Learning, Pardo et al 2017
Deep Reinforcement Learning at the Edge of the Statistical Precipice, Agarwal et al 2021
Dec 20, 2021 • 1h 11min

Robert Lange

Robert Tjarko Lange, a PhD student at TU Berlin, discusses topics like meta reinforcement learning, hard-coded behaviors in animals, lottery ticket hypothesis and pruning masks in deep RL, semantic RL with action grammars, advances in meta RL, the need for scientific governance, and exploring the role of parameterization in RL.
Nov 18, 2021 • 24min

NeurIPS 2021 Political Economy of Reinforcement Learning Systems (PERLS) Workshop

We hear about the idea of PERLS and why it's important to talk about.

Political Economy of Reinforcement Learning (PERLS) Workshop at NeurIPS 2021 on Tues Dec 14th
NeurIPS 2021
Sep 27, 2021 • 1h 10min

Amy Zhang

Amy Zhang is a postdoctoral scholar at UC Berkeley and a research scientist at Facebook AI Research. She will be starting as an assistant professor at UT Austin in Spring 2023.

Featured References

Invariant Causal Prediction for Block MDPs
Amy Zhang, Clare Lyle, Shagun Sodhani, Angelos Filos, Marta Kwiatkowska, Joelle Pineau, Yarin Gal, Doina Precup

Multi-Task Reinforcement Learning with Context-based Representations
Shagun Sodhani, Amy Zhang, Joelle Pineau

MBRL-Lib: A Modular Library for Model-based Reinforcement Learning
Luis Pineda, Brandon Amos, Amy Zhang, Nathan O. Lambert, Roberto Calandra

Additional References

Amy Zhang - Exploring Context for Better Generalization in Reinforcement Learning @ UCL DARK
ICML 2020 Poster session: Invariant Causal Prediction for Block MDPs
Clare Lyle - Invariant Prediction for Generalization in Reinforcement Learning @ Simons Institute
