TalkRL: The Reinforcement Learning Podcast

Natasha Jaques

Aug 9, 2019

50:24

INSIGHT

Deep RL Generalization Challenges

Generalization and robustness are major challenges in deep RL.
Small input shifts can cause policies to fail, limiting real-world deployment.

ANECDOTE

Inspiration from PhD Exam Pressure

Natasha Jaques struggled with a hard question from her PhD committee about social and emotional intrinsic motivations for agents.
She spent nearly 24 hours crafting the idea of rewarding agents for having causal influence on others' actions.

ADVICE

Dig Into Model Behaviors

Analyze and visualize model behaviors in depth instead of relying on summary charts alone.
Audit RL policies carefully to uncover unexpected emergent strategies.

The Challenges of Machine Learning and Deep Learning

01:38 • 2min

What's the Difference Between DeepMind and Google Brain?

03:30 • 2min

Working on Tackling Climate Change With Machine Learning

The Future of Machine Learning and Climate Action

11:32 • 3min

Multi-Agent RL - What Is a Sequential Social Dilemma?

14:37 • 4min

Learning to Communicate Is a Way to Learn to Influence Others

18:57 • 3min

Do You Know When the Influence Reward Is Happening?

22:06 • 2min

Reproducing SSDs in Open Source

24:19 • 2min

Is Inequity Aversion More Effective Than Inequities Aversion?

26:24 • 3min

Is the Tesla Car a Shared Reward?

29:18 • 2min

AI and Climate Change

Influence Mechanism to Prevent Collapse in Hierarchical RL

37:30 • 2min

The Social Function of Intelligence by Leibo & Co

39:19 • 3min

Is the Autocurriculum a Theory or an Idea?

42:38 • 2min

Do You Feel Like Your Influence Mechanism Would Be Useful for Autocurricula?

44:33 • 2min

What Mentors Do You Look Up to in the Research World?

46:23 • 2min

Do You Have Any Plans to Take a Break?

48:36 • 2min

Natasha Jaques is a PhD candidate at MIT working on affective and social intelligence. She has interned with DeepMind and Google Brain, and was an OpenAI Scholars mentor. Her paper “Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning” received an honourable mention for best paper at ICML 2019.

Featured References

Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning
Natasha Jaques, Angeliki Lazaridou, Edward Hughes, Caglar Gulcehre, Pedro A. Ortega, DJ Strouse, Joel Z. Leibo, Nando de Freitas

Tackling Climate Change with Machine Learning
David Rolnick, Priya L. Donti, Lynn H. Kaack, Kelly Kochanski, Alexandre Lacoste, Kris Sankaran, Andrew Slavin Ross, Nikola Milojevic-Dupont, Natasha Jaques, Anna Waldman-Brown, Alexandra Luccioni, Tegan Maharaj, Evan D. Sherwin, S. Karthik Mukkavilli, Konrad P. Kording, Carla Gomes, Andrew Y. Ng, Demis Hassabis, John C. Platt, Felix Creutzig, Jennifer Chayes, Yoshua Bengio

Additional References

MIT Media Lab Flight Offsets, Caroline Jaffe, Juliana Cherston, Natasha Jaques
Modeling Others using Oneself in Multi-Agent Reinforcement Learning, Roberta Raileanu, Emily Denton, Arthur Szlam, Rob Fergus
Inequity aversion improves cooperation in intertemporal social dilemmas, Edward Hughes, Joel Z. Leibo, Matthew G. Phillips, Karl Tuyls, Edgar A. Duéñez-Guzmán, Antonio García Castañeda, Iain Dunning, Tina Zhu, Kevin R. McKee, Raphael Koster, Heather Roff, Thore Graepel
Sequential Social Dilemma Games on GitHub, Eugene Vinitsky, Natasha Jaques
AI Alignment newsletter, Rohin Shah
Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions, Rui Wang, Joel Lehman, Jeff Clune, Kenneth O. Stanley
The social function of intellect, Nicholas Humphrey
Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research, Joel Z. Leibo, Edward Hughes, Marc Lanctot, Thore Graepel
A Recipe for Training Neural Networks, Andrej Karpathy
Emotionally Adaptive Intelligent Tutoring Systems using POMDPs, Natasha Jaques
Sapiens, Yuval Noah Harari