The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Deep Reinforcement Learning at the Edge of the Statistical Precipice with Rishabh Agarwal - #559

Feb 14, 2022
Rishabh Agarwal, a research scientist at Google Brain in Montreal, dives into his award-winning paper on deep reinforcement learning. The discussion reveals how traditional performance evaluations can lead to misleading conclusions due to random seed variability. Rishabh highlights the challenges of current benchmarking methods, advocating for better reporting practices. With insights on the importance of uncertainty in results, he calls for a shift in academic standards to improve research integrity. Open-source tools aim to enhance evaluation methods, fostering greater transparency in the field.
AI Snips
ANECDOTE

Early RL Inspiration

  • Rishabh Agarwal's interest in RL began with DeepMind's Atari results, sparking a desire to automate game playing.
  • His bachelor's thesis explored learned agents for Scrabble, an approach similar in spirit to AlphaGo but relying on imitation learning due to limited compute.
ANECDOTE

Joining Hinton's Team

  • Agarwal joined Geoffrey Hinton's team after trying out several different research areas.
  • Hinton valued Agarwal's prior research experience and his understanding of research challenges, including failures.
INSIGHT

Building vs. Starting Fresh

  • Real-world problems often involve building upon existing solutions rather than starting from scratch.
  • Research, however, frequently defaults to tabula rasa approaches, neglecting the value of incremental improvement.