

Deep Reinforcement Learning at the Edge of the Statistical Precipice with Rishabh Agarwal - #559
Feb 14, 2022
Rishabh Agarwal, a research scientist at Google Brain in Montreal, dives into his award-winning paper on deep reinforcement learning. The discussion reveals how traditional performance evaluations can lead to misleading conclusions due to random seed variability. Rishabh highlights the challenges of current benchmarking methods, advocating for better reporting practices. With insights on the importance of uncertainty in results, he calls for a shift in academic standards to improve research integrity. Open-source tools aim to enhance evaluation methods, fostering greater transparency in the field.
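The robust evaluation practices the episode discusses — aggregating over runs with outlier-resistant statistics and reporting uncertainty — can be sketched as follows. This is a minimal illustration, not the authors' open-source implementation; the data is synthetic, and the interquartile mean (IQM) is computed with SciPy's `trim_mean`.

```python
# Sketch of robust RL evaluation: interquartile mean (IQM) with a
# percentile-bootstrap confidence interval, over synthetic scores.
import numpy as np
from scipy.stats import trim_mean

rng = np.random.default_rng(0)
# Hypothetical data: normalized scores for 10 runs (seeds) x 5 tasks.
scores = rng.normal(loc=1.0, scale=0.3, size=(10, 5))

def iqm(x):
    # Mean of the middle 50% of scores; less sensitive to
    # outlier runs than the plain mean, more statistically
    # efficient than the median.
    return trim_mean(x, proportiontocut=0.25, axis=None)

def bootstrap_ci(x, stat, reps=2000, alpha=0.05):
    # Percentile bootstrap: resample runs (rows) with replacement
    # and report the spread of the statistic across resamples.
    boot = [stat(x[rng.integers(0, len(x), size=len(x))])
            for _ in range(reps)]
    return np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)])

point = iqm(scores)
lo, hi = bootstrap_ci(scores, iqm)
print(f"IQM = {point:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

Reporting an interval rather than a single point estimate makes it visible when two algorithms' results are indistinguishable given only a handful of seeds.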
AI Snips
Early RL Inspiration
- Rishabh Agarwal's interest in RL began with DeepMind's Atari results, sparking a desire to automate game playing.
- His bachelor's thesis explored learned agents for Scrabble, in the spirit of AlphaGo but using imitation learning because of limited compute.
Joining Hinton's Team
- Agarwal joined Geoff Hinton's team after trying different research areas.
- Hinton valued Agarwal's research experience and understanding of research challenges, including failures.
Building vs. Starting Fresh
- Real-world problems often involve building on existing solutions rather than starting from scratch.
- Research, however, frequently defaults to tabula rasa learning, neglecting the value of incremental improvement.