
Deep Reinforcement Learning at the Edge of the Statistical Precipice with Rishabh Agarwal - #559
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Challenges in Reinforcement Learning Evaluation
This chapter examines the obstacles researchers face in implementing and comparing reinforcement learning methods, emphasizing issues around resource limitations and the need for open-source code. It discusses the tension between comprehensive evaluations and practical constraints, advocating for better uncertainty reporting and alternative statistical metrics. The chapter also critiques traditional practices like fixed random seeds, proposing more effective approaches for assessing and communicating algorithm performance.
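As a rough illustration of what uncertainty reporting with a robust metric can look like, the sketch below computes an interquartile mean (the mean of the middle 50% of scores) over a handful of runs, together with a percentile bootstrap confidence interval. The scores are made-up numbers, the function names are hypothetical, and the simple single-task bootstrap is an assumption chosen for brevity. It is not presented as the exact procedure discussed in the episode.

```python
import numpy as np

def interquartile_mean(scores):
    """Mean of the middle 50% of scores (drops the lowest and highest 25%)."""
    x = np.sort(np.asarray(scores, dtype=float).ravel())
    cut = int(0.25 * x.size)  # number of values trimmed from each end
    return x[cut:x.size - cut].mean()

def bootstrap_ci(scores, stat=interquartile_mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `stat`, resampling runs with replacement."""
    rng = np.random.default_rng(seed)
    x = np.asarray(scores, dtype=float)
    boots = np.array([
        stat(rng.choice(x, size=x.size, replace=True)) for _ in range(n_boot)
    ])
    lo, hi = np.quantile(boots, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Hypothetical normalized scores from 8 independent runs (illustrative only).
runs = [0.2, 0.9, 1.0, 1.1, 1.0, 0.95, 1.05, 3.0]
point = interquartile_mean(runs)
low, high = bootstrap_ci(runs)
print(f"IQM = {point:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

Reporting the interval alongside the point estimate shows how much the result could change with a different set of runs. A plain mean over the same eight scores would be pulled upward by the single 3.0 outlier, while the interquartile mean is not.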