
NeurIPS 2024 - Posters and Hallways 1
TalkRL: The Reinforcement Learning Podcast
Exploring Trust in PPO and Time Constraints in Robust MDPs
This chapter covers representation and trust issues in reinforcement learning, focusing on Proximal Policy Optimization (PPO) and how non-stationarity can degrade its performance. It introduces the PFO method for improving representation stability, then discusses Time-Constrained Robust MDPs, which aim to make robust policies less conservative while improving average performance.