
NeurIPS 2024 - Posters and Hallways 1
TalkRL: The Reinforcement Learning Podcast
Exploring Trust in PPO and Time Constraints in Robust MDPs
This chapter explores the challenge of representation trust in reinforcement learning, particularly through the lens of Proximal Policy Optimization (PPO) and the performance collapse that non-stationarity can cause. It introduces PFO (Proximal Feature Optimization), a method for improving representation stability, and discusses Time-Constrained Robust MDPs as a way to reduce the conservativeness of standard robust formulations while improving average performance.
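As a rough illustration only (not taken from the episode), a feature-drift penalty in the spirit of PFO could be added to the PPO objective as an auxiliary loss; the names pfo_coef, phi_new, and phi_old below are hypothetical placeholders:

    import torch

    def pfo_auxiliary_loss(current_features: torch.Tensor,
                           behavior_features: torch.Tensor) -> torch.Tensor:
        # Penalize drift of the policy network's features away from the
        # features computed by the behavior (data-collecting) policy.
        # behavior_features are detached so they act as fixed targets.
        return ((current_features - behavior_features.detach()) ** 2).mean()

    # Hypothetical use inside a PPO update step:
    # total_loss = ppo_clip_loss + value_coef * value_loss \
    #              + pfo_coef * pfo_auxiliary_loss(phi_new, phi_old)

The idea sketched here is simply to regularize how far the learned representation moves between policy updates, which is the kind of representation-stability intervention the chapter discusses.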