NeurIPS 2024 - Posters and Hallways 1

TalkRL: The Reinforcement Learning Podcast

CHAPTER

Exploring Trust in PPO and Time Constraints in Robust MDPs

This chapter explores the challenges of representation and trust in reinforcement learning, particularly through the lens of Proximal Policy Optimization (PPO), whose performance can decline under non-stationarity. It introduces the PFO method for improving representation stability and discusses Time-Constrained Robust MDPs, which aim to reduce the conservativeness of robust policies while improving average performance.

