
Max Schwarzer
TalkRL: The Reinforcement Learning Podcast
How to Scale Your Replay Ratio Barrier
As you train, especially your critic, your value network, for longer, it just starts... This is what people call plasticity loss. The network really loses its ability to adapt to new objectives. So we're getting better exploration because we're trying some new exploration after not getting burn in, or what's happening? We thought that might have been the cause initially, but we were able to rule that out for the most part later on. And then, yeah, we had another paper called Breaking the Replay Ratio Barrier that was at ICLR this year, where we introduced the idea of scaling by resetting.