Max Schwarzer

TalkRL: The Reinforcement Learning Podcast

How to Scale Your Replay Ratio Barrier

As you train, especially your critic, your value network for longer, it just starts. This is what people call plasticity loss. The network really loses its ability to adapt to new objectives. So we're getting better exploration because we're trying some new exploration after not getting burn in or what's happening. We thought that might have been the cause initially, but we were able to rule that out for the most part later on. And then, yeah, we had another paper called Breaking the Replay Ratio Barrier that was at ICLR this year, where we introduced the idea of scaling by resetting.
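
To make "scaling by resetting" concrete, here is a minimal sketch of the general idea, not the paper's implementation. It assumes a toy linear "critic", a made-up regression target in place of a real environment, and a reset schedule keyed to the number of gradient updates; the names `train_with_resets`, `replay_ratio`, and `reset_every` are hypothetical. The replay ratio is the number of gradient updates per environment step, and a reset re-initialises the network while the replay buffer is kept.

```python
import random


def init_params(rng, size=4):
    # Fresh random initialisation of a tiny linear "critic" (illustrative only).
    return [rng.uniform(-0.1, 0.1) for _ in range(size)]


def train_with_resets(env_steps, replay_ratio, reset_every, seed=0):
    """Toy loop: `replay_ratio` updates per environment step, with the
    parameters re-initialised every `reset_every` updates (an assumed schedule)."""
    rng = random.Random(seed)
    params = init_params(rng)
    buffer = []  # the replay buffer survives every reset
    updates = 0
    resets = 0
    for _ in range(env_steps):
        # Stand-in for one environment step: store a (features, target) pair.
        x = [rng.uniform(-1, 1) for _ in range(len(params))]
        y = sum(x)  # hypothetical target, not a real TD target
        buffer.append((x, y))
        for _ in range(replay_ratio):
            # One SGD update on a transition sampled from the buffer.
            bx, by = rng.choice(buffer)
            err = sum(w * xi for w, xi in zip(params, bx)) - by
            params = [w - 0.05 * err * xi for w, xi in zip(params, bx)]
            updates += 1
            if updates % reset_every == 0:
                # Reset: discard the learned weights, keep the collected data.
                params = init_params(rng)
                resets += 1
    return {"updates": updates, "resets": resets, "buffer_size": len(buffer)}


stats = train_with_resets(env_steps=100, replay_ratio=4, reset_every=80)
```

Because the schedule in this sketch counts updates rather than environment steps, raising the replay ratio makes resets more frequent per unit of collected data, while the buffer lets the re-initialised network relearn from everything gathered so far.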
