How to Scale Your Replay Ratio Barrier

As you train, especially your critic, your value network for longer, it just starts. This is what people call plasticity loss. The network really loses its ability to adapt to new objectives. So we're getting better exploration because we're trying some new exploration after not getting burn in or what's happening. We thought that might have been the cause initially, but we were able to rule that out for the most part later on. And then, yeah, we had another paper called Breaking the Replay Ratio Barrier that was at Eichler this year,. where we introduced the idea of scaling by resetting.

Play episode from 12:47

Transcript

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app