The Learning Potential Score for Revisions of Levels

In the original Prioritized Level Replay, or PLR, paper, we basically used the L1 value loss as the score for prioritizing the revisitation of each level. We found that when you actually use this L1 value loss based prioritization on OpenAI Procgen, you end up doing worse than uniform sampling. But then you add the staleness sampling: some percentage of the time, I'm going to sample instead from a staleness distribution that prioritizes levels by age. So I sample a level whose score hasn't been updated in a long time. Then if you do, like, 30% staleness
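
The mixing of score-based and staleness-based sampling described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the `staleness_coef` parameter, and the plain proportional normalization of scores are all assumptions made for the example, and the published method may transform the scores differently before sampling.

```python
import numpy as np


def replay_distribution(scores, last_updated, step, staleness_coef=0.3):
    """Mix a score-based distribution with a staleness distribution.

    scores:         per-level scores (e.g. a value-loss estimate), non-negative
    last_updated:   step at which each level's score was last refreshed
    step:           current step counter
    staleness_coef: fraction of probability mass given to staleness
                    (hypothetical name; 0.3 mirrors the "30%" mentioned)
    """
    scores = np.asarray(scores, dtype=float)
    last_updated = np.asarray(last_updated, dtype=float)
    n = len(scores)
    uniform = np.full(n, 1.0 / n)

    # Score-based distribution: simple proportional normalization
    # (an assumption for this sketch).
    total_score = scores.sum()
    score_probs = scores / total_score if total_score > 0 else uniform

    # Staleness distribution: older scores get proportionally more mass.
    staleness = step - last_updated
    total_stale = staleness.sum()
    stale_probs = staleness / total_stale if total_stale > 0 else uniform

    # Sampling from this mixture is equivalent to drawing from the
    # staleness distribution staleness_coef of the time.
    return (1.0 - staleness_coef) * score_probs + staleness_coef * stale_probs


def sample_level(scores, last_updated, step, staleness_coef=0.3, rng=None):
    """Draw the index of one level to revisit from the mixed distribution."""
    rng = np.random.default_rng() if rng is None else rng
    probs = replay_distribution(scores, last_updated, step, staleness_coef)
    return int(rng.choice(len(probs), p=probs))
```

Setting `staleness_coef=0.0` recovers pure score-based prioritization, which makes it easy to compare the two regimes the speaker contrasts.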

Play episode from 14:22

Transcript
