Can You Crate a Similator With Perfect Rules?

We're trying to apply that concept of optimizing for long term behavior. We found it useful to try to cast the entire mendation problem as a reinforcement learning problem. So while in the similator its the reward is literally how far you get down this track list, on the sort of mental level, i think long term retention is a good reward to look at. And what is interesting is that you really do see what you expect. It turns out that what you candof hope you see even if you optimize for lower engagement in one moment, you longer term retention can be more effective. From an unexpected point of vew I mean, you might expect your algaritm at any

Play episode from 35:17

Transcript

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app