The Importance of Interpolating Between Expert and Learned Policy

In theory, all theory says you should never roll in with the expert at all. You should always be rolling in with the learned policy. In practice, no one does that, because you end up wasting a lot of time exploring parts of the space when the learned policy has only seen, like, zero or ten examples. So, I mean, in my own experience, I pretty much always just use 0.99 to the power of the example number as the probability of rolling in with the expert.
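The schedule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the speaker's actual code: the function names `expert_rollin_probability` and `choose_rollin_policy` are hypothetical, the example number is assumed to start at 0, and the 0.99 decay is the only value taken from the transcript.

```python
import random


def expert_rollin_probability(example_number: int, decay: float = 0.99) -> float:
    """Probability of rolling in with the expert on a given training example.

    Assumption: example_number counts from 0, so the first example
    always rolls in with the expert (0.99 ** 0 == 1.0).
    """
    return decay ** example_number


def choose_rollin_policy(example_number: int, rng: random.Random, decay: float = 0.99) -> str:
    """Pick which policy generates the roll-in trajectory for this example.

    Returns the hypothetical labels "expert" or "learned".
    """
    beta = expert_rollin_probability(example_number, decay)
    # rng.random() is uniform on [0, 1), so this is true with probability beta.
    return "expert" if rng.random() < beta else "learned"


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (0, 10, 100, 500):
        beta = expert_rollin_probability(n)
        print(n, round(beta, 4), choose_rollin_policy(n, rng))
```

With this decay, the expert roll-in probability is 1.0 on the first example, roughly 0.37 by example 100, and under 0.01 by example 500, so the learned policy takes over the roll-in once it has seen a few hundred examples.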
