
75 - Reinforcement / Imitation Learning in NLP, with Hal Daumé III
NLP Highlights
The Importance of Interpolating Between Expert and Learned Policy
In theory, all the theory says you should never roll in with the expert at all; you should always roll in with the learned policy. In practice, no one does that, because you end up wasting a lot of time exploring while the learned policy has only seen, like, zero or 10 examples. So, I mean, in my own experience I pretty much always just use 0.99 to the power of the example number as the probability of rolling in with the expert.
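A minimal sketch of that schedule may make it concrete. It assumes "example number" means a zero-based count of training examples seen so far, and that the expert-versus-learned choice is drawn independently at each roll-in step; the episode does not specify either detail. The function names and the `beta` parameter are hypothetical, chosen only for illustration.

```python
import random


def expert_rollin_prob(example_number, beta=0.99):
    """Probability of rolling in with the expert: beta ** example_number.

    Assumption: example_number counts training examples seen so far,
    starting at 0, so the very first example always uses the expert.
    """
    return beta ** example_number


def choose_rollin_action(example_number, expert_action, learned_action,
                         rng=random, beta=0.99):
    """Pick the roll-in action for one step.

    Assumption: the coin is flipped per step; flipping once per example
    would be an equally plausible reading of the quote.
    """
    if rng.random() < expert_rollin_prob(example_number, beta):
        return expert_action
    return learned_action


if __name__ == "__main__":
    # Show how quickly the expert is phased out under this schedule.
    for n in (0, 10, 100, 500):
        print(n, round(expert_rollin_prob(n), 4))
```

With this decay the expert dominates the first few dozen examples, which is where the learned policy is least reliable, and is almost never used after several hundred examples.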