
75 - Reinforcement / Imitation Learning in NLP, with Hal Daumé III
NLP Highlights
The Importance of Structured Prediction in Reinforcement Learning
There's been a handful of papers. The first one I know of that came out was by Stefan Riezler, looking at, quote unquote, "bandit structured prediction." This is basically the setting where you have a structured prediction task, but all you get to know at the end is whether you did a good job or not. So this has a flavor of reinforcement learning, in the sense that you get this external reward. It also has this YOLO flavor, where I don't get to show a user 25 different translations and ask them to score each one. But we do still have the advantage that we have in structured prediction, that we can do all sorts of computation offline because we know how the
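
To make the bandit setting described above concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration, not something taken from the episode or from the papers mentioned: the toy labeling task, the function names (`sample_output`, `bandit_update`, `train`, `greedy_labels`), the hyperparameters, and the choice of a score-function (REINFORCE-style) update with a running-average baseline. The learner commits to one structured output per round and observes a single number saying how good it was. It never sees the correct labels and never gets to ask about alternative outputs.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_output(weights, inputs, rng):
    """Sample one label per input token from the current policy."""
    labels, probs = [], []
    for x in inputs:
        p = softmax(weights[x])
        y = rng.choices(range(len(p)), weights=p)[0]
        labels.append(y)
        probs.append(p)
    return labels, probs


def bandit_update(weights, inputs, labels, probs, reward, baseline, lr):
    """Score-function update driven by one scalar reward for the whole output."""
    advantage = reward - baseline
    for x, y, p in zip(inputs, labels, probs):
        for k in range(len(p)):
            # Gradient of log softmax probability of the sampled label y.
            grad_log_p = (1.0 if k == y else 0.0) - p[k]
            weights[x][k] += lr * advantage * grad_log_p


def train(n_rounds=3000, n_tokens=5, n_labels=3, seq_len=4, lr=0.5, seed=0):
    rng = random.Random(seed)
    # Hypothetical hidden "truth", used only by the simulated environment.
    gold = [t % n_labels for t in range(n_tokens)]
    weights = [[0.0] * n_labels for _ in range(n_tokens)]
    baseline = 0.0
    for _ in range(n_rounds):
        inputs = [rng.randrange(n_tokens) for _ in range(seq_len)]
        labels, probs = sample_output(weights, inputs, rng)
        # Bandit feedback: one number for the one output that was shown.
        reward = sum(y == gold[x] for x, y in zip(inputs, labels)) / seq_len
        bandit_update(weights, inputs, labels, probs, reward, baseline, lr)
        # Running average of past rewards, used to reduce variance.
        baseline += 0.05 * (reward - baseline)
    return weights, gold


def greedy_labels(weights):
    """Highest-scoring label for each token under the learned weights."""
    return [max(range(len(w)), key=lambda k: w[k]) for w in weights]
```

Calling `train()` and then comparing `greedy_labels(weights)` with `gold` gives a rough check on whether the policy recovered the hidden labeling from scalar feedback alone. The contrast with ordinary supervised structured prediction is that `gold` appears only inside the simulated reward, never in the update.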