

75 - Reinforcement / Imitation Learning in NLP, with Hal Daumé III
7 snips Nov 21, 2018
AI Snips
Structured Prediction as Sequential Decision Making
- Sequential decision-making models simplify structured prediction tasks like machine translation and POS tagging.
- Viewing outputs as sequential decisions facilitates training through methods like maximum likelihood or reinforcement learning.
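As a rough illustration of the two points above, the sketch below treats POS tagging as a sequence of decisions: a policy picks one tag per word given the previous tag, and the maximum-likelihood objective is the summed log-probability of the gold tags. The tag set, the `scores` table, and all function names are hypothetical stand-ins for a learned model, not anything described in the episode.

```python
import math

TAGS = ["DET", "NOUN", "VERB"]  # toy tag set (assumption)

def scores(word, prev_tag):
    """Hypothetical hand-set scorer standing in for a learned model."""
    lexicon = {"the": "DET", "dog": "NOUN", "barks": "VERB"}
    s = {t: 0.0 for t in TAGS}
    if word in lexicon:
        s[lexicon[word]] += 2.0
    if prev_tag == "DET":
        s["NOUN"] += 1.0  # the decision depends on the previous decision
    return s

def policy(word, prev_tag):
    """Softmax over tag scores: a distribution over the next action."""
    s = scores(word, prev_tag)
    z = sum(math.exp(v) for v in s.values())
    return {t: math.exp(v) / z for t, v in s.items()}

def greedy_decode(words):
    """Build the output one decision at a time, left to right."""
    tags, prev = [], "<s>"
    for w in words:
        probs = policy(w, prev)
        prev = max(probs, key=probs.get)
        tags.append(prev)
    return tags

def log_likelihood(words, gold_tags):
    """Maximum-likelihood objective: sum of log-probs of the gold actions."""
    total, prev = 0.0, "<s>"
    for w, g in zip(words, gold_tags):
        total += math.log(policy(w, prev)[g])
        prev = g
    return total

sentence = ["the", "dog", "barks"]
predicted = greedy_decode(sentence)
```

With a reinforcement-learning objective, the same `policy` would be kept, but the per-step gold tags would be replaced by a reward computed on the finished tag sequence.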
Reinforcement Learning in Semantic Parsing
- Semantic parsing trained from question-answer pairs is a natural fit for reinforcement learning because its reward is delayed: feedback arrives only after the whole parse has been produced.
- Structured prediction tasks offer advantages like deterministic environments and the ability to explore multiple output options (n-best lists).
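A minimal sketch of the delayed-reward setting described above, under heavy simplifying assumptions: parses are toy `(op, a, b)` tuples, the "environment" is a deterministic `execute` function, and the n-best list with its model scores is invented for illustration. The reward is only available once a complete parse has been executed and compared with the gold answer.

```python
import math
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def execute(parse):
    """Deterministic environment: the same parse always gives the same answer."""
    op, a, b = parse
    return OPS[op](a, b)

def reward(parse, gold_answer):
    """Delayed reward: 1 if the finished parse yields the gold answer, else 0."""
    return 1.0 if execute(parse) == gold_answer else 0.0

def expected_reward(nbest, gold_answer):
    """Expected reward under the model, renormalised over the n-best list."""
    z = sum(math.exp(score) for _, score in nbest)
    return sum(math.exp(score) / z * reward(parse, gold_answer)
               for parse, score in nbest)

# Hypothetical n-best list for "what is two plus three?" (gold answer 5).
nbest = [(("mul", 2, 3), 1.0), (("add", 2, 3), 0.5), (("sub", 3, 2), 0.1)]
rewards = [reward(parse, 5) for parse, _ in nbest]
objective = expected_reward(nbest, 5)
```

Because execution is deterministic, every candidate in the n-best list can be scored exactly, which is one way to explore several outputs for the same input instead of sampling a single trajectory.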
Imitation Learning vs. Reinforcement Learning
- Imitation learning solves sequential decision-making problems from expert demonstrations, whereas reinforcement learning relies on reward signals.
- The two types of imitation learning discussed are learning from static demonstrations and interactively querying an expert.
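The contrast between the two types can be sketched on a toy problem. Everything here is an assumption made for illustration: the integer-line environment, the `expert`, the lookup-table learner with its default action, and the aggregation loop, which loosely follows the style of DAgger and is not a method attributed to the episode.

```python
def expert(state):
    """Hypothetical expert: step toward 0 on an integer line."""
    if state > 0:
        return -1
    if state < 0:
        return 1
    return 0

def rollout(policy, start, steps=5):
    """Return the states visited when following `policy` from `start`."""
    states, s = [], start
    for _ in range(steps):
        states.append(s)
        s += policy(s)
    return states

def fit(dataset):
    """'Train' a lookup-table policy; unseen states get a default action of +1."""
    table = dict(dataset)
    return lambda s: table.get(s, 1)

# Static demonstrations: label only the states the expert itself visits.
static_data = [(s, expert(s)) for s in rollout(expert, start=-3)]
static_policy = fit(static_data)

# Interactive querying: roll out the learner, ask the expert to label the
# states the learner actually reaches, aggregate the data, and refit.
data = list(static_data)
learner = fit(data)
for _ in range(5):
    visited = rollout(learner, start=3, steps=8)
    data += [(s, expert(s)) for s in visited]
    learner = fit(data)

static_final = rollout(static_policy, start=3, steps=8)[-1]
interactive_final = rollout(learner, start=3, steps=8)[-1]
```

Started from a state the demonstrations never covered, the statically trained policy keeps applying its default action and drifts away from the goal, while the interactively trained one has received expert labels for exactly the states it wandered into and ends at 0.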