75 - Reinforcement / Imitation Learning in NLP, with Hal Daumé III

Nov 21, 2018

INSIGHT

Structured Prediction as Sequential Decision Making

Structured prediction tasks such as machine translation and POS tagging can be framed as sequential decision making: the output is built one decision at a time, with each decision conditioned on the ones before it.
Viewing the output as a sequence of decisions makes it possible to train with methods such as maximum likelihood or reinforcement learning.
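
As a minimal sketch of this framing, the toy tagger below treats POS tagging as a left-to-right series of decisions and scores a gold sequence by summing per-decision log-probabilities, which is the maximum-likelihood objective. The tag set, the hand-written `score` function, and all names are assumptions made for illustration; they stand in for a learned model and do not come from the episode.

```python
import math

TAGS = ["DET", "NOUN", "VERB"]  # tiny illustrative tag set (assumption)

def score(word, prev_tag, tag):
    """Hypothetical hand-written scorer standing in for a learned model."""
    s = 0.0
    if word in {"the", "a"} and tag == "DET":
        s += 2.0
    if prev_tag == "DET" and tag == "NOUN":
        s += 2.0
    if prev_tag == "NOUN" and tag == "VERB":
        s += 1.0
    if word.endswith("s") and tag == "VERB":
        s += 0.5
    return s

def policy(word, prev_tag):
    """Softmax distribution over tags for a single decision."""
    scores = [score(word, prev_tag, t) for t in TAGS]
    z = sum(math.exp(s) for s in scores)
    return {t: math.exp(s) / z for t, s in zip(TAGS, scores)}

def greedy_tag(words):
    """Build the output one decision at a time, conditioning on the previous tag."""
    tags, prev = [], "<s>"
    for w in words:
        probs = policy(w, prev)
        prev = max(probs, key=probs.get)
        tags.append(prev)
    return tags

def log_likelihood(words, gold_tags):
    """Maximum-likelihood objective: sum of log-probabilities of the gold decisions."""
    total, prev = 0.0, "<s>"
    for w, t in zip(words, gold_tags):
        total += math.log(policy(w, prev)[t])
        prev = t
    return total

print(greedy_tag(["the", "dog", "barks"]))
```

The same `policy` could instead be trained from a reward on the finished sequence, which is where the reinforcement learning view comes in.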

INSIGHT

Reinforcement Learning in Semantic Parsing

Semantic parsing trained from question-answer pairs is a natural fit for reinforcement learning because the reward is delayed: it is known only after the full parse has been produced and its answer checked.
Structured prediction tasks also offer advantages such as a deterministic environment and the ability to explore multiple candidate outputs (n-best lists).
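
A rough sketch of both points, under toy assumptions: "programs" here are single arithmetic operations over two numbers, execution is deterministic, and the reward is computed only after a complete program has been run against the gold answer. The enumeration in `candidates` stands in for an n-best list from a real parser; every name below is hypothetical.

```python
from itertools import product

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}

def execute(program):
    """Deterministically run a (left, op, right) program."""
    left, op, right = program
    return OPS[op](left, right)

def reward(program, gold_answer):
    """Delayed reward: available only once the whole program is built and executed."""
    return 1.0 if execute(program) == gold_answer else 0.0

def candidates(numbers):
    """Enumerate every program; a stand-in for an n-best list from a parser."""
    return [(a, op, b) for a, b in product(numbers, repeat=2) for op in OPS]

def rewarded_programs(numbers, gold_answer):
    """Explore all candidates and keep those whose executed answer is correct."""
    return [p for p in candidates(numbers) if reward(p, gold_answer) == 1.0]

print(rewarded_programs([2, 3], 6))
```

Because execution is deterministic, each candidate can be scored once and its reward reused, which is not generally possible in a stochastic environment.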

INSIGHT

Imitation Learning vs. Reinforcement Learning

Imitation learning solves sequential decision-making problems by learning from expert demonstrations, whereas reinforcement learning relies on reward signals.
Two types of imitation learning are discussed: learning from static demonstrations and interactively querying an expert.
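
The difference between the two types can be sketched on a toy number-line task, assuming a hypothetical expert that always steps toward a goal state. Static demonstrations cover only the states the expert itself visits, while interactive querying asks the expert about states the learner actually reaches and aggregates those labels. This is an illustration in the general style of dataset-aggregation methods, not an algorithm named in the episode summary; all names are made up for the example.

```python
def expert(state, goal=5):
    """Hypothetical expert: step toward the goal on a number line."""
    if state < goal:
        return 1
    if state > goal:
        return -1
    return 0

def rollout(policy, start, steps=8):
    """Return the states visited when following `policy` from `start`."""
    states, s = [], start
    for _ in range(steps):
        states.append(s)
        s += policy(s)
    return states

def static_demonstrations(start=0):
    """Learn a lookup table only from states on the expert's own trajectory."""
    return {s: expert(s) for s in rollout(expert, start)}

def interactive_querying(table, start, rounds=4):
    """Roll out the learner, ask the expert about visited states, aggregate."""
    table = dict(table)
    for _ in range(rounds):
        visited = rollout(lambda s: table.get(s, 0), start)  # unseen state: stay put
        for s in visited:
            table[s] = expert(s)
    return table

static = static_demonstrations()
improved = interactive_querying(static, start=8)
print(rollout(lambda s: static.get(s, 0), 8)[-1])
print(rollout(lambda s: improved.get(s, 0), 8)[-1])
```

Starting from state 8, which the expert's demonstration never visits, the statically trained table has no action to take, while the interactively trained one has been corrected on exactly the states the learner encountered.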
