[08] He He - Sequential Decisions and Predictions in NLP

The Thesis Review

Offline Reinforcement Learning for Text Generation

In many NLP problems, we don't have this luxury. Once you go off the oracle path, you don't know what the best actions to take are. So that's a big limitation of this algorithm, because it assumes that during training, whichever state you're in, you have access to the oracle. So when you frame text generation as sequential decision making, for DAgger you can't necessarily have an oracle.
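
To make the oracle assumption concrete, here is a minimal sketch of a DAgger-style loop on a toy problem. It is not taken from the episode: the number-line environment, the `oracle` function, the tabular majority-vote learner, and all constants are hypothetical choices for illustration. The point it tries to show is that the loop asks the oracle for a label at every state the learner visits, including states off the oracle path.

```python
from collections import Counter, defaultdict

# Hypothetical toy setup: walk along positions 0..9 and try to reach GOAL.
GOAL, START, HORIZON = 5, 2, 8


def oracle(state):
    """Assumed expert that can name the best action in ANY state."""
    if state < GOAL:
        return 1
    if state > GOAL:
        return -1
    return 0


def step(state, action):
    """Move along the line, clamped to the range 0..9."""
    return max(0, min(9, state + action))


def train(dataset):
    """Fit a tabular policy by majority vote over the aggregated labels."""
    votes = defaultdict(Counter)
    for state, action in dataset:
        votes[state][action] += 1
    return {s: c.most_common(1)[0][0] for s, c in votes.items()}


def rollout(policy, default_action=-1):
    """Run the learner's own policy; unseen states use a (bad) default action."""
    state, visited = START, []
    for _ in range(HORIZON):
        visited.append(state)
        state = step(state, policy.get(state, default_action))
    return visited


def dagger(iterations=5):
    dataset, policy = [], {}
    for _ in range(iterations):
        visited = rollout(policy)
        # Key assumption: the oracle labels every state the learner reached,
        # including states that lie off the oracle's own path.
        dataset += [(s, oracle(s)) for s in visited]
        policy = train(dataset)
    return policy


if __name__ == "__main__":
    learned = dagger()
    print(learned)
    print(rollout(learned))
```

In this toy the oracle is a one-line rule, so querying it at off-path states costs nothing. For text generation, the analogous query would be "what is the best next token given this partially wrong prefix", and that is the label that is usually not available.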
