
[08] He He - Sequential Decisions and Predictions in NLP
The Thesis Review
How to Combine Imitation Learning and Reinforcement Learning
How to combine imitation learning and reinforcement learning is also an interesting question. Could you blend the two somehow? So maybe in states where you don't have access to the oracle, could you use some kind of reward function? One thing that was really nice in your thesis is that you had an overview of the different methods that have been developed, like DAgger and AggreVaTe. If you look back at the theory that's been developed, how has it been useful or helpful to you? Is it just useful for understanding the problem better, or does it actually relate to things that you see in practice?