
[08] He He - Sequential Decisions and Predictions in NLP
The Thesis Review
How to Combine Imitation Learning and Reinforcement Learning
How to combine imitation learning and reinforcement learning is also an interesting question. Could you blend the two somehow? So maybe in states where you don't have access to the oracle, could you use some kind of reward function? One thing that was really nice in your thesis is that you had an overview of the different methods that have been developed, like DAgger and then AggreVaTe and things like that. If you look back at the theory that's been developed, how has it been useful or helpful to you? Is it just useful for understanding the problem better, or does it actually relate to things that you see in practice?