
Sai Krishna Gottipati
TalkRL: The Reinforcement Learning Podcast
Is Mith Missing in the Design?
Despite having these trained policy and value networks, if you don't allow it to explore enough, there are still a lot of blind spots. So I think for any chess engine it mostly boils down to how much computation, or how many Monte Carlo tree search simulations, you're allowing the engine to have. I've never been very clear on how perfect a fit the convolutional network really is for this problem. It seems to me it may not be perfect. Right, exactly. That's another very good question to explore. Unlike other board games like Go, chess has a very interesting representation as well. You can't just represent them as es on a


