
Episode 25: Nicklas Hansen, UCSD, on long-horizon planning and why algorithms don't drive research progress
Generally Intelligent
TD Learning - Data Augmentation to Auxiliary Objective
At the time, at least for the visual generalization, it seemed that data augmentation was a good solution and we just needed to figure out a way to incorporate that into the framework. We were also working with TD-learning-based algorithms that learn a value function. Usually you're bootstrapping with a target Q-function, which is some kind of moving average of your learned Q-function. And it turns out that if you add data augmentation and your network is not fully fitting the data, the augmentation will increase the variance, which in turn increases the variance of the Q-target as well.

Right, I see. It's a little interesting.

Yeah, so you can
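To make the bootstrapping point concrete, here is a minimal NumPy sketch, not the speaker's actual implementation. It assumes a one-step TD target with a max over actions, a target network kept as an exponential moving average of the online parameters, a toy linear Q-function, and additive Gaussian noise as a stand-in for image augmentation. The names `ema_update`, `td_target`, `augment`, `tau`, and `W_target` are all hypothetical.

```python
import numpy as np


def ema_update(target_params, online_params, tau=0.01):
    """Move each target parameter a fraction tau toward the online parameter."""
    return [(1.0 - tau) * t + tau * o for t, o in zip(target_params, online_params)]


def td_target(reward, next_q_target, done, gamma=0.99):
    """One-step TD target: r + gamma * max_a Q_target(s', a), zeroed at terminal steps."""
    return reward + gamma * (1.0 - done) * np.max(next_q_target, axis=-1)


def augment(obs, rng, scale=0.1):
    """Assumed stand-in for image augmentation: small additive Gaussian noise."""
    return obs + rng.normal(scale=scale, size=obs.shape)


rng = np.random.default_rng(0)
W_target = rng.normal(size=(4, 3))  # toy linear target Q: 4-dim observation, 3 actions
next_obs = rng.normal(size=4)

# Without augmentation the target for this transition is a single fixed number.
clean_target = td_target(1.0, next_obs @ W_target, 0.0)

# With augmentation, a target network that is not invariant to the augmentation
# produces a different target for every augmented view of the same observation.
aug_targets = np.array(
    [td_target(1.0, augment(next_obs, rng) @ W_target, 0.0) for _ in range(256)]
)

print("clean target:", clean_target)
print("variance of augmented targets:", aug_targets.var())
```

In this toy setup the un-augmented target has zero variance for a fixed transition, while the augmented targets spread out because the target Q-function responds differently to each view. That spread is the extra Q-target variance described above.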