
Episode 25: Nicklas Hansen, UCSD, on long-horizon planning and why algorithms don't drive research progress
Generally Intelligent
TD Learning - Data Augmentation to Auxiliary Objective
At the time, at least for visual generalization, data augmentation seemed like a good solution, and we just needed to figure out a way to incorporate it into the framework. We were also working with TD-learning-based algorithms that learn a value function. Usually you're bootstrapping with a target Q-function, which is some kind of moving average of your learned Q-function. And it turns out that if you add data augmentation and your network is not fully fitting the data, the augmentation increases the variance of the Q-value estimates, which in turn increases the variance of the Q-target as well. Right, I see. It's a little interesting. Yeah, so you can
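To make the mechanism described above concrete, here is a minimal sketch, not the method discussed in the episode. It assumes a toy linear Q-function, additive Gaussian noise as a stand-in for image augmentation, and hypothetical names (`ema_update`, `td_target`, `augment`, `tau`, `noise_scale`) chosen purely for illustration. It shows a target Q-function maintained as a moving average of the learned one, and how augmenting the next observation spreads out the bootstrapped target when the function is not invariant to the augmentation.

```python
import numpy as np


def ema_update(target_params, online_params, tau=0.05):
    """Move the target parameters a small step toward the online parameters."""
    return (1.0 - tau) * target_params + tau * online_params


def q_value(params, obs):
    """Toy linear Q-function (an assumption): one scalar per observation row."""
    return obs @ params


def td_target(target_params, reward, next_obs, gamma=0.99):
    """Bootstrapped target: r + gamma * Q_target(s')."""
    return reward + gamma * q_value(target_params, next_obs)


def augment(obs, rng, noise_scale=0.5):
    """Stand-in augmentation (hypothetical): additive Gaussian noise."""
    return obs + noise_scale * rng.normal(size=obs.shape)


rng = np.random.default_rng(0)
online = rng.normal(size=4)   # pretend these are the learned Q parameters
target = np.zeros(4)          # target Q starts elsewhere and tracks them

# The target Q-function is a moving average of the learned Q-function.
for _ in range(200):
    target = ema_update(target, online)

# The same next state repeated many times, so any spread in the targets
# comes only from the augmentation.
next_obs = np.tile(rng.normal(size=4), (1000, 1))
reward = 1.0

clean_targets = td_target(target, reward, next_obs)
aug_targets = td_target(target, reward, augment(next_obs, rng))

print("target variance without augmentation:", clean_targets.var())
print("target variance with augmentation:   ", aug_targets.var())
```

In this toy setting the unaugmented targets are essentially identical, while the augmented ones vary because the linear Q-function responds to the injected noise. That is only an analogy for a network that has not fully fit the augmented data.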