TD Learning - Data Augmentation to Auxiliary Objective

At the time, at least for visual generalization, it seemed that data augmentation was a good solution and we just needed to figure out a way to incorporate it into the framework. We were also working with TD-learning-based algorithms that learn a value function. Usually you're bootstrapping with a target Q function, which is some kind of moving average of your learned Q function. And it turns out that if you add data augmentation and your network is not fully fitting the data, the augmentation increases the variance of the network's outputs, which in turn increases the variance of the Q target as well.

Right, I see. It's a little interesting.

Yeah, so you can
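
For readers unfamiliar with the setup described above, here is a minimal sketch of bootstrapping with a target Q function that is a moving average of the learned one. It is an illustration under stated assumptions, not the speaker's implementation: the linear Q function, the names `ema_update` and `td_target`, the values of `tau` and `gamma`, and the use of Gaussian noise as a stand-in for image augmentation are all hypothetical choices made for brevity.

```python
import numpy as np


def ema_update(target_params, online_params, tau=0.01):
    """Move the target parameters a small step toward the online parameters."""
    return (1.0 - tau) * target_params + tau * online_params


def td_target(reward, next_obs, target_params, gamma=0.99, done=False):
    """One-step TD target with a linear target Q function (an assumption):
    Q(s, a) = s @ W[:, a]."""
    next_q = next_obs @ target_params  # shape: (num_actions,)
    bootstrap = 0.0 if done else gamma * float(np.max(next_q))
    return reward + bootstrap


rng = np.random.default_rng(0)
online = rng.normal(size=(4, 2))  # 4 observation features, 2 actions
target = online.copy()

# Stand-in for one gradient step on the online Q function.
online = online + 0.1 * rng.normal(size=online.shape)
# The target Q function trails the online one as a moving average.
target = ema_update(target, online, tau=0.01)

next_obs = rng.normal(size=4)
clean_target = td_target(1.0, next_obs, target)

# Gaussian noise as a crude stand-in for augmenting the next observation.
# If the Q function is not invariant to the perturbation, the bootstrapped
# target changes with it, which is the variance source discussed above.
noisy_targets = [
    td_target(1.0, next_obs + 0.1 * rng.normal(size=4), target)
    for _ in range(100)
]
target_spread = float(np.std(noisy_targets))
```

Here `target_spread` measures how much the TD target moves when only the next observation is perturbed; with an unperturbed observation the target is a single fixed value, `clean_target`.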
