Episode 25: Nicklas Hansen, UCSD, on long-horizon planning and why algorithms don't drive research progress

Generally Intelligent

CHAPTER

TD Learning - Data Augmentation to Auxiliary Objective

At the time, at least for visual generalization, data augmentation seemed like a good solution, and we just needed to figure out a way to incorporate it into the framework. We were also working with TD-learning-based algorithms that learn a value function. Usually you're bootstrapping with a target Q-function, which is some kind of moving average of your learned Q-function. And it turns out that if you add data augmentation and your network is not fully fitting the data, the augmentation will increase the variance, which in turn increases the variance of the Q-target as well.

Right, I see. It's a little interesting.

Yeah, so you can
