Episode 25: Nicklas Hansen, UCSD, on long-horizon planning and why algorithms don't drive research progress

Generally Intelligent

CHAPTER

Machine Learning

The goal is to adapt this pre-trained policy to new environments without getting any data from the target environment, because you might not have any if you have a robot, and also without access to a reward signal. This would at least be a solution that doesn't require all of the engineering involved in sort of predicting where you will deploy your policy. That was the motivation. In RL, images at different time steps are very, very correlated with each other. So it's very reasonable that if the ball is in the top left corner at time t, then at time t plus one, it will still be around there.
