Episode 25: Nicklas Hansen, UCSD, on long-horizon planning and why algorithms don't drive research progress

Generally Intelligent

CHAPTER

How Many Samples Do You Need With Rewards?

The amount of supervision you get from a reward signal, which is just a scalar, versus the complexity of images and dynamics if you're learning a dynamics model: it's just much greater in the case of self-supervision. Yeah. It's like the Yann LeCun cake analogy. The reward signal is just the cherry. If you do reward-based fine-tuning versus self-supervision, that was like one episode. Wow. Whoa.
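To make the scalar-versus-image comparison concrete, here is a rough back-of-the-envelope sketch. The 84x84x3 observation shape and the 1,000-step episode length are illustrative assumptions, not figures from the episode, and counting raw target values overstates the useful information because neighboring pixels are highly correlated.

```python
import math

# Hypothetical numbers, chosen only for illustration (not from the episode).
OBS_SHAPE = (84, 84, 3)   # assumed image observation: height, width, channels
EPISODE_LENGTH = 1000     # assumed number of environment steps per episode


def targets_per_step(obs_shape):
    """Number of values a model is asked to predict for one next observation."""
    return math.prod(obs_shape)


# A reward signal supplies a single scalar target per step.
reward_targets_per_step = 1
# A dynamics model that predicts the next image gets one target per pixel value.
dynamics_targets_per_step = targets_per_step(OBS_SHAPE)

reward_targets_per_episode = reward_targets_per_step * EPISODE_LENGTH
dynamics_targets_per_episode = dynamics_targets_per_step * EPISODE_LENGTH
ratio = dynamics_targets_per_episode // reward_targets_per_episode

print(f"reward targets per episode:   {reward_targets_per_episode}")
print(f"dynamics targets per episode: {dynamics_targets_per_episode}")
print(f"ratio (dynamics / reward):    {ratio}")
```

Under these assumptions, a single episode offers tens of thousands of times more prediction targets to a self-supervised dynamics model than to a reward predictor, which is the intuition behind calling the reward "just the cherry".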
