Episode 25: Nicklas Hansen, UCSD, on long-horizon planning and why algorithms don't drive research progress

Generally Intelligent

How Many Samples Do You Need With Rewards?

The amount of supervision you get from a reward signal, which is just a scalar, versus the complexity of images and dynamics if you're learning a dynamics model: it's just much greater in the case of self-supervision. Yeah. It's like the Yann LeCun cake analogy. The reward signal is just the cherry. If you do reward-based fine-tuning versus self-supervision, that was like one episode. Wow. Whoa.
