
Karol Hausman and Fei Xia

TalkRL: The Reinforcement Learning Podcast


Does the Inner Monologue Generalize to a New Task?

CLIPort doesn't leverage the rich knowledge present in large language models. In our work, the generalization mainly comes from the language model. I mention the scratchpad because there is some relevant work in the NLP community that can inspire the inner monologue. Every time we decode, for example, an action, it consumes all previous history steps as a prompt.
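The last point, that each decoding step consumes the whole history as its prompt, can be illustrated with a minimal sketch. This is not the speakers' implementation: `run_inner_monologue`, `scripted_planner`, the `Human:`/`Robot:`/`Scene:` labels, and the feedback string are all hypothetical, and a scripted stub stands in for the language model.

```python
from typing import Callable, List


def run_inner_monologue(
    instruction: str,
    propose_action: Callable[[str], str],
    get_feedback: Callable[[str], str],
    max_steps: int = 5,
) -> List[str]:
    """Closed-loop planning where each step re-reads the full history."""
    history = [f"Human: {instruction}"]
    for _ in range(max_steps):
        # The prompt is the entire history so far, not just the last step.
        prompt = "\n".join(history) + "\nRobot:"
        action = propose_action(prompt)
        history.append(f"Robot: {action}")
        if action == "done":
            break
        # Feedback from the environment is appended as text as well.
        history.append(f"Scene: {get_feedback(action)}")
    return history


def scripted_planner(prompt: str) -> str:
    """Hypothetical stand-in for a language model: follows a fixed plan."""
    plan = ["pick up the sponge", "place the sponge in the sink", "done"]
    steps_taken = prompt.count("\nRobot: ")  # completed robot turns so far
    return plan[min(steps_taken, len(plan) - 1)]


if __name__ == "__main__":
    log = run_inner_monologue(
        "put the sponge in the sink",
        scripted_planner,
        lambda action: "success",  # assumed feedback signal
    )
    print("\n".join(log))
```

Because the prompt is rebuilt from the full history at every step, the planner sees earlier actions and their feedback each time it chooses the next action.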
