Episode 29: Jim Fan, NVIDIA, on foundation models for embodied agents, scaling data, and why prompt engineering will become irrelevant

Generally Intelligent

How to Turn a CLIP Model Into a Reward Function

CLIP is an image-and-text model trained on a huge, internet-scale dataset. It learns associations between these two data modalities through contrastive learning. Here in MineCLIP, we repurpose this idea to make it into a reward function. Given a natural-language prompt, we can now shape the agent towards the described behavior, such as harvesting wool from sheep.
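The idea in this snippet can be illustrated with a minimal sketch: if an image encoder and a text encoder map into the same embedding space, the similarity between what the agent currently sees and the task prompt can serve as a reward signal. Everything below is an assumption for illustration only. The function names `cosine_similarity` and `clip_style_reward` are hypothetical, the three-dimensional vectors stand in for real encoder outputs, and cosine similarity is a common choice for contrastively trained models rather than a confirmed detail of the speaker's system.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # no direction to compare, so give no reward
    return dot / (norm_a * norm_b)


def clip_style_reward(frame_embedding, prompt_embedding):
    """Hypothetical reward: how well the observation matches the text prompt.

    In a real setup both embeddings would come from a pretrained
    image/text model; here they are plain lists of floats.
    """
    return cosine_similarity(frame_embedding, prompt_embedding)


# Toy stand-ins for encoder outputs (assumed values, not real embeddings).
prompt = [1.0, 0.0, 0.0]          # e.g. "harvest wool from sheep"
frame_on_task = [0.9, 0.1, 0.0]   # observation that resembles the prompt
frame_off_task = [0.0, 1.0, 0.0]  # observation unrelated to the prompt

reward_on_task = clip_style_reward(frame_on_task, prompt)
reward_off_task = clip_style_reward(frame_off_task, prompt)
print(reward_on_task > reward_off_task)
```

An agent trained with reinforcement learning against such a signal is pushed towards observations that match the prompt, which is the sense in which the text "shapes" the behavior.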
