
Karol Hausman and Fei Xia

TalkRL: The Reinforcement Learning Podcast


How to Scale a Language Model to a Larger Scale

SayCan was using a language model called FLAN, which is about a 137-billion-parameter model. When we switched it to a larger language model, the Pathways Language Model (PaLM), a lot of the planning mistakes that we were seeing at the smaller scale went away. It can say, "I don't like Coke, so bring something else. Sprite is not Coke, so I will bring Sprite instead." And then it will generate a plan to bring a Sprite instead. So this type of emergent capability is super interesting for us to see and super exciting for us, and surprises us a lot.

