
CICERO: An AI agent that negotiates, persuades, and cooperates with people

Yannic Kilcher Videos (Audio Only)


The Language Model Is a Language Model That Works Like Humans

The anchor policies, those are dialogue-conditional. They always mix the anchor policy with the reinforcement-learned policy, or with the computed policy, in order to get a model that both performs well and plays like humans. So from here, the dialogue comes into this model, and then that information goes up here. But that's very, very indirect. Essentially, the only information that the planning has about the action is: what would a human do in this situation, given this board and this dialogue, right? That's the only information you have about the dialogue. You don't have the input dialogue directly, and your actions don't include what dialogue you're going to send. Here is the only at the output of this planning module
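
To make the "mixing" idea concrete, here is a minimal sketch. It assumes the mix is a KL-regularized blend, where each action's probability is proportional to the anchor probability times `exp(value / lam)`. The transcript does not spell out the formula, so that form is an assumption, and the function name, the action names, and the numbers are all hypothetical.

```python
import math


def kl_regularized_policy(anchor, values, lam):
    """Blend a human-like anchor policy with estimated action values.

    Assumed form: pi(a) is proportional to anchor(a) * exp(values[a] / lam).
    A large lam stays close to the anchor; a small lam favors high-value actions.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    # Work in log space for numerical stability; skip zero-probability actions.
    logits = {a: math.log(p) + values[a] / lam for a, p in anchor.items() if p > 0}
    top = max(logits.values())
    weights = {a: math.exp(logit - top) for a, logit in logits.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


# Hypothetical anchor: what a human would likely do given the board and dialogue.
anchor = {"hold": 0.6, "support": 0.3, "attack": 0.1}
# Hypothetical value estimates coming from the planner.
values = {"hold": 0.0, "support": 0.5, "attack": 1.0}

human_like = kl_regularized_policy(anchor, values, lam=1e6)
value_driven = kl_regularized_policy(anchor, values, lam=0.01)
```

In this sketch the dialogue never appears as an argument. It can only influence the result through `anchor`, which matches the point above that planning sees the dialogue only indirectly.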
