Max Schwarzer

TalkRL: The Reinforcement Learning Podcast

ChatGPT and RL: A Key Component

RL was a key component of ChatGPT itself and led to much higher performance than without RL. Anthropic does know this; it just hasn't really been released in detail. I do think RL is extremely valuable the moment you start to have interaction, tool use, for example. It seems like a great example of a case where RL would be good.
