
Max Schwarzer

TalkRL: The Reinforcement Learning Podcast

CHAPTER

ChatGPT and RL: A Key Component

RL was a key component of ChatGPT itself and led to much higher performance than without RL. Anthropic does know this; it just hasn't really been released in detail. I do think RL is extremely valuable the moment you start to have interaction, tool use for example. It seems like a great example of a case where RL would be good.
