
Max Schwarzer
TalkRL: The Reinforcement Learning Podcast
00:00
Chatshupiti and RL: A Key Component
RL was a key component to Chatshupiti itself and led to a lot higher performance than without RL. Anthropic does know this and just it hasn't really been released in detail. I do think RL is extremely valuable the moment you start to have interaction to use, for example. It seems like a great example of a case where RL would be good.
Transcript
Play full episode