
Max Schwarzer
TalkRL: The Reinforcement Learning Podcast
ChatGPT and RL: A Key Component
RL was a key component of ChatGPT itself and led to much higher performance than without RL. Anthropic does know this; it just hasn't really been released in detail. I do think RL is extremely valuable the moment you start to have interaction. Tool use, for example, seems like a great example of a case where RL would be good.