ChatGPT and RL: A Key Component

RL was a key component of ChatGPT itself and led to much higher performance than without RL. Anthropic does know this; it just hasn't really been released in detail. I do think RL is extremely valuable the moment you start to have interaction, for example. That seems like a great example of a case where RL would be good.

Play episode from 01:07:06
