
Aravind Srinivas 2
TalkRL: The Reinforcement Learning Podcast
Is It Important to Extrapolate Beyond the Training Data?
I've been thinking about this as supervised learning, and I don't understand how it could ever do better than the training data. But for what it's worth, I want to clarify that that is a consequence of conditioning on the reward, not of using a transformer or anything. If the agent has understood what it even means to achieve a certain score, whether good or bad, you can ask it to get a higher score than any score it's seen in the training data. I just mean this in a funny way, but it's slightly conscious. It was more subconscious; I remembered this, but it was like two years ago when…
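The mechanism being described here is return-conditioned supervised learning: the policy takes a target return as an extra input, and at test time you can condition on a return higher than anything in the dataset. The sketch below is a hypothetical toy illustration, not code from the episode; it uses a nearest-neighbor conditioner on a one-step environment, which shows the conditioning interface but also why naive methods saturate at the best behavior they have seen rather than truly extrapolating.

```python
import random

# Toy one-step environment: action a yields return a.
# The behavior policy only ever picks actions 0 or 1, so
# no return above 1 appears in the training data.
random.seed(0)
dataset = []  # (observed_return, action) pairs
for _ in range(100):
    a = random.choice([0, 1])
    dataset.append((a, a))  # return equals the action in this toy setup

def return_conditioned_policy(target_return):
    """Pick the action whose training return is closest to the target.

    Conditioning on a target above anything seen asks the agent to
    extrapolate; this nearest-neighbor sketch simply falls back to the
    highest-return behavior it has observed.
    """
    best = min(dataset, key=lambda pair: abs(pair[0] - target_return))
    return best[1]

print(return_conditioned_policy(1))  # in-distribution target -> 1
print(return_conditioned_policy(5))  # target beyond training data -> still 1
```

Whether a learned model generalizes past that saturation point, rather than clamping like this sketch, is exactly the open question the episode is discussing.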