
John Schulman
TalkRL: The Reinforcement Learning Podcast
Can Language Models Tell Us When They Don't Know What They Know?
Can these types of agents tell us when they don't know something, or is that a hard problem? I'd say sort of. If you ask a question that's in the core of the model's knowledge, it will know. The training objective strongly incentivizes the model to be calibrated, meaning it has a reasonable estimate of how likely its answer is to be correct. It doesn't feel like an insurmountable problem, but there are some practical difficulties to getting there.
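Calibration, as described here, just means the model's stated confidence should match how often it is actually right. A minimal sketch of how one might check that, assuming you already have per-question confidences and correctness labels (the function name and example data are hypothetical):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Compare stated confidence against empirical accuracy, bin by bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if not mask.any():
            continue
        avg_conf = confidences[mask].mean()   # what the model claimed
        avg_acc = correct[mask].mean()        # how often it was actually right
        ece += mask.mean() * abs(avg_conf - avg_acc)
    return ece

# Hypothetical data: a well-calibrated model's 70%-confidence answers
# should be correct roughly 70% of the time.
conf = [0.9, 0.8, 0.7, 0.6, 0.95, 0.5]
hit = [1, 1, 1, 0, 1, 0]
print(expected_calibration_error(conf, hit))
```

A low value means the model's confidence tracks its accuracy, which is the property the training objective is said to encourage for questions near the core of the model's knowledge.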