
John Schulman

TalkRL: The Reinforcement Learning Podcast

CHAPTER

Can Language Models Tell Us When They Don't Know What They Know?

Can these types of agents tell us when they don't know something, or is that a hard problem? I'd say sort of. If you ask a question that's in the core of the model's knowledge, it will know. The training objective strongly incentivizes the model to be calibrated, meaning it has a reasonable estimate of its own uncertainty. It doesn't feel like an insurmountable problem, but there are some practical difficulties to getting there.
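(This isn't from the episode, but as a rough sketch of what "calibrated" means here: a model's stated confidences should match how often it is actually right. The function name, binning scheme, and example numbers below are all illustrative assumptions.)

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Average |confidence - accuracy| over equal-width confidence bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece, total = 0.0, len(confidences)
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(ok for _, ok in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece

# Hypothetical data: the model said "90% sure" and was right 4 of 5 times, etc.
confs = [0.9, 0.9, 0.9, 0.9, 0.9, 0.6, 0.6, 0.6, 0.3, 0.3]
right = [1,   1,   1,   1,   0,   1,   1,   0,   0,   1]
print(f"ECE: {expected_calibration_error(confs, right):.3f}")
```

A well-calibrated model would drive this error toward zero: when it reports 90% confidence, it should be right about 90% of the time.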

