There are two different kinds of prompts that we give to our AI systems. One is an open-ended question, and then there's a second category with a clear answer, like "What is the capital of Canada right now?" These systems, these RLHF schemes, usually rely on having humans rank a bunch of different responses that a language model might give to a question like that. Some of those answers are going to be completely right and should get full reward, and some of them are just going to be laughably wrong. But it turns out that when you fail to account for that distinction, your system ends up treating all of these inputs the same way, which leads to less effective models.
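The distinction being described can be sketched as two different reward rules: verifiable prompts get exact-match scoring, while open-ended prompts fall back to a human preference ranking. This is a minimal illustrative sketch, not any real RLHF library's API; all names here are made up for illustration.

```python
# Hypothetical sketch of the two prompt categories described above.
# "verifiable" prompts have one correct answer; open-ended prompts
# are scored by where humans ranked the response among alternatives.

def score_response(prompt_type, response, reference=None,
                   human_rank=None, num_ranked=None):
    """Return a reward in [0, 1] for a single model response."""
    if prompt_type == "verifiable":
        # Clear-answer prompts: an exactly right answer gets full
        # reward; a laughably wrong one gets zero, rather than a
        # middling rank-based score.
        return 1.0 if response.strip().lower() == reference.strip().lower() else 0.0
    # Open-ended prompts: map human rank 1 (best) .. num_ranked (worst)
    # linearly onto [0, 1].
    return 1.0 - (human_rank - 1) / (num_ranked - 1)

print(score_response("verifiable", "Ottawa", reference="Ottawa"))    # 1.0
print(score_response("verifiable", "Toronto", reference="Ottawa"))   # 0.0
print(score_response("open_ended", "an essay", human_rank=2, num_ranked=4))
```

Treating both categories with the same ranking rule, as the speaker notes, would give a confidently wrong factual answer partial credit it doesn't deserve.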
