ChatGPT and InstructGPT: Aligning Language Models to Human Intention

Deep Papers

CHAPTER

Are There Other Major Benefits of RLHF?

The 1 billion parameter model that was trained with RLHF performed on human evaluations roughly similarly to, or a little bit better than, the 175 billion parameter vanilla GPT-3. We often decompose that into sub-dimensions like helpfulness, harmlessness, and honesty. And in fact, at least in this first paper, most of the benefits come from improvements in helpfulness and honesty rather than harmlessness. Ryan, you want to talk about distillation a bit?
