
ChatGPT and InstructGPT: Aligning Language Models to Human Intention
Deep Papers
Are There Other Major Benefits of RLHF?
The 1 billion parameter model that was trained with RLHF performed roughly similarly on human evaluations, or a little bit better than, the 175 billion parameter vanilla GPT-3. We often decompose that into sub-dimensions like helpfulness, harmlessness, and honesty. And in fact, at least in this first paper, most of the benefits come from improvements in helpfulness and honesty rather than harmlessness. Ryan, you want to talk about distillation a bit?