
From science fiction to reality: Exploring the potential of LLMs
The Ticket: Discover the Future of Customer Service, Support, and Experience, with Intercom
The Human Reinforcement Learning Paradigm
The big breakthrough in the last few years is this InstructGPT thing. You train your model on all the data on the internet, and you come up with something that's not aligned. So you go through a fine-tuning or an alignment or an instruction-tuning phase, or whatever you want to call it, where you give it lots of examples of good behavior and bad behavior. Then you adjust the model weights. This is the human reinforcement learning.
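To make "give it examples of good and bad behavior, then adjust the weights" concrete, here is a minimal sketch of one preference-based update. It is a toy, not the InstructGPT pipeline: the linear scorer, the three-number feature vectors, and the names `score` and `preference_step` are all assumptions made up for illustration. It only shows the idea of nudging weights so the preferred example scores higher than the rejected one, using a pairwise logistic loss.

```python
import math

def score(weights, features):
    # Hypothetical linear scorer standing in for a model's judgment of a response.
    return sum(w * x for w, x in zip(weights, features))

def preference_step(weights, good, bad, lr=0.1):
    # One gradient-descent step on the pairwise logistic loss
    # -log(sigmoid(score(good) - score(bad))).
    margin = score(weights, good) - score(weights, bad)
    grad_scale = 1.0 / (1.0 + math.exp(margin))  # equals 1 - sigmoid(margin)
    return [w + lr * grad_scale * (g - b) for w, g, b in zip(weights, good, bad)]

# Made-up (good, bad) pairs; the features are illustrative only,
# e.g. [polite, on_topic, rude].
pairs = [
    ([1.0, 1.0, 0.0], [0.0, 1.0, 1.0]),
    ([1.0, 0.0, 0.0], [0.0, 0.0, 1.0]),
]

weights = [0.0, 0.0, 0.0]
for _ in range(100):
    for good, bad in pairs:
        weights = preference_step(weights, good, bad)
```

After these updates the scorer ranks each good example above its bad counterpart. A real system would apply this kind of signal to a large neural network rather than three weights, and would typically combine it with a separate reinforcement learning stage.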