The Ticket: Discover the Future of Customer Service, Support, and Experience, with Intercom

From science fiction to reality: Exploring the potential of LLMs

The Human Reinforcement Learning Paradigm

The big break in our advance over the last few years is this sort of InstructGPT thing. You train your model on all the data on the internet, and you come up with something that's not aligned. So you go through a fine-tuning or an alignment or an instruction-tuning phase, or whatever you want to call it, where you give it lots of examples of good behavior and bad behavior, and then you adjust the model weights. This is the human reinforcement learning.
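
To make the "examples of good and bad behavior, then adjust the weights" idea concrete, here is a toy sketch in Python. It is not how InstructGPT is implemented: it swaps the neural network for a tiny linear scorer and uses a simple pairwise logistic (Bradley-Terry style) update, which is only one ingredient of preference-based tuning. The feature names, the example pairs, and the function names are all hypothetical, chosen purely for illustration.

```python
import math


def score(weights, features):
    """Linear score of a response described by a feature vector."""
    return sum(w * x for w, x in zip(weights, features))


def train_on_preferences(pairs, n_features, lr=0.5, epochs=200):
    """Nudge the weights so each preferred response outscores the rejected one.

    pairs: list of (good_features, bad_features) tuples.
    Performs gradient ascent on log(sigmoid(score(good) - score(bad))).
    """
    weights = [0.0] * n_features
    for _ in range(epochs):
        for good, bad in pairs:
            margin = score(weights, good) - score(weights, bad)
            p_good = 1.0 / (1.0 + math.exp(-margin))
            # The step shrinks as the scorer already prefers the good response.
            step = lr * (1.0 - p_good)
            for i in range(n_features):
                weights[i] += step * (good[i] - bad[i])
    return weights


# Hypothetical features per response: [is_helpful, is_rude]
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),  # helpful preferred over rude
    ([1.0, 0.0], [0.0, 0.0]),  # helpful preferred over neutral
    ([0.0, 0.0], [0.0, 1.0]),  # neutral preferred over rude
]
weights = train_on_preferences(pairs, n_features=2)
```

After training, the weight on the hypothetical `is_helpful` feature ends up positive and the weight on `is_rude` ends up negative, so every preferred response scores higher than its rejected counterpart. In a real system the same kind of preference signal is used to adjust the weights of a large language model rather than two numbers.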
