Is Chad GPT Really Going to Learn?

A language model is much better at learning about the world. And its outputs aren't quite as good as one would hope or rather as good as they could be, which is why a system like Chad GPT has an additional reinforcement learning training process. We call it reinforcement learning from human feedback. Right now these neural networks, even Chad GPT, makes things up from time to time and that's something that also greatly limits their usefulness. But I'm quite hopeful that by simply improving this subsequent reinforcement learning fromhuman feedback step, we could just teach it to not hallucinate.

Play episode from 18:22

Transcript

Episode notes

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app