How to Generate a Change to a Model Using OpenAI

The GPT network is trained with feedback from human trainers. They score its responses, and the responses that correspond to undesired answers are negatively reinforced. Here's how this works visually. The network is given the question "Who was the first person to walk on the moon?" A bunch of activation levels percolate through the network, and the output nodes produce "Frank Zappa." When a trainer comes along and says no, all of the connections that led to "Frank Zappa" are weakened.
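The idea of weakening the connections behind a rejected answer can be sketched in a few lines of Python. This is only a toy illustration, not how OpenAI trains GPT: the three-answer "network", the input encoding, the weights, the learning rate, and the `penalize` helper are all hypothetical, and the update rule is a simple policy-gradient step with a reward of -1. Production systems are considerably more involved (for example, they typically learn a reward model from human rankings).

```python
import math

# Hypothetical output nodes for the moon-landing question.
ANSWERS = ["Neil Armstrong", "Buzz Aldrin", "Frank Zappa"]

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(weights, x):
    """One linear layer: each output node sums its weighted inputs."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(logits)

def penalize(weights, x, answer_idx, lr=0.5):
    """Return new weights after a trainer rejects the answer at answer_idx.

    Uses reward = -1 with the gradient of log p(answer) for a softmax layer.
    Connections from active inputs into the rejected answer shrink;
    connections into the other answers grow slightly.
    """
    probs = forward(weights, x)
    reward = -1.0
    new_weights = []
    for k, row in enumerate(weights):
        grad_logit = (1.0 if k == answer_idx else 0.0) - probs[k]
        new_weights.append(
            [w + lr * reward * grad_logit * xi for w, xi in zip(row, x)]
        )
    return new_weights

# Toy encoding of the question: two active input nodes, one inactive.
x = [1.0, 1.0, 0.0]

# Made-up starting weights that wrongly favor "Frank Zappa".
weights = [
    [0.2, 0.1, 0.3],  # Neil Armstrong
    [0.1, 0.1, 0.3],  # Buzz Aldrin
    [0.9, 0.8, 0.3],  # Frank Zappa
]

bad = ANSWERS.index("Frank Zappa")
before = forward(weights, x)
updated = penalize(weights, x, bad)   # the trainer says "no"
after = forward(updated, x)

print(f"P(Frank Zappa) before: {before[bad]:.3f}, after: {after[bad]:.3f}")
```

Only the connections from inputs that were active for this question change, which mirrors the description above: the weights that "led to" the rejected answer are reduced, while the connection from the inactive input is left alone. Repeating the update would keep lowering the probability of the rejected answer.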
