What Language Models Can Learn From the Internet

The model is basically an ensemble of all these people who wrote stuff on the internet. When you feed it a prompt, what it's doing internally has to be something like figuring out who wrote the prompt and then trying to continue in that style. It forces the model to determine if things are true or not. I think with RL fine-tuning, there's a lot more potential for the model to output something truthful as opposed to trying to imitate a certain style.
