The Magic Behind Language Model Pre-Training

I've been quite excited about few-shot learning and the different places you can apply it. How is it that, despite many tasks not looking like language model prediction, you can still get so much generalization from that task? And also the well-known GPT-3 results with in-context learning. Just understanding where the model gets its bias for pattern continuation and repetition. That's been on my mind, along with learning more about causal inference.
