How Do We Evaluate Language Models?

I suppose it's really only in the past decade or so that there has been enough data and compute for researchers to leverage. A big aspect of it is that we can now lend some empirical weight, I think, to some of these questions in a way that maybe we couldn't before.

To some extent, yes, although I'm not sure we have the right metrics for this empirical study. So when you look at language models, how do we evaluate them? How do we say that we're making progress? Well, there are a few different ways.

