Orca: Progressive Learning from Complex Explanation Traces of GPT-4

Deep Papers

CHAPTER

The Importance of General Benchmarking of Language Models

Jasper: One thing that was also interesting from this section, and maybe it's a little bit negative, is that they critique some of these other small language models. If you extend the benchmarks, or include more complicated reasoning tasks in them, ChatGPT actually ends up performing better than these smaller models.

Jasper: How do you benchmark language models for copywriting? What are the datasets for that? It's just very hard. This is obviously true at Harvey, where we're trying to benchmark language models for legal performance, both within specific legal domains and for legal in general.
