Generative assessment project: Comparing language models on hallucinations and answering behavior

OpenAI unveiled the generative assessment project, a research initiative to explore strengths and weaknesses of language models. They compare how different models hallucinate answers and hedge instead of giving a direct answer. It's interesting because it's a unique comparison that hasn't been done before, especially with proprietary models.

Play episode from 57:17

Transcript

The AI-powered Podcast Player

Save insights by tapping your headphones, chat with episodes, discover the best highlights - and more!

Get the app