Evaluating GDPVal, OpenAI's Eval for Economic Value

Justified Posteriors

Can automated graders replace human evaluators?

Andrey discusses how often automated graders agree with human evaluators (65% concordance) compared with how often human evaluators agree with each other (70%), and the potential limits of automated grading.

Play episode from 40:20