
#54 – Edouard Mathieu on Our World in Data
Hear This Idea
Is There a Benchmark Like ImageNet for Classifying Images?
The Turing test, in some sense, is a kind of informal benchmark for conversational ability. Definitely try coming up with new benchmarks. There's a benchmark for truthfulness, TruthfulQA, but maybe come up with new ones which can scale to really advanced systems. I don't really have a good picture of progress over time on any of these things. And I think something that would be a combined benchmark of what we define to be AGI would be good.