Crowdsourced AI benchmarks have serious flaws, some experts say
Apr 24, 2025
05:29
AI labs are increasingly relying on crowdsourced benchmarking platforms such as Chatbot Arena to probe the strengths and weaknesses of their latest models. But some experts say this approach has serious problems from both an ethical and an academic perspective.