LessWrong (Curated & Popular)

“About 30% of Humanity’s Last Exam chemistry/biology answers are likely wrong” by bohaska

Jul 30, 2025
A recent analysis suggests that about 30% of the chemistry and biology answers in Humanity's Last Exam, a benchmark billed as PhD-level evaluation material, are likely wrong because they conflict with peer-reviewed literature. The episode discusses an effort to build a validated set of questions using both AI and human expertise. Examples of disputed questions include one that treats oganesson as Earth's rarest noble gas and one concerning the nectar-feeding habits of snakeflies.
INSIGHT

High Error Rate in HLE Answers

  • About 30% of chemistry and biology questions in Humanity's Last Exam conflict with peer-reviewed literature.
  • This points to significant inaccuracies in material presented as PhD-level evaluation content.
INSIGHT

Limited Review Time Affected Accuracy

  • Reviewers were required only to assess questions quickly, not to verify that each answer was fully correct.
  • Because deep validation was not mandatory, this likely contributed to the inaccuracies.
ANECDOTE

Oganesson Misclassified in Question

  • A question incorrectly labeled oganesson as Earth's rarest noble gas, an error at the level of basic trivia.
  • Only a few atoms of oganesson have ever been produced; it is predicted to be a reactive solid and is not part of terrestrial matter.