

“About 30% of Humanity’s Last Exam chemistry/biology answers are likely wrong” by bohaska
Jul 30, 2025
A recent analysis found that nearly 30% of the chemistry and biology answers in Humanity's Last Exam, a prominent AI benchmark, are likely wrong, raising concerns about the integrity of scientific evaluation. The podcast discusses efforts to create a validated set of questions using both AI and human expertise. Research on the noble gas oganesson showcases its unusual properties, while a look at snakeflies uncovers their nectar-eating habits, challenging long-held beliefs in entomology. Accuracy in science has never felt more critical!
AI Snips
High Error Rate in HLE Answers
- About 30% of the chemistry and biology answers in Humanity's Last Exam conflict with peer-reviewed literature.
- This indicates significant inaccuracies exist in what is considered PhD-level evaluation material.
Limited Review Time Affected Accuracy
- Reviewers were required only to assess questions quickly, not to verify that the answers were fully correct.
- This likely contributed to inaccuracies since deep validation was not mandatory.
Oganesson Misclassified in Question
- A question incorrectly labeled oganesson as Earth's rarest noble gas, which is a trivia-level error.
- Only a few atoms have ever been produced; it is predicted to be a reactive solid, and it is not part of terrestrial matter.