
 LessWrong (Curated & Popular)
 “Frontier AI Models Still Fail at Basic Physical Tasks: A Manufacturing Case Study” by Adam Karvonen
 Apr 16, 2025
 Adam Karvonen, an AI researcher with a hands-on background in robotics and manufacturing, discusses how advanced AI models fail at basic physical tasks. He examines how even top models struggle with visual perception and physical reasoning in manufacturing, despite some improvements. Karvonen also highlights the implications of uneven automation for the job market, suggesting that while white-collar roles may evolve, blue-collar workers could face significant challenges as AI technology outpaces their tasks.
 AI Snips 
Visualizing a Part
- Adam Karvonen tested frontier AI models on a manufacturing task.
- Older models consistently missed obvious features or hallucinated non-existent ones.
Benchmark vs. Reality
- Many AI models score high on visual reasoning benchmarks like MMMU.
- However, real-world visual tasks, like manufacturing, remain challenging.
Gemini's Progress
- Gemini 2.5 showed significant progress, identifying major features in roughly 25% of attempts.
- It still misses subtle details and occasionally hallucinates or misinterprets features.
