LessWrong (Curated & Popular)

“Frontier AI Models Still Fail at Basic Physical Tasks: A Manufacturing Case Study” by Adam Karvonen

Apr 16, 2025
Adam Karvonen, an AI researcher with a hands-on background in robotics and manufacturing, discusses the critical failures of advanced AI models in basic physical tasks. He examines how even top models struggle with visual perception and physical reasoning in manufacturing, despite some improvements. Karvonen highlights the implications of uneven automation on the job market, suggesting that while white-collar roles may evolve, blue-collar workers could face significant challenges as AI technology outpaces their tasks.
ANECDOTE

Visualizing a Part

  • Adam Karvonen tested frontier AI models on a manufacturing task.
  • Older models consistently missed obvious features or hallucinated non-existent ones.
INSIGHT

Benchmark vs. Reality

  • Many AI models score high on visual reasoning benchmarks like MMMU.
  • However, real-world visual tasks, like manufacturing, remain challenging.
ANECDOTE

Gemini's Progress

  • Gemini 2.5 showed significant progress, identifying major features in roughly 25% of attempts.
  • It still misses subtle details and occasionally hallucinates or misinterprets features.