“Frontier AI Models Still Fail at Basic Physical Tasks: A Manufacturing Case Study” by Adam Karvonen

Apr 16, 2025

Adam Karvonen, an AI researcher with a hands-on background in robotics and manufacturing, discusses how advanced AI models fail at basic physical tasks. He examines how even top models struggle with visual perception and physical reasoning in manufacturing, despite some improvements. Karvonen highlights the implications of uneven automation for the job market, suggesting that while white-collar roles may evolve, blue-collar workers could face significant challenges as AI technology outpaces their tasks.

AI Snips

ANECDOTE

Visualizing a Part

Adam Karvonen tested frontier AI models on a manufacturing task.
Older models consistently missed obvious features or hallucinated non-existent ones.

INSIGHT

Benchmark vs. Reality

Many AI models score highly on visual reasoning benchmarks like MMMU.
However, real-world visual tasks, such as those in manufacturing, remain challenging for them.

ANECDOTE

Gemini's Progress

Gemini 2.5 showed significant progress, identifying major features in roughly 25% of attempts.
It still misses subtle details and occasionally hallucinates or misinterprets features.
