
Don't Worry About the Vase Podcast Give Me a Reason(ing Model)
Jun 10, 2025
This discussion critiques recent claims about AI reasoning, examining limitations attributed to popular models like Claude and DeepSeek R1, which are said to prioritize pattern memorization over true problem-solving. It challenges flawed research designs and evaluation metrics in the field, using the Tower of Hanoi as a case study of where complex problems break down. The episode also explores the relationship between human and AI reasoning, the societal impact on employment, and the importance of precise language when discussing intelligence.
AI Snips
Context Length Limits Reasoning
- The failure of LLMs on complex reasoning tasks often stems from inherent context length limitations rather than a lack of reasoning ability.
- Humans likewise struggle, or need more time, when their working memory or available context is similarly constrained.
Evaluation Flaws and Token Limits
- Evaluations that demand exhaustive output sequences unfairly penalize models that use abstraction or recursive reasoning.
- Token budget constraints guarantee failure on exponential tasks like Tower of Hanoi at higher disk counts, since the required move sequence grows as 2^n − 1.
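The exponential blow-up described above can be made concrete with a short sketch. The tokens-per-move figure and the 64k output budget below are illustrative assumptions, not measured values for any particular model:

```python
# Sketch: why enumerating every Tower of Hanoi move exhausts a fixed token
# budget. tokens_per_move = 8 is a hypothetical estimate for illustration.

def hanoi_moves(n: int) -> int:
    """Minimum number of moves to solve Tower of Hanoi with n disks: 2^n - 1."""
    return 2**n - 1

def fits_in_budget(n: int, token_budget: int, tokens_per_move: int = 8) -> bool:
    """Can a full move-by-move solution fit in the output token budget?"""
    return hanoi_moves(n) * tokens_per_move <= token_budget

budget = 64_000  # assumed 64k-token output limit, for illustration
for n in (5, 10, 15, 20):
    print(f"{n:2d} disks: {hanoi_moves(n):>9,} moves, "
          f"fits in budget: {fits_in_budget(n, budget)}")
```

At 15 disks the full enumeration already exceeds the assumed budget, so a model forced to write out every move must fail regardless of whether it understands the recursive solution.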
Models' Early Give-Up Behavior
- LLMs prioritize efficiency by giving up early on problems too complex for their token budget.
- This avoidance looks like a learned heuristic rather than evidence of a genuine inability to reason.
