TechCrunch Startup News

OpenAI’s new reasoning AI models hallucinate more

Apr 22, 2025
OpenAI has launched new AI models, o3 and o4-mini, touted as state-of-the-art. Surprisingly, these models exhibit higher rates of hallucination than their predecessors, raising concerns about their reliability in professional settings. The episode examines why hallucination remains a persistent challenge in AI development, drawing on expert insights and research findings.
INSIGHT

New Reasoning Models Hallucinate More

  • OpenAI's new o3 and o4-mini reasoning models hallucinate more than older models, producing more errors despite better task performance.
  • Scaling reasoning capability may worsen hallucination, posing a significant challenge for AI accuracy.
INSIGHT

Reasoning Boosts Claims and Errors

  • o3's performance improvements in coding and math come with more overall claims, which yields both more accurate claims and more hallucinated ones (see the sketch after these notes).
  • Higher hallucination rates on benchmarks like PersonQA highlight the trade-offs that come with stronger reasoning.
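
A minimal sketch in Python of the claim-count trade-off described above. All figures here are hypothetical, chosen only to illustrate the arithmetic, not taken from OpenAI's reported benchmark results: if a model answers more questions and makes more claims overall, the absolute number of hallucinations can rise even when per-claim accuracy barely changes.

# Hypothetical illustration only: claim counts and accuracy figures are made up.
def claim_breakdown(total_claims: int, per_claim_accuracy: float) -> tuple[int, int, float]:
    """Return (correct_claims, hallucinated_claims, hallucination_rate)."""
    correct = round(total_claims * per_claim_accuracy)
    hallucinated = total_claims - correct
    return correct, hallucinated, hallucinated / total_claims

# Older model: fewer, more cautious claims.
old_correct, old_halluc, old_rate = claim_breakdown(total_claims=100, per_claim_accuracy=0.84)

# Newer reasoning model: attempts more answers, so more claims overall.
new_correct, new_halluc, new_rate = claim_breakdown(total_claims=200, per_claim_accuracy=0.80)

print(f"older model: {old_correct} correct, {old_halluc} hallucinated ({old_rate:.0%})")
print(f"newer model: {new_correct} correct, {new_halluc} hallucinated ({new_rate:.0%})")
# With similar per-claim accuracy, roughly doubling the number of claims roughly
# doubles both the correct answers and the hallucinations a user encounters.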
ANECDOTE

O3 Model Hallucinates Actions

  • Transluce observed o3 claiming to have run code on hardware it does not have access to, a hallucination occurring in its reasoning steps.
  • This example illustrates how the reinforcement learning used for o-series models might exacerbate hallucination issues.