

Georgia Tech's Santosh Vempala Explains Why Language Models Hallucinate, His Research With OpenAI
Oct 14, 2025
Santosh Vempala, a distinguished professor at Georgia Tech, dives deep into the complexities of language models and their notorious ability to hallucinate. He explains how maximum likelihood pre-training can lead to these issues and the crucial trade-offs between memorization and generalization. Through fascinating examples, he discusses how calibration impacts accuracy and presents a formal theorem linking hallucinations to misclassification. Vempala also highlights practical approaches to detect invalid model outputs and shares insights into improving AI evaluation methods.
AI Snips
Concrete Definition Of Hallucination
- Hallucinations are defined as plausible but invalid outputs drawn from a set of possible generations.
- Partitioning the possible generations into valid, invalid, and plausible-but-invalid subsets pins down what counts as a hallucination (formal sketch below).
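One compact way to write that down, with notation assumed here rather than taken from the episode:

```latex
% Illustrative notation (assumed for this sketch, not necessarily the episode's symbols):
%   X : the plausible outputs for a given prompt
%   V : the subset of X that is valid
%   p : the model's output distribution for that prompt
\[
  \mathcal{H} \;=\; \mathcal{X} \setminus \mathcal{V}
  \qquad \text{(plausible but invalid outputs, i.e. hallucinations)}
\]
\[
  \text{hallucination rate} \;=\; \Pr_{y \sim p}\bigl[\, y \in \mathcal{H} \,\bigr].
\]
```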
Pre-Training Encourages Hallucination
- Pre-training encourages models to match the training distribution, which can produce hallucinations when asked about unseen facts.
- Post-training reduces hallucination but can break calibration (a standard reading of calibration is sketched below).
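Calibration can be read in the usual sense, sketched below with assumed notation: among the outputs the model produces with confidence near p, roughly a p fraction should be valid.

```latex
% A standard notion of calibration (assumed reading, not a quote from the episode):
%   V       : the set of valid outputs
%   \hat{p} : the confidence the model attaches to an output y
\[
  \Pr\bigl[\, y \in \mathcal{V} \bigm| \hat{p}(y) = p \,\bigr] \;\approx\; p
  \qquad \text{for every confidence level } p \in [0,1].
\]
```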
Alan Turing Example Shows Calibration Tradeoff
- Santosh gave a toy dataset in which half the training examples repeat Alan Turing's death date and the rest are random name/date pairs that each appear only once.
- A calibrated model fitting that data will hallucinate roughly 50% of the time, emitting name/date pairs that never appeared in training (see the simulation sketch below).
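A rough simulation of that toy setup, offered only as a sketch: every name, date, and dataset size below is made up, and the "model" is just a sampler that mirrors the training mix, putting half its mass on the memorized Turing fact and half on name/date pairs whose true pairings it never learned.

```python
import random

random.seed(0)

# Hypothetical universe of people and dates (all made up for illustration).
names = [f"Person-{i}" for i in range(20_000)]
dates = [f"{y}-{m:02d}-{d:02d}"
         for y in range(1900, 2000) for m in range(1, 13) for d in range(1, 29)]

# Ground truth: each person has exactly one true death date.
truth = {name: random.choice(dates) for name in names}
truth["Alan Turing"] = "1954-06-07"

# Toy training set: half repeats of the Turing fact, half one-off facts.
n_train = 10_000
train = [("Alan Turing", truth["Alan Turing"])] * (n_train // 2)
train += [(name, truth[name]) for name in random.sample(names, n_train // 2)]

# Share of training mass on the heavily repeated fact (0.5 here).
repeated_share = train.count(("Alan Turing", truth["Alan Turing"])) / len(train)

def generate():
    # A toy "calibrated" generator that mirrors the training mix: with the
    # repeated fact's share of mass it reproduces what it memorized; otherwise
    # it emits a plausible-looking name/date pair whose true pairing it never
    # learned (the one-off facts are effectively noise to it).
    if random.random() < repeated_share:
        return ("Alan Turing", truth["Alan Turing"])
    return (random.choice(names), random.choice(dates))

samples = [generate() for _ in range(100_000)]
wrong = sum(1 for name, date in samples if truth[name] != date)
print(f"hallucination rate ~ {wrong / len(samples):.3f}")  # comes out near 0.5
```

Running it prints a hallucination rate near 0.5, matching the fraction of one-off facts in the training data.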