
Deep Papers

Latest episodes

Apr 4, 2025 • 26min

AI Benchmark Deep Dive: Gemini 2.5 and Humanity's Last Exam

Dive into the advancements of Google's Gemini 2.5 as it tackles Humanity's Last Exam, showcasing its impressive reasoning and multimodal capabilities. Discover how this AI model outperforms rivals in key benchmarks and the complexities it faces in expert-level problem-solving. The discussion also highlights the significance of traditional benchmarks and the ongoing debate about model optimization versus overall performance. Finally, learn about the community's role in shaping the future of AI evaluation and collaboration.
Mar 25, 2025 • 15min

Model Context Protocol (MCP)

We cover Anthropic’s groundbreaking Model Context Protocol (MCP). Though it was released in November 2024, we've been seeing a lot of hype around it lately, and thought it was well worth digging into. Learn how this open standard is revolutionizing AI by enabling seamless integration between LLMs and external data sources, fundamentally transforming them into capable, context-aware agents. We explore the key benefits of MCP, including enhanced context retention across interactions, improved interoperability for agentic workflows, and the development of more capable AI agents that can execute complex tasks in real-world environments.

Learn more about AI observability and evaluation, join the Arize AI Slack community or get the latest on LinkedIn and X.
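Under the hood, MCP messages are plain JSON-RPC 2.0. As a rough illustration of the wire format (the tool name and arguments below are hypothetical, not from any real server), a client asking an MCP server to invoke a tool sends a `tools/call` request:

```python
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking an MCP server to invoke a tool.

    MCP requests carry an id, a method such as "tools/call", and params
    with the tool's name and its arguments.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Hypothetical call: ask a server to run a "search_docs" tool.
request = make_tool_call(1, "search_docs", {"query": "retrieval latency"})
print(json.dumps(request, indent=2))
```

A real client would send this over one of MCP's transports (stdio or HTTP) and match the server's response by `id`; the sketch only shows the message shape.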
Mar 1, 2025 • 30min

AI Roundup: DeepSeek’s Big Moves, Claude 3.7, and the Latest Breakthroughs

This podcast explores cutting-edge AI developments, including DeepSeek's launch of FlashMLA, a revolutionary decoding kernel for NVIDIA GPUs. It also dives into Claude 3.7, showcasing its hybrid reasoning capabilities and improvements in AI coding assistance. The discussion highlights DeepSeek's new DeepEP communication library and the strategic optimizations for server efficiency. With a focus on benchmarking AI innovations and open-source advancements, listeners gain insights into the latest trends that are shaping the future of artificial intelligence.
Feb 21, 2025 • 30min

How DeepSeek is Pushing the Boundaries of AI Development

Discover the remarkable advancements in AI with DeepSeek, particularly its groundbreaking inference speed. The team discusses the evolution of AI reasoning and the innovative use of reinforcement learning techniques. Dive into the challenges and triumphs of local deployment, along with the playful nature of these models. A live demo showcases practical applications like sentiment analysis and topic modeling, revealing the fine-tuning capabilities of the DeepSeek model. Explore the exciting future of AI shaped by major tech investments.
Feb 4, 2025 • 30min

Multiagent Finetuning: A Conversation with Researcher Yilun Du

We talk to Google DeepMind Senior Research Scientist (and incoming Assistant Professor at Harvard) Yilun Du about his latest paper, "Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains." This paper introduces a multiagent finetuning framework that enhances the performance and diversity of language models by employing a society of agents with distinct roles, improving feedback mechanisms and overall output quality. The method enables autonomous self-improvement through iterative finetuning, achieving significant performance gains across various reasoning tasks. It's versatile, applicable to both open-source and proprietary LLMs, and can integrate with human-feedback-based methods like RLHF or DPO, paving the way for future advancements in language model development. Read an overview on the blog, or watch the full discussion.
Jan 14, 2025 • 25min

Training Large Language Models to Reason in Continuous Latent Space

The discussion highlights recent advancements in AI, including NVIDIA's innovations and a new platform for robotics. A standout topic is the groundbreaking Coconut method, which allows large language models to reason in a continuous latent space, breaking away from traditional language constraints. This innovative approach promises to enhance the efficiency and performance of AI systems, making reasoning more fluid and adaptable. Stay tuned for insights into the interconnected future of AI!
Dec 23, 2024 • 29min

LLMs as Judges: A Comprehensive Survey on LLM-Based Evaluation Methods

Explore the fascinating world of large language models as judges. Discover their benefits over traditional methods, including enhanced accuracy and consistency. Delve into the various evaluation methodologies and the crucial role human evaluators play. Learn about techniques for improving model performance and the applications in summarization and retrieval-augmented generation. The discussion also highlights significant limitations and ethical concerns, emphasizing the need for audits and domain expertise to ensure responsible AI use.
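In practice, an LLM judge is simply a model prompted with a rubric and asked for a structured verdict that can be parsed programmatically. A minimal sketch of that loop, where the rubric wording and the judge's reply are illustrative rather than taken from any particular paper:

```python
import re

RUBRIC = (
    "You are an impartial judge. Rate the answer to the question "
    "on a 1-5 scale for accuracy and consistency. "
    "Reply in the form: Score: <n>. Reason: <one sentence>."
)

def build_judge_prompt(question, answer):
    """Combine the rubric with the item under evaluation."""
    return f"{RUBRIC}\n\nQuestion: {question}\nAnswer: {answer}"

def parse_score(judge_reply):
    """Extract the 1-5 score from the judge's reply, or None if absent."""
    match = re.search(r"Score:\s*([1-5])", judge_reply)
    return int(match.group(1)) if match else None

# A hypothetical judge reply (in a real pipeline this comes from an LLM call):
reply = "Score: 4. Reason: Mostly accurate, one minor omission."
print(parse_score(reply))  # 4
```

The parsing step is where consistency is won or lost: pinning the judge to a fixed output format makes its verdicts machine-readable and auditable, which the survey's concerns about audits and reliability hinge on.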
Dec 10, 2024 • 29min

Merge, Ensemble, and Cooperate! A Survey on Collaborative LLM Strategies

Discover how collaborative strategies can enhance the efficiency of large language models. The discussion dives into potential methods like merging, ensemble, and cooperation, emphasizing their unique strengths. Learn about the impressive open-source OLMo 2 model and its implications for transparency in AI. The podcast also tackles the innovative Pareto frontier metric for evaluating performance, alongside the importance of reflection phases in multi-step agents to optimize their outputs. Tune in for insights that bridge collaboration and AI advancements!
Nov 23, 2024 • 25min

Agent-as-a-Judge: Evaluate Agents with Agents

Discover the innovative 'Agent-as-a-Judge' framework, where agents grade each other’s performance, offering a refreshing take on evaluation. Traditional methods often miss the mark, but this approach promises continuous feedback throughout tasks. Dive into the development of the DevAI benchmarking dataset aimed at real-world coding evaluations. Compare the capabilities of new agents against traditional ones and witness how scalable self-improvement could revolutionize performance measurement!
Nov 12, 2024 • 30min

Introduction to OpenAI's Realtime API

We break down OpenAI’s Realtime API. Learn how to seamlessly integrate powerful language models into your applications for instant, context-aware responses that drive user engagement. Whether you’re building chatbots, dynamic content tools, or enhancing real-time collaboration, we walk through the API’s capabilities, potential use cases, and best practices for implementation.
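The Realtime API is event-driven over a WebSocket: instead of one-shot HTTP requests, the client and server exchange typed JSON events. A minimal sketch of constructing two common client events (the event type names follow OpenAI's published client events; the session instructions here are purely illustrative, and no connection is actually opened):

```python
import json

def session_update(instructions):
    # Configure the live session, e.g. system-style instructions.
    return {"type": "session.update", "session": {"instructions": instructions}}

def response_create():
    # Ask the server to start generating a response for the current conversation.
    return {"type": "response.create"}

events = [
    session_update("You are a concise assistant."),
    response_create(),
]
for event in events:
    print(json.dumps(event))  # each event would be sent as one WebSocket text frame
```

A real client would open the WebSocket, send these frames, and then consume the stream of server events (audio deltas, text deltas, completion notices) as they arrive.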
