

Atropos Health’s Arjun Mukerji, PhD, Explains RWESummary: A Framework and Test for Choosing LLMs to Summarize Real-World Evidence (RWE) Studies
Sep 22, 2025
Join Arjun Mukerji, PhD, a staff data scientist at Atropos Health, as he walks through RWESummary, a benchmark for evaluating how well large language models summarize real-world evidence studies. Learn how real-world evidence differs from traditional clinical trial data and why robust evaluation metrics matter. Arjun also covers the risks of AI-generated summaries and argues for a human-in-the-loop approach to ensure accuracy.
AI Snips
RWE Fills The Evidence Gap
- Real-world evidence (RWE) complements clinical trials by covering larger, more diverse patient groups, including those excluded from trials.
- RWE matters because only ~14% of daily medical decisions rely on high-quality evidence, which leaves an evidence gap.
LLM Summaries Are Production-Critical
- Atropos Health places an LLM-generated summary prominently at the top of every prognostogram PDF it produces.
- The team treats that summary as vitally important and has run it in production for years.
Structured Prompts Beat PDF Dumps
- Atropos uses structured, tokenizable representations of study outputs instead of dumping full PDFs to LLMs.
- This reduces the need for models to sift long contexts and focuses evaluation on decoding structured results.
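
As a rough illustration of the structured-prompt idea, here is a minimal Python sketch. It is not Atropos Health's actual format: the `OutcomeResult` fields, the `build_prompt` helper, the instruction wording, and all values are hypothetical placeholders chosen only to show how study outputs could be serialized compactly instead of passing a whole PDF to the model.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class OutcomeResult:
    # Hypothetical fields; a real study output would define its own schema.
    outcome: str
    effect_measure: str
    estimate: float
    ci_low: float
    ci_high: float
    p_value: float


def build_prompt(question, cohort_sizes, results):
    """Serialize study outputs as JSON and prepend a short instruction."""
    payload = {
        "question": question,
        "cohorts": cohort_sizes,
        "results": [asdict(r) for r in results],
    }
    instruction = (
        "Summarize the study results below for a clinician. "
        "Report only numbers that appear in the JSON."
    )
    # sort_keys keeps the serialization stable across runs.
    return instruction + "\n\n" + json.dumps(payload, indent=2, sort_keys=True)


# Placeholder values for illustration only, not real study data.
example = build_prompt(
    question="Drug A vs. Drug B for outcome X",
    cohort_sizes={"treated": 1000, "control": 1200},
    results=[OutcomeResult("outcome X", "hazard_ratio", 0.8, 0.6, 1.1, 0.2)],
)
print(example)
```

Because every number the model sees lives in a named field, a reviewer (or an automated check) can compare the figures in a generated summary directly against the payload, which fits the human-in-the-loop approach described in the episode.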