

Can Open-Source LLMs Compete With Proprietary Ones for Complex Diagnoses?
Apr 4, 2025
Arjun K. Manrai, of Harvard Medical School, discusses the findings of a recent study comparing open-source and proprietary large language models on complex medical diagnoses. He explains how institutions can deploy custom open-source models while preserving data privacy. The conversation covers the competitiveness of models like LLaMA 3.1 against GPT-4, the implications for healthcare technology investment, and the role of AI in clinical practice, emphasizing the necessity of human oversight in ensuring diagnostic reliability.
Open-source LLMs Match Proprietary Models
- Open-source large language models have rapidly improved and can now match proprietary models on complex medical diagnosis tasks.
- The study compared Meta's LLaMA 3.1 model to GPT-4 using challenging clinical cases from Massachusetts General Hospital.
Local Open-Source Models Boost Privacy
- Proprietary models like ChatGPT run queries on external servers, raising data privacy concerns in healthcare.
- Open-source models can run locally on hospital infrastructure, offering better security and customization.
Hospital GPUs Can Run Big Models
- Running large open-source models requires significant GPU resources, which some hospitals have but most individuals do not.
- At Brigham and Women's Hospital, 8 A100 GPUs are sufficient to run Meta's LLaMA 3.1, demonstrating feasibility at some institutions.
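The feasibility point above comes down to simple arithmetic: model weights must fit in aggregate GPU memory. A rough sketch of that estimate (our own illustrative numbers, not figures from the episode):

```python
# Back-of-the-envelope memory estimate for hosting a large open-source
# LLM on a local multi-GPU node (illustrative, not from the episode).

def weight_memory_gb(n_params_billion: float, bytes_per_param: float) -> float:
    """GB needed just to hold the model weights."""
    return n_params_billion * bytes_per_param  # 1e9 params x bytes / 1e9 bytes-per-GB

NODE_CAPACITY_GB = 8 * 80  # e.g., eight 80 GB A100s -> 640 GB total

# LLaMA 3.1's largest variant has 405B parameters.
fp16_gb = weight_memory_gb(405, 2)  # 16-bit weights: ~810 GB, exceeds the node
int8_gb = weight_memory_gb(405, 1)  # 8-bit quantized: ~405 GB, fits with headroom

print(f"fp16: {fp16_gb:.0f} GB | int8: {int8_gb:.0f} GB | node: {NODE_CAPACITY_GB} GB")
```

This is why quantization (or a smaller model variant) matters in practice: at full 16-bit precision the weights alone can exceed an 8-GPU node's memory, before accounting for activations and the KV cache.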