

Can Open-Source LLMs Compete With Proprietary Ones for Complex Diagnoses?
Apr 4, 2025
Arjun K. Manrai, of Harvard Medical School, discusses the findings of a recent study comparing open-source and proprietary large language models on complex medical diagnoses. He explains how institutions can deploy custom open-source models while preserving data privacy. The conversation covers the competitiveness of models like LLaMA 3.1 against GPT-4, the implications for healthcare technology investment, and the role of AI in clinical practice, emphasizing the necessity of human oversight in ensuring diagnostic reliability.
Open-source LLMs Match Proprietary Models
- Open-source large language models have rapidly improved and can now match proprietary models on complex medical diagnosis tasks.
- The study compared Meta's LLaMA 3.1 model to GPT-4 using challenging clinical cases from Massachusetts General Hospital.
Local Open-Source Models Boost Privacy
- Proprietary models like ChatGPT run queries on external servers, raising data privacy concerns in healthcare.
- Open-source models can run locally on hospital infrastructure, offering better security and customization.
Hospital GPUs Can Run Big Models
- Running large open-source models requires significant GPU resources, which some hospitals have but most individuals do not.
- At Brigham and Women's Hospital, 8 A100 GPUs are sufficient to run Meta's LLaMA 3.1, demonstrating feasibility at some institutions.
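The feasibility point above comes down to simple arithmetic: model weights must fit in aggregate GPU memory. A rough sketch of that estimate (our own illustrative numbers, not figures from the episode):

```python
# Back-of-the-envelope memory estimate for hosting a large open-source
# LLM on a local multi-GPU node (illustrative, not from the episode).

def weight_memory_gb(n_params_billion: float, bytes_per_param: float) -> float:
    """GB needed just to hold the model weights."""
    return n_params_billion * bytes_per_param  # 1e9 params x bytes / 1e9 bytes-per-GB

NODE_CAPACITY_GB = 8 * 80  # e.g., eight 80 GB A100s -> 640 GB total

# LLaMA 3.1's largest variant has 405B parameters.
fp16_gb = weight_memory_gb(405, 2)  # 16-bit weights: ~810 GB, exceeds the node
int8_gb = weight_memory_gb(405, 1)  # 8-bit quantized: ~405 GB, fits with headroom

print(f"fp16: {fp16_gb:.0f} GB | int8: {int8_gb:.0f} GB | node: {NODE_CAPACITY_GB} GB")
```

This is why quantization (or a smaller model variant) matters in practice: at full 16-bit precision the weights alone can exceed an 8-GPU node's memory, before accounting for activations and the KV cache.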