

Measuring LLMs with Jodie Burchell
Apr 3, 2025
Dr. Jodie Burchell, a developer advocate in data science at JetBrains and former lead data scientist at Verve Group Europe, discusses how to measure large language models (LLMs). She surveys common benchmarks and stresses the importance of accuracy, reliability, and customization for specific topics. The conversation highlights the challenges of building effective test suites and notes that smaller, targeted models can often outperform larger ones. Jodie also explores the complexities of evaluating AI performance with humor and insight.
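The test suites discussed in the episode boil down to scoring a model's answers against known references. A minimal sketch of such an exact-match evaluation loop is below; the `ask_model` callable and the sample test cases are illustrative assumptions, and real suites would add fuzzy matching, multiple reference answers, and per-topic breakdowns.

```python
from typing import Callable

def evaluate(ask_model: Callable[[str], str],
             test_cases: list[tuple[str, str]]) -> float:
    """Return the fraction of prompts whose answer exactly matches
    the expected reference (case- and whitespace-insensitive)."""
    correct = 0
    for prompt, expected in test_cases:
        answer = ask_model(prompt).strip().lower()
        if answer == expected.strip().lower():
            correct += 1
    return correct / len(test_cases) if test_cases else 0.0

# Hypothetical stub model, for illustration only -- a real run
# would call an actual LLM here.
cases = [("Capital of France?", "Paris"), ("2 + 2 = ?", "4")]
stub = lambda p: {"Capital of France?": "Paris", "2 + 2 = ?": "5"}[p]
print(evaluate(stub, cases))  # 0.5
```

Even this toy harness shows why evaluation is hard: exact matching penalizes correct answers phrased differently, which is one reason targeted, domain-specific test sets matter.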
AI Snips
LLM Hype vs. Reality
- LLMs excel in specific contexts, but generalizing their capabilities is misleading.
- The hype around LLMs has declined, but adoption is still high.
Large vs. Small LLMs
- Smaller, specialized LLMs offer advantages in cost, efficiency, and control over data.
- Larger models may connect dots better, but smaller models can be more accurate within specific domains.
Reinventing the Wheel
- Carl Franklin experienced an LLM reinventing the wheel instead of using existing framework tools.
- This highlights the need for human oversight in software development.