Dr. Jodie Burchell, a developer advocate in data science at JetBrains and former lead data scientist at Verve Group Europe, discusses measuring large language models (LLMs). She dives into various benchmarks and the importance of accuracy, reliability, and customization for specific topics. The conversation highlights the challenges in building effective test suites and emphasizes that smaller, targeted models can often outperform larger counterparts. Jodie also explores the complexities of evaluating AI performance with humor and insight.
Evaluating large language models requires structured assessments such as unit tests, manual review, and A/B testing to measure performance accurately.
Benchmarks for assessing AI models must be chosen carefully, as traditional ones may not reflect a model's ability to perform specific tasks.
Targeting an LLM to the specific topic area it will be used for can yield better results; sometimes a smaller model is the better choice.
Deep dives
D-Day and the Birth of Antibiotics
The discussion highlights the significance of D-Day in 1944, marking a pivotal moment in World War II with the largest amphibious invasion in history on the beaches of Normandy. Following the invasion, penicillin was mass-produced for the first time, yielding 2.5 million doses in response to the need for treating injured soldiers. This event underscores how wartime necessities propelled advancements in medicine, particularly the development of antibiotics. The liberation of Paris in August and the subsequent Battle of the Bulge in December further illustrate the Allied forces' relentless efforts against Nazi occupation.
Astronauts' Remarkable Recovery
A notable achievement in the space industry was the successful recovery of astronauts aboard the International Space Station. After concerns about the reliability of a spacecraft, two experienced astronauts opted to extend their mission instead of returning home, illustrating the adaptability required of crews in space. Recovery for astronauts who have spent prolonged periods in orbit can take over a year, highlighting the physical challenges that follow such missions. This segment reflects the changing landscape of space travel and the complexities of managing astronaut health and mission logistics.
Simplifying .NET Exception Handling
A new NuGet package named Symbol was introduced, aimed at improving .NET development by bundling symbol files with deployed applications. It lets developers retain crucial debugging information, such as the line numbers of exceptions thrown in production environments. Traditionally, exceptions logged from release builds lack specific line references, which complicates troubleshooting. By improving visibility into where errors occur, the tool significantly improves the debugging experience in high-pressure production settings.
Evaluating Language Models
The conversation addresses the challenges of evaluating large language models (LLMs) and the need for focused assessments during development. Because LLMs are often adopted as solutions without their effectiveness being measured, it is crucial to establish a system of unit tests, manual evaluations, and A/B testing to assess their performance accurately. Current practice involves examining the outputs these models generate to understand their reliability across tasks. By prioritizing this structured evaluation, developers can ensure that an LLM meets the specific requirements of its application and mitigate the risks of misinformation.
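The unit-test style of evaluation described above can be sketched in a few lines of Python. This is a minimal illustration, not a method from the episode: `call_model` is a hypothetical stand-in for a real LLM client, and the prompts and checks are invented examples.

```python
# Minimal sketch: treating LLM evaluation like a unit-test suite.
# `call_model` is a hypothetical placeholder for a real LLM API call.

def call_model(prompt: str) -> str:
    # Stub model so the sketch runs offline; swap in a real client here.
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "Is 7 a prime number? Answer yes or no.": "Yes.",
    }
    return canned.get(prompt, "")

# Each case pairs a prompt with a check on the output,
# mirroring an assertion in a conventional unit test.
TEST_CASES = [
    ("What is the capital of France?",
     lambda out: "paris" in out.lower()),
    ("Is 7 a prime number? Answer yes or no.",
     lambda out: out.lower().startswith("yes")),
]

def run_suite(model, cases):
    results = [(prompt, check(model(prompt))) for prompt, check in cases]
    passed = sum(ok for _, ok in results)
    return passed, len(results), results

passed, total, _ = run_suite(call_model, TEST_CASES)
print(f"{passed}/{total} checks passed")
```

Because model outputs vary, real suites typically check properties of the response (keywords, format, length) rather than exact strings, and rerun each prompt several times to estimate consistency.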
Benchmarks and AI Model Performance
There is growing discussion of the benchmarks used to assess AI models, particularly their relevance and effectiveness. Many traditional benchmarks may not accurately reflect a model's ability to perform specific tasks, as they often include flawed questions that provide no meaningful insight. This highlights the risk of relying solely on standardized assessments without context-specific understanding. Understanding how models behave in practical applications, rather than only against metrics, fosters a more comprehensive approach to developing and deploying AI solutions.
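To make the point about flawed benchmark questions concrete, here is a small hedged sketch in Python. All the data is invented for illustration; it simply shows how ambiguous or mislabeled items can drag down a reported score.

```python
# Illustration only: how flawed benchmark items can distort a score.
# Each record says whether the model answered "correctly" per the
# answer key, and whether the item itself is flawed (invented data).

benchmark = [
    {"id": 1, "correct": True,  "flawed": False},
    {"id": 2, "correct": False, "flawed": False},
    {"id": 3, "correct": False, "flawed": True},   # ambiguous question
    {"id": 4, "correct": True,  "flawed": False},
    {"id": 5, "correct": False, "flawed": True},   # mislabeled answer key
]

def accuracy(items):
    return sum(i["correct"] for i in items) / len(items)

raw = accuracy(benchmark)                                    # every item
clean = accuracy([i for i in benchmark if not i["flawed"]])  # flawed removed

print(f"raw accuracy:   {raw:.0%}")    # 40%
print(f"clean accuracy: {clean:.0%}")  # 67%
```

The gap between the two numbers is the point: a headline benchmark score bakes in the quality of the questions themselves, which is why task-specific review of the items matters.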
How do you measure the quality of a large language model? Carl and Richard talk to Dr. Jodie Burchell about her work measuring large language models for accuracy, reliability, and consistency. Jodie talks about the variety of benchmarks that exist for LLMs and the problems they have. A broader conversation about quality digs into the idea that LLMs should be targeted to the particular topic area they are being used for - often, smaller is better! Building a good test suite for your LLM is challenging but can increase your confidence that the tool will work as expected.