.NET Rocks!

Measuring LLMs with Jodie Burchell

Apr 3, 2025
Dr. Jodie Burchell, a developer advocate in data science at JetBrains and former lead data scientist at Verve Group Europe, discusses measuring large language models (LLMs). She dives into various benchmarks and the importance of accuracy, reliability, and customization for specific topics. The conversation highlights the challenges in building effective test suites and emphasizes that smaller, targeted models can often outperform larger counterparts. Jodie also explores the complexities of evaluating AI performance with humor and insight.
01:01:00

Podcast summary created with Snipd AI

Quick takeaways

  • Evaluating large language models requires structured assessments, such as unit tests and A/B testing, to measure performance accurately.
  • Benchmarks are only useful when they are relevant to the task being assessed, as traditional methods may not reflect a model's task-specific abilities.
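
To make the first takeaway concrete, here is a minimal sketch of unit-test-style evaluation combined with an A/B comparison. Everything in it is an assumption for illustration rather than something described in the episode: the test cases, the substring-match scoring rule, and the two stub functions (`model_a`, `model_b`) that stand in for real LLM calls.

```python
from typing import Callable, List, Tuple

# Hypothetical test cases: (prompt, substring the answer must contain).
CASES: List[Tuple[str, str]] = [
    ("What is the capital of France?", "paris"),
    ("What is 2 + 2?", "4"),
]


def model_a(prompt: str) -> str:
    """Stand-in for a real LLM call; gets the arithmetic question wrong."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "What is 2 + 2?": "2 + 2 equals 5.",
    }
    return canned.get(prompt, "")


def model_b(prompt: str) -> str:
    """Second stand-in model; answers both questions correctly."""
    canned = {
        "What is the capital of France?": "Paris.",
        "What is 2 + 2?": "The answer is 4.",
    }
    return canned.get(prompt, "")


def pass_rate(model: Callable[[str], str], cases: List[Tuple[str, str]]) -> float:
    """Fraction of cases whose expected substring appears in the answer."""
    passed = sum(expected in model(prompt).lower() for prompt, expected in cases)
    return passed / len(cases)


if __name__ == "__main__":
    # A/B comparison: run both models against the same fixed test suite.
    for name, model in [("A", model_a), ("B", model_b)]:
        print(f"model {name}: pass rate {pass_rate(model, CASES):.2f}")
```

Substring matching is a deliberately crude check. A real test suite would need task-specific cases and a scoring rule suited to the topic being measured, which is the point of the second takeaway.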

Deep dives

D-Day and the Birth of Antibiotics

The discussion highlights the significance of D-Day in 1944, marking a pivotal moment in World War II with the largest amphibious invasion in history on the beaches of Normandy. Following the invasion, penicillin was mass-produced for the first time, yielding 2.5 million doses in response to the need for treating injured soldiers. This event underscores how wartime necessities propelled advancements in medicine, particularly the development of antibiotics. The liberation of Paris in August and the subsequent Battle of the Bulge in December further illustrate the Allied forces' relentless efforts against Nazi occupation.
