847: AI Engineering 101, with Ed Donner

Super Data Science: ML & AI Podcast with Jon Krohn

LMSYS: A New Approach to AI Evaluation

This chapter covers the rebranded LMSYS leaderboard, now lmarena.ai, which ranks large language models using human evaluations, and introduces 'Outsmart', a competitive game for assessing them. The game shows how the models perform in strategic interactions, including how well they collaborate and compete with one another.
