847: AI Engineering 101, with Ed Donner

Super Data Science: ML & AI Podcast with Jon Krohn

CHAPTER

LMSYS: A New Approach to AI Evaluation

This chapter explores the LMSYS leaderboard, now rebranded as lmarena.ai, which ranks large language models using human evaluations, and introduces a competitive game called 'Outsmart' for assessing them. The discussion covers what these approaches reveal about model performance and strategic interaction, including how well models collaborate and compete.
