Clémentine Fourrier

Lead maintainer of Hugging Face’s OpenLLM Leaderboard, a platform for standardizing and reproducing LLM evaluations.

Best podcasts with Clémentine Fourrier

Ranked by the Snipd community

Jul 12, 2024 • 58min

Benchmarks 201: Why Leaderboards > Arenas >> LLM-as-Judge

Clémentine Fourrier, lead maintainer of Hugging Face’s OpenLLM Leaderboard, shares her journey from geology to AI. She discusses the urgent need for standardized benchmarks in model evaluations as traditional metrics become outdated. Clémentine tackles the challenges of creating fair, community-driven assessments while addressing biases and resource limitations. She also highlights innovations like long-context reasoning benchmarks and predicts future advancements in LLM capabilities, emphasizing the importance of calibration for user trust.
