ICLR 2024 — Best Papers & Talks (Benchmarks, Reasoning & Agents) — ft. Graham Neubig, Aman Sanger, Moritz Hardt

Latent Space: The AI Engineer Podcast

CHAPTER

Evaluating Language Model Contamination

This chapter examines the challenges and advancements in evaluating language models, focusing on benchmark contamination and detection techniques. It discusses the implications of training data contamination for model performance and presents novel statistical methods for identifying such contamination.
