

RAG Benchmarks with Nandan Thakur - Weaviate Podcast #124!
Jun 25, 2025
Nandan Thakur, a Ph.D. student at the University of Waterloo, dives deep into Retrieval-Augmented Generation (RAG) and its major benchmarks, BEIR and MIRACL. He discusses the evolution of embedding models and the trade-off between specialization and generalization. The conversation highlights advances in query decomposition, emphasizing new methods for handling complex user queries. Nandan also explores the challenges of summarizing AI search results and the importance of nuanced evaluation in RAG benchmarks for real-world applications.
AI Snips
Origin of BEIR Benchmark
- Nandan Thakur shared how the BEIR benchmark originated to bridge the gap between IR and NLP communities.
- BEIR made it possible to compare models trained on different domains by providing a zero-shot benchmark spanning diverse search tasks (see the evaluation sketch below).
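
To make the zero-shot setup concrete, here is a minimal sketch using the open-source beir package (pip install beir), following the pattern of its README quickstart. The SciFact dataset URL and the msmarco-distilbert-base-tas-b model are illustrative defaults from that README, not something prescribed in the episode.

```python
from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval import models
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalExactSearch as DRES

# Download one BEIR dataset (SciFact is small) and load its test split.
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/scifact.zip"
data_path = util.download_and_unzip(url, "datasets")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")

# Retrieve with an off-the-shelf dense embedding model, zero-shot:
# no fine-tuning on SciFact, mirroring BEIR's out-of-domain evaluation.
model = DRES(models.SentenceBERT("msmarco-distilbert-base-tas-b"), batch_size=64)
retriever = EvaluateRetrieval(model, score_function="dot")
results = retriever.retrieve(corpus, queries)

# Standard IR metrics at the library's default cutoffs (e.g. nDCG@10).
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
print(ndcg)
```

Running the same script against other BEIR datasets is how cross-domain generalization gets compared under identical conditions.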
Challenges of General Embedding Models
- Creating a general-purpose embedding model across many domains is difficult because compressing knowledge from every domain into a single fixed-size vector space is hard.
- Large language models may be a solution, since they inherently carry vast cross-domain knowledge.
Leveraging Query Rewriting
- Query rewriting and decomposition can help handle complex queries by breaking them into smaller searches (see the sketch after this list).
- Ideally, future embedding models will improve enough to answer such queries directly, without rewriting.
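
As a minimal sketch of the decomposition idea: split a complex query into sub-queries, run each as its own search, and merge the results. The decompose_query and search helpers below are hypothetical stand-ins (a real system would prompt an LLM to decompose and call an actual retriever), not any specific library's API.

```python
def decompose_query(query: str) -> list[str]:
    """Split a multi-part question into independent sub-queries.
    Hypothetical stand-in: a real system would prompt an LLM here."""
    if "compare" in query.lower():
        # e.g. "Compare BM25 and dense retrieval on BEIR"
        return ["How does BM25 perform on BEIR?",
                "How does dense retrieval perform on BEIR?"]
    return [query]

def search(sub_query: str, k: int = 5) -> list[tuple[str, float]]:
    """Placeholder retriever returning (doc_id, score) pairs."""
    return [(f"doc-for:{sub_query[:24]}", 1.0)]

def retrieve_with_decomposition(query: str, k: int = 5) -> list[tuple[str, float]]:
    # Run each smaller search, then merge by keeping the max score per document.
    merged: dict[str, float] = {}
    for sub in decompose_query(query):
        for doc_id, score in search(sub, k):
            merged[doc_id] = max(merged.get(doc_id, 0.0), score)
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)[:k]

print(retrieve_with_decomposition("Compare BM25 and dense retrieval on BEIR"))
```

The merge step here uses max-score fusion for simplicity; reciprocal rank fusion or a reranker are common alternatives when combining results from multiple sub-queries.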