How AI Is Built

#031 BM25 As The Workhorse Of Search; Vectors Are Its Visionary Cousin

Nov 15, 2024
David Tippett, a search engineer at GitHub with expertise in BM25 and OpenSearch, compares the efficiency of BM25 and vector search for information retrieval. He explains how BM25 refines search by factoring in user expectations and adapting to diverse queries. The conversation highlights the challenges of running vector search at scale, particularly on GitHub's massive dataset. David emphasizes that understanding user intent is crucial for optimizing search results and matters more than chasing cutting-edge technology.