Vector Podcast

Debunking myths of vector search and LLMs with Leo Boytsov

Jan 17, 2025
In this conversation, Leo Boytsov, a Senior Research Scientist at AWS AI Labs and an expert in vector search algorithms, shares insights from current search research. He discusses the evolution of retrieval algorithms, the challenges of handling large documents, and how non-metric spaces can improve similarity representation. Leo also covers the potential of combining traditional and modern search methods, and the serendipitous discoveries shaping new industries in AI.
ANECDOTE

Leo's Career Journey

  • Leo Boytsov's career began in finance but shifted to his passion, retrieval algorithms.
  • He worked at Yandex and PubMed, earned a PhD at CMU, and now does research at AWS, focusing on question answering.
INSIGHT

Sparse vs. Dense Vectors

  • Combining sparse and dense vector representations can improve retrieval quality.
  • However, dense vectors have limitations with long documents and diverse vocabularies.
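
One common way to combine the two signals is to normalize each retriever's scores and interpolate them. The sketch below is a minimal illustration of that idea, not the method discussed in the episode: the function names, the `alpha` mixing weight, and the example scores are all assumptions made up for this example.

```python
def min_max(scores):
    """Rescale a {doc_id: score} dict to the [0, 1] range."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(sparse_scores, dense_scores, alpha=0.5):
    """Linearly interpolate normalized sparse and dense scores.

    alpha is a hypothetical mixing weight: 1.0 is sparse only, 0.0 is dense only.
    A document missing from one result list contributes 0 for that signal.
    """
    sparse_n, dense_n = min_max(sparse_scores), min_max(dense_scores)
    docs = set(sparse_n) | set(dense_n)
    fused = {
        d: alpha * sparse_n.get(d, 0.0) + (1 - alpha) * dense_n.get(d, 0.0)
        for d in docs
    }
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Made-up scores for illustration only (e.g. BM25-style and cosine-style).
sparse = {"doc_a": 12.0, "doc_b": 7.0, "doc_c": 2.0}
dense = {"doc_b": 0.91, "doc_c": 0.88, "doc_d": 0.40}
ranking = hybrid_rank(sparse, dense, alpha=0.5)
```

Here a document that scores reasonably well under both signals (`doc_b`) ranks above one that only the sparse retriever found (`doc_a`). In practice the weight would need tuning on held-out queries.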
INSIGHT

Limitations of SPLADE

  • SPLADE relies on subword tokenization, so its sparse vectors have a fixed size set by the tokenizer's vocabulary.
  • This approach limits its ability to handle rare terms and long documents effectively.
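
The toy sketch below illustrates why a fixed subword vocabulary constrains the representation. It is not SPLADE: SPLADE learns term weights with a neural model, while this example uses raw counts, a tiny hypothetical vocabulary, and a simplified greedy longest-match splitter. It only shows that the vector length stays the same regardless of document length, and that a rare word gets broken into pieces.

```python
# Toy, hypothetical subword vocabulary; real models use tens of thousands of entries.
VOCAB = ["vector", "search", "bo", "##yt", "##sov", "##s", "[UNK]"]
INDEX = {tok: i for i, tok in enumerate(VOCAB)}


def wordpiece(word):
    """Greedy longest-match-first split of one word into known subword pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            # Continuation pieces carry the "##" prefix.
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            if piece in INDEX:
                break
            end -= 1
        if end == start:  # no piece matched: the whole word becomes unknown
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces


def sparse_vector(text):
    """Count subword pieces into a vector with exactly len(VOCAB) dimensions."""
    vec = [0] * len(VOCAB)
    for word in text.lower().split():
        for piece in wordpiece(word):
            vec[INDEX[piece]] += 1
    return vec


short_vec = sparse_vector("vector search")
long_vec = sparse_vector("Boytsov vector search " * 50)
rare_pieces = wordpiece("boytsov")  # split into "bo", "##yt", "##sov"
```

Both `short_vec` and `long_vec` have `len(VOCAB)` entries, so a much longer document has no extra dimensions to use, and the rare surname is only represented through its fragments rather than as a term of its own.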