Debunking myths of vector search and LLMs with Leo Boytsov

Jan 17, 2025

In this intriguing conversation, Leo Boytsov, a Senior Research Scientist at AWS AI Labs and expert in vector search algorithms, shares enlightening insights from the cutting edge of search technology. He discusses the evolution of retrieval algorithms, challenges with large document handling, and how non-metric spaces can enhance similarity representation. Leo also reveals the potential of combining traditional and modern search methodologies, and the serendipitous discoveries shaping new industries in AI. A must-listen for tech enthusiasts!

ANECDOTE

Leo's Career Journey

Leo Boytsov's career began in finance but shifted to his passion, retrieval algorithms.
He worked at Yandex and PubMed, obtained a PhD at CMU, and now does research at AWS, focusing on question answering.

INSIGHT

Sparse vs. Dense Vectors

Combining sparse and dense vector representations can improve retrieval quality.
However, dense vectors have limitations with long documents and diverse vocabularies.
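
One common way to combine the two signals is to normalize each retriever's scores and interpolate them. The sketch below is a minimal illustration under that assumption, not necessarily the method discussed in the episode; the function names, the min-max normalization, and the `alpha` weight are all hypothetical choices.

```python
def min_max(scores):
    """Rescale scores to [0, 1] so sparse and dense scores are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def hybrid_rank(sparse_scores, dense_scores, alpha=0.5):
    """Rank documents by a weighted sum of normalized sparse and dense scores.

    sparse_scores[i] and dense_scores[i] are the scores of document i from a
    sparse retriever (e.g. BM25) and a dense retriever (e.g. cosine similarity
    of embeddings). alpha=1.0 uses only sparse scores, alpha=0.0 only dense.
    Returns document indices, best first.
    """
    sparse_norm = min_max(sparse_scores)
    dense_norm = min_max(dense_scores)
    fused = [alpha * s + (1 - alpha) * d for s, d in zip(sparse_norm, dense_norm)]
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)


# Made-up scores for three documents.
sparse = [2.0, 0.0, 1.0]   # document 0 has the best keyword match
dense = [0.1, 0.9, 0.8]    # document 1 has the best embedding match
ranking = hybrid_rank(sparse, dense, alpha=0.5)
```

With these made-up scores, document 2 is not the top result for either retriever alone, but it ranks first once the signals are fused because it scores reasonably well on both.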

INSIGHT

Limitations of SPLADE

SPLADE uses subword tokenization, creating a fixed-size vector representation.
This approach limits its ability to handle rare terms and long documents effectively.
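
To make the fixed-size point concrete, here is a toy sketch of a vector indexed by a subword vocabulary. It is not SPLADE itself: the tiny vocabulary, the greedy longest-match tokenizer, and the raw counts (standing in for learned term weights) are assumptions made purely for illustration.

```python
# Hypothetical toy vocabulary; a real subword vocabulary has tens of thousands of entries.
TOY_VOCAB = ["[UNK]", "search", "vector", "bio", "##tin", "##yl", "##ation"]


def wordpiece(word, vocab):
    """Greedy longest-match subword split; non-initial pieces carry a '##' prefix."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        while end > start:
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            # No piece fits: the whole word collapses to the unknown token.
            return ["[UNK]"]
        pieces.append(match)
        start = end
    return pieces


def to_vocab_vector(text, vocab):
    """Map text to a vector with one dimension per vocabulary entry."""
    index = {piece: i for i, piece in enumerate(vocab)}
    vec = [0.0] * len(vocab)
    for word in text.lower().split():
        for piece in wordpiece(word, vocab):
            vec[index[piece]] += 1.0
    return vec


rare_pieces = wordpiece("biotinylation", TOY_VOCAB)
short_vec = to_vocab_vector("vector search", TOY_VOCAB)
long_vec = to_vocab_vector("vector search " * 1000, TOY_VOCAB)
```

In this toy setup the rare word "biotinylation" has no dimension of its own and is spread across four generic pieces, and `short_vec` and `long_vec` have the same length no matter how long the input text is.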