The GeekNarrator

How do vector (search) databases work? ft: turbopuffer

Apr 7, 2025
Simon Eskildsen, Co-founder of TurboPuffer and former infrastructure builder at Shopify, dives into the fascinating world of vector databases. He discusses the transformative role of vector search in enhancing recommendation systems, alongside challenges like cost and scaling. Simon also shares insights on managing podcast episode archives using embeddings and indexing strategies. The conversation highlights the importance of observability in database performance and paints an exciting picture of future trends in vector search technology.
ANECDOTE

Simon’s Vector Search Backstory

  • Simon Eskildsen shares his experience building vector search for Readwise and the high costs that held back product adoption.
  • He ties this to his Shopify infrastructure background and the motivation behind creating TurboPuffer to reduce cost and complexity.
INSIGHT

Vectors Capture Semantic Relationships

  • Vectors or embeddings are points in high-dimensional space where semantically related items cluster together.
  • Embedding models, often built on the same architectures as LLMs, transform data into embeddings that capture semantic meaning beyond the raw text.
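To make the clustering idea concrete, here is a minimal sketch using cosine similarity, one common way to measure how close two embeddings are. The three-dimensional vectors and the `nearest` helper are hypothetical, hand-made for illustration; real embeddings come from a model and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, chosen by hand so that the two
# animal words point in a similar direction and "invoice" does not.
EMBEDDINGS = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "invoice": [0.1, 0.2, 0.95],
}

def nearest(word):
    """Return the other word whose embedding is most similar to `word`."""
    query = EMBEDDINGS[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in EMBEDDINGS.items()
        if other != word
    ]
    return max(scored, key=lambda pair: pair[1])[0]

print(nearest("cat"))  # kitten
```

With these made-up values, "cat" and "kitten" score close to 1.0 against each other, while either one against "invoice" scores far lower, which is the sense in which related items sit near each other in the space.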
ADVICE

Chunk Data and Use Clustering

  • Choose chunk size wisely when creating embeddings: smaller chunks give more precise matches but produce more vectors to store and search.
  • Use approximate nearest neighbor (ANN) indexes and clustering techniques to improve query performance at scale.
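Both pieces of advice can be sketched in a few lines. The example below is an illustration under stated assumptions, not a description of how turbopuffer or any particular database implements them. `chunk_words` is a hypothetical word-based chunker with optional overlap, and `build_index` / `ann_search` show a simple cluster-based ANN scheme: vectors are grouped under their nearest centroid, and a query scans only the `n_probe` closest clusters instead of every vector. The centroids are hard-coded here; in practice they would usually be learned, for example with k-means.

```python
def chunk_words(text, chunk_size, overlap=0):
    """Split text into chunks of `chunk_size` words, sharing `overlap` words."""
    words = text.split()
    step = chunk_size - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than chunk_size")
    if not words:
        return []
    # Stop early enough that the final chunk is not made only of overlap words.
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, last_start, step)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_index(vectors, centroids):
    """Assign each vector id to the cluster of its nearest centroid."""
    clusters = {c: [] for c in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: sq_dist(vec, centroids[c]))
        clusters[nearest].append(vid)
    return clusters

def ann_search(query, vectors, centroids, clusters, n_probe=1, k=1):
    """Approximate top-k: scan only the `n_probe` clusters closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [vid for c in probe[:n_probe] for vid in clusters[c]]
    return sorted(candidates, key=lambda vid: sq_dist(query, vectors[vid]))[:k]

# Hypothetical 2-D data with two obvious groups.
VECTORS = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
CENTROIDS = [[0.33, 0.33], [10.33, 10.33]]  # assumed; normally learned
CLUSTERS = build_index(VECTORS, CENTROIDS)

print(chunk_words("a b c d e f g h i j", chunk_size=4, overlap=1))
print(ann_search([9.5, 10.2], VECTORS, CENTROIDS, CLUSTERS, n_probe=1, k=2))
```

The search is approximate because a true nearest neighbor that lies just across a cluster boundary is missed when `n_probe` is too small. Raising `n_probe` trades query speed for recall.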