

How do vector (search) databases work? ft: turbopuffer
Apr 7, 2025
Simon Eskildsen, co-founder of TurboPuffer, who previously built infrastructure at Shopify, explains how vector databases work. He discusses how vector search improves recommendation systems, along with challenges such as cost and scaling. Simon also describes how embeddings and indexing strategies can be used to manage podcast episode archives. The conversation covers the role of observability in database performance and looks ahead to future trends in vector search.
AI Snips
Simon’s Vector Search Backstory
- Simon Eskildsen shares his experience building vector search for Readwise and the high costs that held back product adoption.
- He ties this to his Shopify infrastructure background and the motivation behind creating TurboPuffer to reduce cost and complexity.
Vectors Capture Semantic Relationships
- Vectors or embeddings are points in high-dimensional space where semantically related items cluster together.
- Embedding models, often built on LLMs, transform data into vectors that capture semantic meaning beyond raw text.
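To make "semantically related items cluster together" concrete, here is a minimal sketch of comparing embeddings by cosine similarity. The three-dimensional vectors are hand-made for illustration and are not output from any real embedding model (real embeddings typically have hundreds or thousands of dimensions); the names `cosine_similarity`, `nearest`, and `toy_embeddings` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made toy vectors (an assumption for illustration only).
# Related words were deliberately given similar coordinates.
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.9, 0.15],
    "database": [0.1, 0.2, 0.95],
}


def nearest(query, embeddings):
    """Return the key (other than query) whose vector is most similar to query's."""
    q = embeddings[query]
    others = (key for key in embeddings if key != query)
    return max(others, key=lambda key: cosine_similarity(q, embeddings[key]))
```

With these toy values, `nearest("cat", toy_embeddings)` picks "kitten" over "database", which is the behavior a vector search relies on: closeness in the space stands in for closeness in meaning.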
Chunk Data and Use Clustering
- Choose chunk size deliberately when creating embeddings: smaller chunks make search results more precise but increase the number of vectors to store and index.
- Use approximate nearest neighbor (ANN) indexes and clustering techniques to improve query performance at scale.
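The two tips above can be sketched together. The code below is an illustrative sketch only, not how TurboPuffer or any particular database implements it: it assumes word-based chunking with overlap, and a simple cluster-based (inverted-file style) index built with a small k-means. All function names, the `n_probe` parameter, and the toy 2-D vectors are assumptions made for the example.

```python
def chunk_words(text, chunk_size, overlap=0):
    """Split text into chunks of chunk_size words; neighbors share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reached the end of the text
    return chunks


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest(point, centroids):
    """Index of the centroid nearest to point."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))


def kmeans(vectors, k, iterations=10):
    """Tiny k-means; initialized from the first k vectors so results are deterministic."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[closest(v, centroids)].append(v)
        for i, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(dim) / len(group) for dim in zip(*group)]
    return centroids


def build_index(vectors, k):
    """Group vector ids by nearest centroid: one posting list per cluster."""
    centroids = kmeans(vectors, k)
    lists = [[] for _ in range(k)]
    for idx, v in enumerate(vectors):
        lists[closest(v, centroids)].append(idx)
    return centroids, lists


def search(query, vectors, index, top_k=1, n_probe=1):
    """Approximate search: scan only the n_probe clusters nearest to the query."""
    centroids, lists = index
    by_distance = sorted(range(len(centroids)),
                         key=lambda i: squared_distance(query, centroids[i]))
    candidates = [idx for c in by_distance[:n_probe] for idx in lists[c]]
    candidates.sort(key=lambda idx: squared_distance(query, vectors[idx]))
    return candidates[:top_k]


# Toy 2-D vectors standing in for chunk embeddings (made up for illustration).
toy_vectors = [[0, 0], [10, 10], [0, 1], [1, 0], [10, 11], [11, 10]]
toy_index = build_index(toy_vectors, k=2)
```

The trade-off sits in `n_probe`: probing fewer clusters means fewer distance computations per query, but a true nearest neighbor that landed in an unprobed cluster will be missed, which is why the result is approximate. Setting `n_probe` to the number of clusters falls back to an exact scan.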