
How AI Is Built #030 Vector Search at Scale, Why One Size Doesn't Fit All
Nov 7, 2024

Join Charles Xie, founder and CEO of Zilliz and a pioneer behind the Milvus vector database, as he unpacks the complexities of scaling vector search systems. He discusses why vector search slows down at scale and introduces a multi-tier storage strategy that optimizes performance. Charles covers solutions such as real-time search buffers and GPU acceleration for handling massive query loads efficiently, and looks ahead to the future of search technology, including self-learning indices and hybrid search methods that promise to improve data retrieval.
AI Snips
Vector Databases Reach Internet Scale
- Milvus has supported deployments at up to ~100 billion vectors, approaching internet scale.
- Embedding dimensions have grown from ~512 to 1,000–2,000 depending on application.
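A quick back-of-envelope calculation shows why these two numbers together force a multi-tier storage design. The parameters below are illustrative assumptions (the episode does not state a precise configuration), using a dimension within the 1,000–2,000 range mentioned:

```python
# Back-of-envelope memory for raw float32 vectors at the scale mentioned.
# All parameters are assumed for illustration, not quoted from the episode.
n_vectors = 100_000_000_000   # ~100 billion vectors
dim = 1536                    # within the 1,000-2,000 range mentioned
bytes_per_float = 4           # float32

total_bytes = n_vectors * dim * bytes_per_float
print(f"{total_bytes / 1e12:.0f} TB of raw vector data")  # prints "614 TB of raw vector data"
```

At hundreds of terabytes for the raw vectors alone, before any index overhead, keeping everything in RAM on one node is out of the question, which is what motivates tiering across memory, local disk, and object storage.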
Distributed Design Beats App-Layer Sharding
- Scaling vector DBs requires designing distributed systems from the ground up, not just sharding at the application layer.
- Large embeddings increase network and consistency complexity compared to relational rows.
Use A Buffer For Real-Time Fresh Data
- Put new vectors into a write buffer and serve brute-force searches from it for real-time visibility.
- Trigger background index builds when the buffer reaches thresholds and merge results with main indexes.
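The buffer strategy above can be sketched in a few lines. This is a minimal illustration, not Milvus's implementation: the class name, the threshold parameter, and the flat NumPy matrix standing in for a real ANN index are all assumptions, and the flush happens synchronously rather than as a background build:

```python
import heapq
import numpy as np

class WriteBufferedIndex:
    """Hypothetical sketch of a write buffer in front of a main index.

    New vectors land in a small buffer that is searched by brute force
    (real-time visibility); once the buffer reaches a threshold it is
    flushed into the "main" index, here just a flat matrix stand-in.
    """

    def __init__(self, dim, flush_threshold=1000):
        self.flush_threshold = flush_threshold
        self.buffer = []  # (id, vector) pairs not yet indexed
        self.main_ids = []
        self.main_vecs = np.empty((0, dim), dtype=np.float32)

    def insert(self, vec_id, vec):
        self.buffer.append((vec_id, np.asarray(vec, dtype=np.float32)))
        if len(self.buffer) >= self.flush_threshold:
            self._flush()

    def _flush(self):
        # A real system would trigger a background index build here;
        # we synchronously append to the flat "main" index instead.
        ids, vecs = zip(*self.buffer)
        self.main_ids.extend(ids)
        self.main_vecs = np.vstack([self.main_vecs, np.stack(vecs)])
        self.buffer.clear()

    def search(self, query, k=5):
        query = np.asarray(query, dtype=np.float32)
        candidates = []
        # 1) Brute-force scan of the fresh-data buffer.
        for vec_id, vec in self.buffer:
            candidates.append((float(np.linalg.norm(vec - query)), vec_id))
        # 2) Search of the main index (stand-in: a full scan).
        if self.main_ids:
            dists = np.linalg.norm(self.main_vecs - query, axis=1)
            candidates.extend(zip(dists.tolist(), self.main_ids))
        # 3) Merge both result sets and keep the k nearest ids.
        return [vec_id for _, vec_id in heapq.nsmallest(k, candidates)]
```

The key property is in `search`: results from the unindexed buffer and the main index are merged before top-k selection, so a vector is visible to queries the moment `insert` returns, without waiting for an index build.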
