Open source, on-disk vector search with LanceDB (Practical AI #250)
Dec 19, 2023
Chang She, Co-founder and CEO of LanceDB, discusses their open source, on-disk, embedded vector search offering. They talk about the unique columnar database structure that enables serverless deployments and drastic savings without performance hits at scale. They also explore potential applications of AI in autonomous vehicles and edge computing, as well as other exciting developments in the practical AI space.
LanceDB's unique columnar database structure enables serverless deployments and significant cost savings at scale.
LanceDB's separation of compute and storage, facilitated by its columnar format and disk-based vector indices, allows for efficient scaling and fast query performance.
Deep dives
LanceDB: An Overview and Evolution
LanceDB started as a company focused on serving computer vision projects, aiming to build better data infrastructure for managing unstructured data. The motivation came from the complexity and challenges the founders experienced when working with multimodal data for AI. They developed a columnar storage format, Lance, to manage tabular and unstructured data together more effectively. LanceDB later added a vector index, originally for deduplication and for finding relevant training samples, which led to its recognition as a vector database. It offers ease of use, hyperscalability, cost-effectiveness, and the ability to manage metadata, raw assets, and vectors together. With applications in generative AI, e-commerce, search engines, and computer vision, LanceDB is well positioned to handle large datasets across a range of use cases.
The Power of Separating Compute and Storage
LanceDB's separation of compute and storage is a key factor in its performance and scalability. Because the data and the vector indices live on disk rather than in memory, LanceDB can scale efficiently and distribute processing across machines. It offers GPU acceleration for indexing and lets users work with large datasets on commodity hardware. Its stateless kernels and simplified architecture reduce the complexity of the overall stack: nodes can be scaled and maintained without coordination or leader election. This design is made possible by LanceDB's columnar format and disk-based vector indices, which together enable fast random access and query performance.
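To make the separation of compute and storage concrete, here is a minimal sketch in plain Python (not LanceDB's actual implementation) of an IVF-style disk-based index: the index is a single file on shared storage, and a stateless "compute" process opens it, probes only the nearest partition, and answers the query without holding any in-memory state between requests.

```python
# Illustrative sketch only: a toy disk-based vector index with stateless readers.
import json
import math
import os
import tempfile

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_index(vectors, num_partitions=2):
    """Assign each vector to the nearest 'centroid' partition (crude seeding)."""
    centroids = vectors[:num_partitions]
    partitions = {i: [] for i in range(num_partitions)}
    for vid, v in enumerate(vectors):
        best = min(range(num_partitions), key=lambda i: dist(v, centroids[i]))
        partitions[best].append((vid, v))
    return {"centroids": centroids, "partitions": partitions}

def search(path, query, k=1):
    """Stateless query: open the on-disk index, probe the nearest partition only."""
    with open(path) as f:
        index = json.load(f)
    centroids = index["centroids"]
    best = min(range(len(centroids)), key=lambda i: dist(query, centroids[i]))
    candidates = index["partitions"][str(best)]  # JSON keys are strings
    candidates.sort(key=lambda item: dist(query, item[1]))
    return [vid for vid, _ in candidates[:k]]

# "Storage" side: build the index once and write it to shared storage.
vectors = [[0.0, 0.0], [10.0, 10.0], [0.1, 0.2], [9.8, 10.1]]
path = os.path.join(tempfile.mkdtemp(), "index.json")
with open(path, "w") as f:
    json.dump(build_index(vectors), f)

# "Compute" side: any process can answer queries from the file alone.
print(search(path, [0.0, 0.05]))  # → [0]
```

Because all state lives in the file, adding query capacity just means running more readers against the same storage; there is nothing to coordinate and no leader to elect.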
Embedded Database for Seamless Integration
LanceDB stands out with its embedded approach, allowing seamless integration into various workflows and programming languages. Alongside Python, LanceDB supports JavaScript and Rust, and Rust is where the core of the data format and the embedded database lives. Because the database is embedded, there is no client-server setup to manage, and storage can live locally or on object stores like S3. Whether installed through a package manager or deployed on an edge device, LanceDB keeps integration simple. Looking ahead, LanceDB plans to deepen its integration with tools like DuckDB and Polars, aiming for an experience where the vector database becomes transparent and users can focus on familiar tools.
Exciting Opportunities in Practical AI
In the next six to twelve months, LanceDB foresees exciting developments in information retrieval tools, particularly in personalized and domain-specific applications. These applications could provide value and democratize deep expertise in domains like customer success management, documentation, and compliance. Longer-term prospects involve generalized low-code and no-code tools that leverage generative AI for code generation and creative interfaces. Additionally, the potential integration of generative AI with robotics, drones, and edge devices presents fascinating possibilities in autonomous systems. LanceDB is enthusiastic about these advancements and aims to continue delivering great tools to enhance users' experiences in the coming year.
Prashanth Rao mentioned LanceDB as a standout amongst the many vector DB options in episode #234. Now, Chang She (co-founder and CEO of LanceDB) joins us to talk through the specifics of their open source, on-disk, embedded vector search offering. We talk about how their unique columnar database structure enables serverless deployments and drastic savings (without performance hits) at scale. This one is super practical, so don’t miss it!
Changelog++ members save 1 minute on this episode because they made the ads disappear. Join today!
Sponsors:
Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com
Fly.io – The home of Changelog.com — Deploy your apps and databases close to your users. In minutes you can run your Ruby, Go, Node, Deno, Python, or Elixir app (and databases!) all over the world. No ops required. Learn more at fly.io/changelog and check out the speedrun in their docs.
Typesense – Lightning fast, globally distributed Search-as-a-Service that runs in memory. You literally can’t get any faster!