
Vector Podcast: Economical way of serving vector search workloads with Simon Eskildsen, CEO of Turbopuffer
Sep 19, 2025. Simon Eskildsen, CEO and co-founder of Turbopuffer and a former infrastructure engineer at Shopify, explains how Turbopuffer serves vector search workloads cost-effectively through its S3-native storage design. He discusses latency optimization, the architectural advantages that follow from building directly on object storage, the challenges of multi-tenancy, the importance of recall monitoring, and how LLM tools are changing coding practices.
Episode notes
Prototype Built From A Cost Problem
- Simon built the first Turbopuffer prototype alone after a Readwise recommendation project showed that cloud memory costs were untenable.
- Cursor later became an early customer and helped define features, such as mutable indexes, that made the product production-ready.
New Workload Needs New Storage
- Two ingredients make a new database: a new workload and a new storage architecture that changes economics.
- Object storage plus selective hot caching can make vector workloads orders of magnitude cheaper than DRAM-only designs.
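The economics claim above can be sketched with back-of-envelope arithmetic. All prices and the hot-data fraction below are illustrative assumptions, not figures from the episode:

```python
# Back-of-envelope cost comparison: hosting 100M 768-dim float32
# vectors entirely in DRAM vs. object storage plus a small hot
# NVMe cache. All $/GB-month prices are assumed for illustration.

DIMS = 768
N_VECTORS = 100_000_000
BYTES_PER_VECTOR = DIMS * 4                        # float32
dataset_gb = N_VECTORS * BYTES_PER_VECTOR / 1e9    # ~307 GB

DRAM_PER_GB_MONTH = 10.00   # assumed managed in-memory store price
S3_PER_GB_MONTH = 0.023     # assumed object storage price
SSD_PER_GB_MONTH = 0.08     # assumed NVMe cache-tier price
HOT_FRACTION = 0.10         # assume 10% of the data is hot

dram_cost = dataset_gb * DRAM_PER_GB_MONTH
tiered_cost = (dataset_gb * S3_PER_GB_MONTH
               + dataset_gb * HOT_FRACTION * SSD_PER_GB_MONTH)

print(f"dataset:    {dataset_gb:.0f} GB")
print(f"DRAM-only:  ${dram_cost:,.0f}/month")
print(f"S3 + cache: ${tiered_cost:,.0f}/month")
print(f"ratio:      {dram_cost / tiered_cost:.0f}x")
```

Under these assumptions the tiered design comes out a few hundred times cheaper, which is the "orders of magnitude" shift in economics the snip describes.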
Minimize Round Trips To Storage
- Design the system to minimize round trips to object storage and saturate network/disk bandwidth per request.
- Aim for a few round trips (three to four) so cold reads stay practical and warm reads match in-memory latencies.
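The round-trip budget above can be made concrete with a toy sketch. `ObjectStore`, its methods, and the key layout are invented for illustration, not Turbopuffer's actual API; the point is that a cold query issues a small, fixed number of sequential batches, with each batch's GETs running in parallel:

```python
# Hypothetical sketch: bounding a cold query to ~3 round trips to
# object storage by batching parallel GETs at each stage.

from concurrent.futures import ThreadPoolExecutor

class ObjectStore:
    """Toy stand-in for S3; one get_many() call models one round
    trip of latency, since its GETs are issued concurrently."""
    def __init__(self, blobs):
        self.blobs = blobs
        self.round_trips = 0

    def get_many(self, keys):
        self.round_trips += 1
        with ThreadPoolExecutor() as pool:
            return list(pool.map(self.blobs.__getitem__, keys))

def cold_query(store):
    meta = store.get_many(["meta"])[0]          # trip 1: index metadata
    parts = store.get_many(meta["partitions"])  # trip 2: candidate partitions
    keys = [k for p in parts for k in p["blocks"]]
    return store.get_many(keys)                 # trip 3: vector blocks

blobs = {
    "meta": {"partitions": ["p0", "p1"]},
    "p0": {"blocks": ["b0"]},
    "p1": {"blocks": ["b1"]},
    "b0": ["vec-a", "vec-b"],
    "b1": ["vec-c"],
}
store = ObjectStore(blobs)
blocks = cold_query(store)
print(store.round_trips)   # 3 round trips for a fully cold read
```

Because each stage fans out to many keys at once, the query saturates network bandwidth within a trip rather than paying per-object latency serially, which is what keeps cold reads practical.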
