Vector Podcast

Economical ways of serving vector search workloads, with Simon Eskildsen, CEO of Turbopuffer

Sep 19, 2025
Simon Eskildsen, CEO and co-founder of Turbopuffer, has a rich background as an infrastructure engineer at Shopify. In this discussion, he reveals how Turbopuffer offers cost-effective solutions for vector search workloads, highlighting their innovative use of S3-native storage. Simon dives into the importance of latency optimization and the architectural advantages that result from their unique design. He also shares insights on the challenges of multi-tenancy, the significance of recall monitoring, and how LLM tools impact coding practices.
ANECDOTE

Prototype Built From A Cost Problem

  • Simon built the first Turbopuffer prototype alone, after a Readwise recommendation project showed that keeping the workload in cloud memory was untenably expensive.
  • Cursor later became an early customer and helped define features, such as mutable indexes, that made the product production-ready.
INSIGHT

New Workload Needs New Storage

  • Two ingredients make a new database: a new workload and a new storage architecture that changes economics.
  • Object storage plus selective hot caching can make vector workloads orders of magnitude cheaper than DRAM-only designs.
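The economics claim above can be sketched with back-of-envelope arithmetic. The numbers below are illustrative assumptions, not figures from the episode: S3 Standard storage lists at roughly $0.023/GB-month, while DRAM amortized through cloud instance pricing often lands in the low dollars per GB-month.

```python
# Rough cost sketch: keeping a vector dataset resident in DRAM vs. on
# object storage. Prices are assumed approximations in USD per GB-month.

def monthly_storage_cost(dataset_gb: float, price_per_gb_month: float) -> float:
    """Monthly cost of keeping dataset_gb resident at the given unit price."""
    return dataset_gb * price_per_gb_month

DATASET_GB = 1_000  # assumed workload: 1 TB of vectors

dram_cost = monthly_storage_cost(DATASET_GB, 3.00)   # assumed cloud DRAM price
s3_cost = monthly_storage_cost(DATASET_GB, 0.023)    # approximate S3 Standard price

print(f"DRAM: ${dram_cost:,.0f}/mo, S3: ${s3_cost:,.0f}/mo, "
      f"ratio: {dram_cost / s3_cost:.0f}x")
```

Even with a hot cache layered on top for frequently queried tenants, the bulk of the data sits at the object-storage price point, which is where the "orders of magnitude" savings come from.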
ADVICE

Minimize Round Trips To Storage

  • Design the system to minimize round trips to object storage and saturate network/disk bandwidth per request.
  • Aim for a few round trips (three to four) so cold reads stay practical and warm reads match in-memory latencies.
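Why the round-trip count matters can be shown with a simple latency budget. This is a hedged sketch: the ~80 ms per-request figure is an assumed typical object-storage first-byte latency, not a number from the episode.

```python
# Back-of-envelope cold-read latency: when round trips to object storage
# are sequential (each depends on the previous result), total latency
# grows linearly with the number of trips.

S3_ROUND_TRIP_MS = 80  # assumed per-GET latency to object storage

def cold_read_latency_ms(round_trips: int,
                         per_trip_ms: float = S3_ROUND_TRIP_MS) -> float:
    """Estimated cold-read latency for a chain of sequential round trips."""
    return round_trips * per_trip_ms

for trips in (3, 4, 10):
    print(f"{trips} round trips -> ~{cold_read_latency_ms(trips):.0f} ms cold read")
```

At three to four trips a cold read stays in the low hundreds of milliseconds, which is tolerable for a first query; at ten trips it approaches a second, which is not.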