Talk Python To Me

#534: diskcache: Your secret Python perf weapon

Jan 13, 2026
Vincent Warmerdam, a skilled data scientist and developer at Marimo, delves into the world of caching with DiskCache. He discusses why caching matters in ML workflows, explaining how it saves both time and money. The conversation compares traditional systems like Redis with SQLite's use as a backing store for persistent caches. Vincent shares insights on advanced features like sharding, memoization, and custom serialization, and offers practical advice on designing effective cache keys. Tune in for practical tips and a look at the future of caching in Python.
INSIGHT

Persisted Caches Save Costly Recompute

  • Caching prevents repeated expensive computations like LLM calls or heavy SQL queries by reusing prior results.
  • DiskCache persists cached objects to disk, so the cache survives process restarts and can be shared across processes; a sketch follows below.
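
A minimal sketch of that pattern using diskcache's memoize decorator; the cache directory and the fake_llm_call stand-in are illustrative, not from the episode:

```python
from diskcache import Cache

# Opens (or creates) a SQLite-backed cache in ./llm_cache;
# the directory persists across process restarts.
cache = Cache("./llm_cache")

@cache.memoize(expire=24 * 3600)  # reuse results for 24 hours
def fake_llm_call(prompt: str) -> str:
    # Stand-in for an expensive LLM call or heavy SQL query.
    print(f"computing answer for: {prompt!r}")
    return prompt.upper()

if __name__ == "__main__":
    print(fake_llm_call("hello"))  # computes and stores the result
    print(fake_llm_call("hello"))  # served from the disk cache, body not run
```

Because the results live in a SQLite file on disk, rerunning the script also hits the cache: the expensive call happens only once per unique argument, not once per process.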
ADVICE

Cache Derived Results, Not Just DB Rows

  • Put derived or expensive-to-compute artifacts (like parsed YouTube IDs or HTML fragments) into DiskCache to avoid repeating work.
  • Store the cache directory on a shared persistent volume so container rebuilds and redeployments keep the cache, as sketched below.
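
A sketch of that advice, assuming a persistent volume mounted at /data/cache; the path, URL pattern, and helper name are illustrative:

```python
import re
from diskcache import Cache

# Assumed mount point for a shared persistent volume; adjust to your deploy.
cache = Cache("/data/cache")

def youtube_id(url: str) -> str | None:
    """Derive the 11-character video ID from a watch URL, caching the result."""
    if url in cache:          # hit: skip the parse entirely
        return cache[url]
    match = re.search(r"v=([A-Za-z0-9_-]{11})", url)
    video_id = match.group(1) if match else None
    cache[url] = video_id     # store the derived artifact, not the raw page
    return video_id

print(youtube_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))
```

Since the cache lives on the mounted volume rather than inside the container image, a rebuild or redeploy starts with the derived artifacts already in place.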
INSIGHT

Dictionary Semantics With SQLite Backing

  • DiskCache behaves like a dictionary backed by SQLite and pickles arbitrary Python objects for persistence.
  • It stores native types (integers, floats, strings, bytes) directly to avoid unnecessary pickle overhead; see the example below.
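
Roughly how that dict-like surface looks in practice (a minimal sketch; the directory name is arbitrary):

```python
from diskcache import Cache

cache = Cache("./demo_cache")

# Dictionary semantics, backed by a SQLite file on disk.
cache["answer"] = 42                 # small native values are stored directly
cache["config"] = {"retries": 3}     # arbitrary objects are pickled

print(cache["answer"])               # 42
print("config" in cache)             # True
del cache["config"]
print(cache.get("config", "gone"))   # default when the key is missing

cache.close()
```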