Talk Python To Me

#534: diskcache: Your secret Python perf weapon

Jan 13, 2026
Vincent Warmerdam, a skilled data scientist and developer at Marimo, delves into the world of caching with DiskCache. He discusses why caching matters in ML workflows, explaining how it saves both time and money. The conversation compares traditional systems like Redis with SQLite's use as a backing store for persistent caches. Vincent shares insights on advanced features like sharding, memoization, and custom serialization, and offers practical advice on designing effective cache keys. Tune in for practical tips and a look at the future of caching in Python.
INSIGHT

Persisted Caches Save Costly Recompute

  • Caching prevents repeated expensive computations like LLM calls or heavy SQL queries by reusing prior results.
  • DiskCache persists cached objects to disk, so the cache survives process restarts and can be shared across processes; a sketch follows below.
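
A minimal sketch of that pattern using diskcache's memoize decorator; the cache directory and the fake_llm_call stand-in are illustrative, not from the episode:

```python
from diskcache import Cache

# Opens (or creates) a SQLite-backed cache in ./llm_cache;
# the directory persists across process restarts.
cache = Cache("./llm_cache")

@cache.memoize(expire=24 * 3600)  # reuse results for 24 hours
def fake_llm_call(prompt: str) -> str:
    # Stand-in for an expensive LLM call or heavy SQL query.
    print(f"computing answer for: {prompt!r}")
    return prompt.upper()

if __name__ == "__main__":
    print(fake_llm_call("hello"))  # computes and stores the result
    print(fake_llm_call("hello"))  # served from the disk cache, body not run
```

Because the results live in a SQLite file on disk, rerunning the script also hits the cache: the expensive call happens only once per unique argument, not once per process.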
ADVICE

Cache Derived Results, Not Just DB Rows

  • Put derived or expensive-to-compute artifacts (like parsed YouTube IDs or HTML fragments) into DiskCache to avoid repeating work.
  • Store the cache directory on a shared persistent volume so container rebuilds and redeployments keep the cache, as sketched below.
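
A sketch of that advice, assuming a persistent volume mounted at /data/cache; the path, URL pattern, and helper name are illustrative:

```python
import re
from diskcache import Cache

# Assumed mount point for a shared persistent volume; adjust to your deploy.
cache = Cache("/data/cache")

def youtube_id(url: str) -> str | None:
    """Derive the 11-character video ID from a watch URL, caching the result."""
    if url in cache:          # hit: skip the parse entirely
        return cache[url]
    match = re.search(r"v=([A-Za-z0-9_-]{11})", url)
    video_id = match.group(1) if match else None
    cache[url] = video_id     # store the derived artifact, not the raw page
    return video_id

print(youtube_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ"))
```

Since the cache lives on the mounted volume rather than inside the container image, a rebuild or redeploy starts with the derived artifacts already in place.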
INSIGHT

Dictionary Semantics With SQLite Backing

  • DiskCache behaves like a dictionary backed by SQLite and pickles arbitrary Python objects for persistence.
  • It stores native types (integers, floats, strings, bytes) directly to avoid unnecessary pickle overhead; see the example below.
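
Roughly how that dict-like surface looks in practice (a minimal sketch; the directory name is arbitrary):

```python
from diskcache import Cache

cache = Cache("./demo_cache")

# Dictionary semantics, backed by a SQLite file on disk.
cache["answer"] = 42                 # small native values are stored directly
cache["config"] = {"retries": 3}     # arbitrary objects are pickled

print(cache["answer"])               # 42
print("config" in cache)             # True
del cache["config"]
print(cache.get("config", "gone"))   # default when the key is missing

cache.close()
```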