
Talk Python To Me #528: Python apps with LLM building blocks
Nov 30, 2025
Vincent Warmerdam, a Python developer and educator behind CalmCode and Marimo, dives into the practical integration of LLMs into Python apps. He emphasizes treating LLMs as just another API, with specific boundaries and focused monitoring to enhance reliability. Topics include efficient caching of LLM responses, the benefits of structured outputs with Pydantic, and the advantages of Marimo over Jupyter. Vincent also explores ways to improve productivity through ergonomic workflows and local model experimentation.
AI Snips
LLMs Are Unstable Building Blocks
- Treat LLMs as unpredictable building blocks that need defensive boundaries and monitoring.
- Wrap them with clear interfaces and testable evaluation to avoid unexpected behavior in apps (a minimal sketch follows).
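As a concrete illustration of that boundary, here is a minimal Python sketch (assuming Pydantic v2): the rest of the app only ever sees a validated model, never raw LLM text. `TicketTriage`, the `complete` stand-in for a real client call, and `triage_ticket` are hypothetical names, not anything specific from the episode.

```python
from pydantic import BaseModel, ValidationError


class TicketTriage(BaseModel):
    category: str
    urgent: bool


def complete(prompt: str) -> str:
    """Stand-in for a real LLM client call returning the model's raw text."""
    raise NotImplementedError


def triage_ticket(text: str) -> TicketTriage | None:
    prompt = (
        "Classify this support ticket. Respond with JSON containing "
        f"'category' (str) and 'urgent' (bool):\n{text}"
    )
    try:
        # The defensive boundary: parse-or-fail, so malformed output
        # never leaks into the rest of the app.
        return TicketTriage.model_validate_json(complete(prompt))
    except ValidationError:
        return None  # caller decides: retry, log, or fall back
```

Returning `None` on a validation failure keeps the failure mode explicit and observable, which is what the monitoring point above is about.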
Cache LLM Calls To Save Cost
- Cache identical LLM requests on disk with a tool like diskcache to save cost and time.
- Persisting responses across runs enables fast comparisons and reproducible model evaluation (sketched below).
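A minimal sketch of that pattern with the diskcache library; `call_api` and `ask_llm` are hypothetical placeholders for whatever client and wrapper you actually use.

```python
from diskcache import Cache

cache = Cache("./llm_cache")  # on-disk store that persists between runs


def call_api(model: str, prompt: str) -> str:
    """Stand-in for a real LLM client call."""
    raise NotImplementedError


@cache.memoize()
def ask_llm(model: str, prompt: str) -> str:
    # The body runs only on a cache miss; identical (model, prompt)
    # pairs are then served from disk, saving cost and latency.
    return call_api(model, prompt)
```

Because `Cache` writes to a directory, the cache survives restarts, so re-running an evaluation script replays earlier responses instead of paying for them again.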
Key Cache By Model, Prompt, And Settings
- Use the model, settings, and prompt as the cache key, optionally adding an integer index to store multiple stochastic outputs.
- This lets you retrieve previous variants quickly and record many samples for evaluation (see the sketch below).
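One way to build such a key, again storing results in diskcache; `call_api`, `cache_key`, and `sample_responses` are hypothetical helpers sketching the idea rather than an API from the episode.

```python
import hashlib
import json

from diskcache import Cache

cache = Cache("./llm_cache")


def call_api(model: str, prompt: str, **settings) -> str:
    """Stand-in for a real LLM client call."""
    raise NotImplementedError


def cache_key(model: str, prompt: str, settings: dict, sample: int = 0) -> str:
    # Hash everything that can change the output; `sample` distinguishes
    # multiple stochastic generations of the same request.
    payload = json.dumps(
        {"model": model, "prompt": prompt, "settings": settings, "sample": sample},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def sample_responses(model: str, prompt: str, settings: dict, n: int = 5) -> list[str]:
    # Record n samples under distinct keys so earlier variants stay retrievable.
    samples = []
    for i in range(n):
        key = cache_key(model, prompt, settings, sample=i)
        if key not in cache:
            cache[key] = call_api(model, prompt, **settings)
        samples.append(cache[key])
    return samples
```

Sorting the JSON keys makes the hash stable regardless of dict ordering, so the same request always maps to the same cache entry.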
