
Talk Python To Me #528: Python apps with LLM building blocks
Nov 30, 2025
Vincent Warmerdam, a Python developer and educator behind CalmCode and Marimo, dives into the practical integration of LLMs into Python apps. He emphasizes treating LLMs as just another API, with specific boundaries and focused monitoring to enhance reliability. Topics include efficient caching of LLM responses, the benefits of structured outputs with Pydantic, and the advantages of Marimo over Jupyter. Vincent also explores ways to improve productivity through ergonomic workflows and local model experimentation.
AI Snips
LLMs Are Unstable Building Blocks
- Treat LLMs as unpredictable building blocks that need defensive boundaries and monitoring.
- Wrap them with clear interfaces and testable evaluation to avoid unexpected behavior in apps (a minimal sketch follows).
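As a concrete illustration of that boundary, here is a minimal Python sketch (assuming Pydantic v2): the rest of the app only ever sees a validated model, never raw LLM text. `TicketTriage`, the `complete` stand-in for a real client call, and `triage_ticket` are hypothetical names, not anything specific from the episode.

```python
from pydantic import BaseModel, ValidationError


class TicketTriage(BaseModel):
    category: str
    urgent: bool


def complete(prompt: str) -> str:
    """Stand-in for a real LLM client call returning the model's raw text."""
    raise NotImplementedError


def triage_ticket(text: str) -> TicketTriage | None:
    prompt = (
        "Classify this support ticket. Respond with JSON containing "
        f"'category' (str) and 'urgent' (bool):\n{text}"
    )
    try:
        # The defensive boundary: parse-or-fail, so malformed output
        # never leaks into the rest of the app.
        return TicketTriage.model_validate_json(complete(prompt))
    except ValidationError:
        return None  # caller decides: retry, log, or fall back
```

Returning `None` on a validation failure keeps the failure mode explicit and observable, which is what the monitoring point above is about.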
Cache LLM Calls To Save Cost
- Cache identical LLM requests on disk with a tool like diskcache to save cost and time.
- Persisting responses across runs enables fast comparisons and reproducible model evaluation (sketched below).
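A minimal sketch of that pattern with the diskcache library; `call_api` and `ask_llm` are hypothetical placeholders for whatever client and wrapper you actually use.

```python
from diskcache import Cache

cache = Cache("./llm_cache")  # on-disk store that persists between runs


def call_api(model: str, prompt: str) -> str:
    """Stand-in for a real LLM client call."""
    raise NotImplementedError


@cache.memoize()
def ask_llm(model: str, prompt: str) -> str:
    # The body runs only on a cache miss; identical (model, prompt)
    # pairs are then served from disk, saving cost and latency.
    return call_api(model, prompt)
```

Because `Cache` writes to a directory, the cache survives restarts, so re-running an evaluation script replays earlier responses instead of paying for them again.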
Key Cache By Model, Prompt, And Settings
- Use the model, settings, and prompt as the cache key, optionally adding an integer index to store multiple stochastic outputs.
- This lets you retrieve previous variants quickly and record many samples for evaluation (see the sketch below).
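One way to build such a key, again storing results in diskcache; `call_api`, `cache_key`, and `sample_responses` are hypothetical helpers sketching the idea rather than an API from the episode.

```python
import hashlib
import json

from diskcache import Cache

cache = Cache("./llm_cache")


def call_api(model: str, prompt: str, **settings) -> str:
    """Stand-in for a real LLM client call."""
    raise NotImplementedError


def cache_key(model: str, prompt: str, settings: dict, sample: int = 0) -> str:
    # Hash everything that can change the output; `sample` distinguishes
    # multiple stochastic generations of the same request.
    payload = json.dumps(
        {"model": model, "prompt": prompt, "settings": settings, "sample": sample},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def sample_responses(model: str, prompt: str, settings: dict, n: int = 5) -> list[str]:
    # Record n samples under distinct keys so earlier variants stay retrievable.
    samples = []
    for i in range(n):
        key = cache_key(model, prompt, settings, sample=i)
        if key not in cache:
            cache[key] = call_api(model, prompt, **settings)
        samples.append(cache[key])
    return samples
```

Sorting the JSON keys makes the hash stable regardless of dict ordering, so the same request always maps to the same cache entry.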
