Talk Python To Me

#528: Python apps with LLM building blocks

Nov 30, 2025
Vincent Warmerdam, the Python developer and educator behind CalmCode, now at Marimo, dives into the practical integration of LLMs into Python apps. He emphasizes treating LLMs as just another API: give them explicit boundaries and focused monitoring to keep apps reliable. Topics include caching LLM responses efficiently, getting structured outputs with Pydantic, and the advantages of Marimo over Jupyter. Vincent also explores how ergonomic workflows and local model experimentation can improve productivity.
AI Snips
INSIGHT

LLMs Are Unstable Building Blocks

  • Treat LLMs as unpredictable building blocks that need defensive boundaries and monitoring.
  • Wrap them behind clear interfaces and evaluate them with tests so surprises surface before they reach your app (see the sketch below).
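
A minimal sketch of that kind of boundary, assuming the openai client and Pydantic; the Sentiment schema and classify function are illustrative names, not something prescribed in the episode:

```python
# A hedged sketch: one narrow, typed entry point into an LLM.
from openai import OpenAI
from pydantic import BaseModel, ValidationError

client = OpenAI()  # reads OPENAI_API_KEY from the environment

class Sentiment(BaseModel):
    label: str        # e.g. "positive" or "negative"
    confidence: float

def classify(text: str) -> Sentiment:
    """The app only ever sees a validated Sentiment, never raw model text."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": 'Answer with JSON like {"label": "positive", "confidence": 0.9}.'},
            {"role": "user", "content": text},
        ],
    )
    raw = resp.choices[0].message.content
    try:
        return Sentiment.model_validate_json(raw)
    except ValidationError as err:
        # Misbehavior fails loudly at the boundary, not deep inside the app.
        raise ValueError(f"Unexpected model output: {raw!r}") from err
```

Because classify is an ordinary function with a typed return value, it can be mocked, monitored, and unit-tested like any other API client.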
ADVICE

Cache LLM Calls To Save Cost

  • Cache identical LLM requests on disk to save cost and time using tools like diskcache.
  • Persisting responses across runs enables fast comparisons and reproducible evaluation of models; a sketch follows below.
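
A minimal sketch using diskcache, which the snip names; the ask helper and model choice are illustrative:

```python
# Disk-backed memoization of LLM calls: identical arguments are
# answered from ./llm_cache instead of a new (paid) API request.
import diskcache
from openai import OpenAI

client = OpenAI()
cache = diskcache.Cache("./llm_cache")  # survives across runs

@cache.memoize()  # the function arguments become the cache key
def ask(model: str, prompt: str, temperature: float = 0.0) -> str:
    resp = client.chat.completions.create(
        model=model,
        temperature=temperature,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# The first run pays for the call; every re-run replays the stored
# answer, making comparisons fast and evaluations reproducible.
print(ask("gpt-4o-mini", "Summarize PEP 8 in one sentence."))
```

Memoizing on the raw arguments means any change to the prompt, model, or settings naturally triggers a fresh call.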
ADVICE

Key Cache By Model, Prompt, And Settings

  • Use the model, settings, and prompt as a cache key and optionally include an integer to store multiple stochastic outputs.
  • This lets you retrieve previous variants quickly and record many samples for evaluation, as sketched below.
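
The same idea with an explicit key tuple; the trailing sample integer distinguishes otherwise-identical stochastic calls (all names here are illustrative):

```python
# Explicit cache keys: (model, settings, prompt, sample). The integer
# lets several stochastic outputs for one prompt live side by side.
import diskcache
from openai import OpenAI

client = OpenAI()
cache = diskcache.Cache("./llm_cache")

def sampled(model: str, prompt: str, temperature: float, sample: int) -> str:
    key = (model, temperature, prompt, sample)
    if key in cache:  # replay a previously recorded variant
        return cache[key]
    resp = client.chat.completions.create(
        model=model,
        temperature=temperature,
        messages=[{"role": "user", "content": prompt}],
    )
    out = resp.choices[0].message.content
    cache[key] = out
    return out

# Record five variants once; later runs retrieve them instantly.
variants = [sampled("gpt-4o-mini", "Suggest a talk title.", 1.0, i)
            for i in range(5)]
```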