Changelog Master Feed

Kaizen! Let it crash (Changelog & Friends #124)

Jan 17, 2026
Gerhard Lazu, a reliability expert and Kaizen contributor, returns to discuss the 'let it crash' philosophy and how it improves system resilience. He digs into troublesome out-of-memory crashes and the bandwidth spikes that strain Varnish. A humorous investigation into a wildly popular podcast episode reveals surprising download patterns from Asia, raising the question of whether scraping or real user behavior is behind them. Gerhard also showcases tools for monitoring system health that make troubleshooting more efficient.
INSIGHT

Supervision Beats Defensive Code

  • Let-it-crash shifts failure handling out of application code into supervisory systems to keep the whole system running.
  • Erlang/Elixir style supervision lets you focus on unique app logic instead of defensive try/catch boilerplate.
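
The idea above can be sketched in a few lines of Python. This is only an illustration of the pattern, not Erlang/OTP's supervisor and not anything taken from the episode: `supervise` and `make_flaky_worker` are hypothetical names, and the restart limit is an assumed policy. The worker contains no defensive error handling; the supervisor alone decides what happens after a crash.

```python
def supervise(worker, max_restarts=3):
    """Run worker(); if it raises, restart it, up to max_restarts times.

    Returns (result, restarts). Re-raises once the restart budget is spent,
    which is where a parent supervisor would take over in a real tree.
    """
    restarts = 0
    while True:
        try:
            return worker(), restarts
        except Exception:
            if restarts >= max_restarts:
                raise
            restarts += 1


def make_flaky_worker(failures):
    """Hypothetical worker that crashes `failures` times, then succeeds."""
    state = {"left": failures}

    def worker():
        # No try/except here: the worker is allowed to crash.
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("crash")
        return "ok"

    return worker


if __name__ == "__main__":
    result, restarts = supervise(make_flaky_worker(2))
    print(result, restarts)
```

A worker that keeps failing past the restart budget makes `supervise` re-raise, so the failure escalates instead of being silently swallowed.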
ANECDOTE

Thread-Level Crashes Kept Service Alive

  • Varnish crashes were per-thread: the thread using too much memory was killed while the daemon and VM kept running.
  • The thread restarted within seconds and service stayed available, just with a cold cache.
INSIGHT

Fragmentation Makes Free Memory Useless

  • Large MP3 objects cause memory fragmentation so new objects can't fit despite nominal free memory.
  • Forced evictions (LRU nukes) spike when Varnish can't place big files into fragmented memory.
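
A toy model can make the fragmentation point concrete. The sketch below is not how Varnish or its allocator manages storage; the slot layout, the sizes, and the left-to-right eviction order (a stand-in for least-recently-used order) are all assumptions chosen for illustration. It shows memory that is half free in total, yet cannot hold one large object until existing objects are evicted.

```python
def largest_free_run(slots):
    """Length of the longest contiguous run of free (None) slots."""
    best = run = 0
    for s in slots:
        run = run + 1 if s is None else 0
        best = max(best, run)
    return best


def evict_until_fits(slots, size):
    """Evict objects left to right until `size` contiguous slots are free.

    Returns (new_slots, evictions). Left-to-right order is an assumed
    stand-in for LRU order in this toy model.
    """
    slots = list(slots)
    evictions = 0
    for i, s in enumerate(slots):
        if largest_free_run(slots) >= size:
            break
        if s is not None:
            slots[i] = None
            evictions += 1
    return slots, evictions


# 12 slots; small objects sit in every other slot, so 6 slots are free
# in total but no two free slots are adjacent.
slots = ["small" if i % 2 == 0 else None for i in range(12)]

if __name__ == "__main__":
    print("free slots:", slots.count(None))
    print("largest contiguous free run:", largest_free_run(slots))
    _, evictions = evict_until_fits(slots, 4)
    print("evictions needed to place a 4-slot object:", evictions)
```

In this model the large object needs two forced evictions even though the free total already exceeds its size, which is the pattern the snip describes.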