
Changelog Master Feed Kaizen! Let it crash (Changelog & Friends #124)
Jan 17, 2026 Gerhard Lazu, a reliability expert and Kaizen contributor, returns to discuss the nuances of the 'let it crash' philosophy and how it boosts system resilience. He digs into troublesome out-of-memory crashes and shares insights on bandwidth spikes that challenge Varnish. A humorous investigation into a wildly popular podcast episode reveals surprising download patterns from Asia, raising the question of whether the traffic is scraping or real user behavior. Plus, Gerhard showcases tools for monitoring system health that make troubleshooting more efficient.
Supervision Beats Defensive Code
- Let-it-crash shifts failure handling out of application code into supervisory systems to keep the whole system running.
- Erlang/Elixir style supervision lets you focus on unique app logic instead of defensive try/catch boilerplate.
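
The idea in the two points above can be sketched outside Erlang/Elixir. What follows is a minimal Python illustration, not an OTP supervisor: the worker contains no defensive code at all, and a separate `supervise` loop decides what happens when it crashes. The names `supervise`, `make_flaky_worker`, and the `max_restarts` limit are hypothetical and chosen only for this example.

```python
from typing import Callable


def supervise(worker: Callable[[], str], max_restarts: int = 3) -> str:
    """Run worker; on any crash, restart it, up to max_restarts times."""
    restarts = 0
    while True:
        try:
            return worker()
        except Exception as exc:
            # The restart policy lives here, not in the worker.
            if restarts >= max_restarts:
                raise RuntimeError("restart limit reached") from exc
            restarts += 1


def make_flaky_worker(failures: int) -> Callable[[], str]:
    """Hypothetical worker that crashes `failures` times, then succeeds."""
    state = {"calls": 0}

    def worker() -> str:
        # No try/except: the worker only holds application logic.
        state["calls"] += 1
        if state["calls"] <= failures:
            raise ValueError(f"crash #{state['calls']}")
        return f"ok after {state['calls']} attempts"

    return worker


if __name__ == "__main__":
    print(supervise(make_flaky_worker(2)))
```

A real supervisor would also track restart intensity over time and manage a tree of workers; this sketch only shows where the failure handling sits.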
Thread-Level Crashes Kept Service Alive
- Varnish crashes were per-thread: the thread using too much memory was killed while the daemon and VM kept running.
- The thread restarted within seconds and service stayed available, just with a cold cache.
Fragmentation Makes Free Memory Useless
- Large MP3 objects cause memory fragmentation so new objects can't fit despite nominal free memory.
- Forced evictions (LRU nukes) spike when Varnish can't place big files into fragmented memory.
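
To make the fragmentation point concrete, here is a toy Python model. It is not how Varnish's storage backends actually allocate memory; it assumes a flat cache of `size` units, objects described as hypothetical `(offset, length)` pairs, and a simple first-fit placement. It shows that total free space can exceed the size of a new object while no single gap is large enough, so objects must be evicted in least-recently-used order before the new one fits.

```python
from typing import List, Optional, Tuple

Block = Tuple[int, int]  # (offset, length), in arbitrary units


def free_holes(size: int, used: List[Block]) -> List[Block]:
    """Return the free gaps between used blocks as (offset, length)."""
    holes: List[Block] = []
    cursor = 0
    for offset, length in sorted(used):
        if offset > cursor:
            holes.append((cursor, offset - cursor))
        cursor = max(cursor, offset + length)
    if cursor < size:
        holes.append((cursor, size - cursor))
    return holes


def first_fit(size: int, used: List[Block], want: int) -> Optional[int]:
    """Return the offset of the first gap that can hold `want`, else None."""
    for offset, length in free_holes(size, used):
        if length >= want:
            return offset
    return None


def place_with_lru_nukes(size: int, used_lru: List[Block], want: int) -> Tuple[int, int]:
    """Evict from the front of used_lru (least recently used first)
    until `want` fits. Returns (offset, number_of_evictions)."""
    live = list(used_lru)
    nuked = 0
    while True:
        offset = first_fit(size, live, want)
        if offset is not None:
            return offset, nuked
        if not live:
            raise MemoryError("object larger than the cache")
        live.pop(0)
        nuked += 1


if __name__ == "__main__":
    # Five small objects scattered across a 100-unit cache.
    small = [(10, 10), (30, 10), (50, 10), (70, 10), (90, 10)]
    holes = free_holes(100, small)
    print("total free:", sum(length for _, length in holes))
    print("largest gap:", max(length for _, length in holes))
    print("placing a 40-unit object:", place_with_lru_nukes(100, small, 40))
```

In this example half the cache is free, yet the largest gap is only 10 units, so a 40-unit object forces two evictions before it can be placed.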



