

No really, some bugs aren’t real
Sep 18, 2025
41:43
When is a bug not really a bug? In this episode, host David Wynn talks with SRE veteran Dan Slimmon about a radical idea: chasing perfect code might not be the best way to make your service reliable.
Dan argues that once your code is "good enough," most outages aren't caused by code defects. They're caused by weird interactions between different parts of a system or by users doing things you would never expect. He shares wild stories from his career, including how a tiny database hiccup created a massive, repeating traffic jam and how a single user crashed servers by uploading a 3.2-gigabyte config file.
This conversation will make you rethink bugs, quality, and what "reliability" truly means.