
Ep. #7, The March 2023 Datadog Outage with Laura de Vesine
Heavybit Podcasts
The Impact of a Global Event on Kubernetes Infrastructure
It took almost two hours to understand that what had happened was an impact to our actual computing infrastructure, the Kubernetes nodes that we run. I'm going to say it was around 3:30 in the morning that we sort of came to that realization, that that was what had happened. And now we just needed to repair all of that. We made the call to start with our EU stack because it was starting to be morning there; that seemed like the most sensible place to start.