Detailed analysis of the Facebook outage

The Backend Engineering Show with Hussein Nasser

How to Turn a Million Servers Off at the Same Time?

Even employees couldn't figure this out. So they put a level of security in place before you can change these. It took extra time to activate the secure access protocols needed to get people on site and able to work on the servers. Only then could we confirm the issue and bring the backbone online. Once our backbone network connectivity was restored across our data centers, everything came back up with it. But the problem was not over. We knew that flipping our services back on (so they did physically turn them off) all at once could potentially cause a new round of crashes due to a surge of traffic. This is why rolling restarts exist. You never start all of
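
To make the rolling restart idea concrete, here is a minimal Python sketch. It is not how Facebook brought its services back; the names `rolling_restart`, `start`, and `is_healthy` are hypothetical and exist only for illustration. The assumption is that servers come back in small batches, and the next batch is only started once every server in the current batch reports healthy, so the returning traffic ramps up gradually instead of arriving all at once.

```python
import time
from typing import Callable, List


def rolling_restart(
    servers: List[str],
    batch_size: int,
    start: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    max_checks: int = 5,
    pause_seconds: float = 0.0,
) -> List[List[str]]:
    """Start servers batch by batch; return the batches in start order.

    `start` and `is_healthy` are hypothetical callbacks supplied by the
    caller (for example, wrappers around an orchestration API).
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    started: List[List[str]] = []
    for i in range(0, len(servers), batch_size):
        batch = servers[i:i + batch_size]

        # Bring up only this slice of the fleet.
        for server in batch:
            start(server)

        # Wait until every server in the batch is healthy before moving on.
        for server in batch:
            for _ in range(max_checks):
                if is_healthy(server):
                    break
                time.sleep(pause_seconds)
            else:
                # Stop the rollout rather than pile more load on a sick fleet.
                raise RuntimeError(f"{server} did not become healthy")

        started.append(batch)
    return started


if __name__ == "__main__":
    fleet = [f"server-{n}" for n in range(1, 6)]
    order = rolling_restart(
        fleet,
        batch_size=2,
        start=lambda s: print(f"starting {s}"),
        is_healthy=lambda s: True,  # stand-in health check
    )
    print(order)
```

The batch size is the knob that trades recovery speed against the size of the traffic surge: a smaller batch means a slower but gentler return to service.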
