14min chapter

Special Edition: Chris Krebs, Alex Stamos and Patrick Gray

Risky Business

CHAPTER

Cloud Provider Outages and Infrastructure Resilience

Explores the potential repercussions of a catastrophic outage at a major cloud provider such as Microsoft: the challenges large companies would face, how survivable a disruption would be, and the risks created by the vast amount of data these providers hold. Also discusses how well government understands cloud provider risk, the complexities of supply chain security, the implications of data privacy laws for various industries, challenges in AI legislation, and Microsoft's dominance in technological solutions. Highlights the fragility of cloud systems, whose interconnectedness contrasts with the disconnected systems of government agencies.

00:00
Speaker 2
I mean, just going back to what Chris said then, I'm sort of swinging back more towards your position, which is, you know, even though under the Entra ID model everyone in the world is essentially on the same directory, the chances of China being able to wipe that directory are pretty slim, right? So then you are talking about these paper cuts, and, you know, we're back to that old problem statement of just trying to control it as best we can. I don't know.
Speaker 1
I mean, if they were able to knock out a significant amount of Microsoft's infrastructure, the economic impact would be pretty significant. Yeah. Absolutely. I find it highly unlikely, even with all the crap Microsoft's gotten, deservedly, over the last couple of weeks, that you could just nuke all of Azure AD. But I'm sure you could cause a decent amount of disruption. And if 90% of the employees of large companies in the United States can't log in, we've never tested that from an economics perspective, but it would probably not be that great, right? Every morning, nobody can get their email.
Speaker 2
It's an unscheduled public holiday for everyone except the security team.
Speaker 1
Yeah. Exactly.
Speaker 3
And I've had these conversations over the last probably 10 years with the major cloud players. And I think you would expect them to say this, but when you really, really push them, and I don't mean just ask them, but you really spend a lot of time pushing them on developing a scenario that would result in a catastrophic outage, they're really unable, particularly for a long-term one, to generate a scenario that they find believable. And so I think that goes to Alex's point: the survivability and the resilience of the major cloud providers is actually not bad. The question is, and when I say SLAs, I mean it more from a casual perspective, what's society's expectation of the SLA with the cloud providers? And what can we deal with from a business sense? Like, yeah, it's a couple of days we'll be out. A couple of days in the real world, that's not going to cut it. Things start dropping out. And that information space is going to get filled with a lot of nonsense. And the bad guys will take advantage of it.
Speaker 1
I also think if somebody knows what they're doing and they're intentionally taking out a cloud provider, it'd be more than a couple of days. We simulated this at Facebook. We tested outages all the time. The infrastructure team there is incredible. And they would do things like pull the plug on an entire data center to see what happens. Right. So an entire global data center would be shut down to make sure that everything else could adapt. And we simulated, well, what if it's not an accidental thing, like a data center going down, but somebody had the ability to push instructions to millions of production hosts and rm -rf? And the problem for hyperscalers, like the cloud providers, the Googles, the Facebooks, the Microsofts and such, is that the way you bring up a data center is you copy it out of another data center. Right. Yeah.
Speaker 2
I mean, the whole time you're talking about this, I'm going back. I can't remember who it was who first pointed this out. It might have been Dan Geer, actually, years ago, who was saying that if one of these major cloud providers fell over, given the amount of data they hold, we don't have the networks that would be capable of transferring it to one of the other providers, and they can't scale up to meet it. Anyway, I'm sure we all remember when COVID first kicked off and lockdowns happened, Teams started falling over because Azure didn't have the capacity to spin up more.
Speaker 3
And I think that's the issue. Getting an entire hyperscale cloud provider to fall over is a stretch. But getting a couple of regions, and then the cascade over into the remaining regions, you're going to see significant performance degradation, to the point where things will start, to your point, timing out and dropping off. And then, yeah.
Speaker 2
But is that risk? Hang on, hang on. This is right at the core of what we're talking about. Is that risk, do you think, Chris, well understood in government?
Speaker 3
Ah, that's a good question. I would suggest in certain areas, more like in, you know,
Speaker 2
Well, in NSA, yes.
Speaker 3
Well, I think it's probably limited to a couple of places. One would be some areas of the Pentagon, some areas of the intelligence community, some areas of the Federal Communications Commission, and then our good friends at CISA. But beyond that, I tell you right now, man, the complexity of the United States economy is far beyond anything I think the US government has the capability to truly understand. And we saw that in COVID. So many little pieces broke. We saw that with Change Healthcare. Right? Didn't see that one coming, did we? There are systemically important pieces of infrastructure out there where we don't have a full understanding of how they fit into the bigger equation. And that's, by the way, what we set up the National Risk Management Center for: to get to the bottom of what those national critical functions are. So I think we've defined them, but the real analysis of, okay, we know the things that are important, do we know who's providing them? That's the big information gap, knowledge gap.
Speaker 2
I guess, you know, this comes back to the whole premise of this conversation, right? Which is, okay, you've got China and Russia turning their backs on Western technology. The United States is in a fortunate position because it owns most of the supply chain. You know, most of its own supply chain is either American or comes from allied countries, right? I guess the point is, though, owning the supply chain isn't enough to make it secure. Just because you own it doesn't mean you can trust it.
Speaker 3
Is it a state-owned enterprise?
Speaker 2
I know, I'm talking about your Microsoft. Yeah. I'm talking about, you know, but that's the point. They're not state-owned enterprises. If we were living in the upside down, and Microsoft was Chinese, they would be controlling the absolute crap out of it. And that's what leads us into this next part of the conversation.
Speaker 1
On that American point: a lot of it is Taiwan.
Speaker 2
Certainly in hardware. Yes.
Speaker 1
Like we talked about, those Microsoft machines, those Amazon machines, most of them are manufactured by Taiwanese OEMs like Quanta and such, right? They're not buying Dell. And so, yeah.
Speaker 2
But I guess this leads into a part of the conversation I know you're keen to have, right? Which is, when SolarWinds experienced an incident at the hands of Russia's SVR, there was hell to pay. They will never be able to separate their brand from the incident. They are being pelted with rocks as they walk down the main street wearing a sandwich board that says shame. Microsoft has the same sort of stuff happen to it, and nothing. It's like they're skating. I mean, with the exception of this CSRB report, which seems to be a decent step in the same direction. But this comes back to this whole conversation about sovereignty and supply chains. The United States has its own supply chain that it seems unable to really strongly influence. China doesn't, but it's trying to spin one up that it can. So it just seems to me that there's a bunch of interesting stuff happening around these.
Speaker 3
Yeah, I think we are also having, very naturally, a very security-focused conversation here, obviously. I mean, when you step back and look at it, of the top 10 most valuable companies in the world, eight of them are US companies, and then like six of those are tech. So, you know, we're doing something right. And to your point on the ability to influence, I think the government's still struggling with what the right market interventions are: regulation, net new legislation. I mean, we don't even have a federal privacy law. I've talked about this. Alex talks about this all the time.
Speaker 2
Well, there's one being floated at the moment. So that's progress.
Speaker 3
There's at minimum one that has been floated every Congress for the last 10 Congresses. So it's like, I've seen this movie before. And at some point, the retailers and the banks will come at it and they'll fight over it. And we'll see what happens.
Speaker 1
Right. Because everybody forgets that privacy laws don't only apply to tech companies, right? There's a huge industry in the United States of traditional companies, like the retailers, the banks, especially financial services, that collect a huge amount of user data and have been able to hide so far. That's, yes.
Speaker 3
So that's an entirely different podcast. And then at the same time, you've got AI, and half a dozen AI bills out there that I'm honestly not sure will go much of anywhere. But to your point, I think there's got to be a lot more on the power-of-the-purse side for the federal government. And at least this administration, well, the last one too, with Section 889 and saying, hey, we're not going to buy Huawei, we're not going to buy all these other Chinese-backed products, and using things like the binding operational directive to block and rip out Kaspersky. That was, by the way, with Commerce's recent announcement that they're finally going to block it in the domestic commercial...
Speaker 2
I just saw that this morning. Yeah. Like a countrywide ban on Kaspersky.
Speaker 3
It's like, I just did like a triple take. It was like, what year is it? What's happening right now? Yeah. I think influencing the domestic manufacturing base for tech and software is more than likely going to happen through the purchasing power of the Pentagon, for instance. And look, you know.
Speaker 2
But we've already established that Microsoft has kind of a monopoly on the type of tech that is used by organizations like that. There's no alternative to Excel. There's no tightly integrated business suite that combines identity that you can sync with on-prem with a cloud suite and Teams and this and that. It's all beautiful. I'm a real skeptic when it comes to the purchasing power argument.
Speaker 3
Well, you would hope that the government, through the federal acquisition regulations, whether it's FAR or DFARS, can keep leveling up on the requirements. Now, we all know that those requirements are heavily influenced by special interests and lobbyists and consultants and stuff like that. And they tend to be those with the biggest bucks that can do it. But we will see improvements. And my hope is that if that happens, let me be a little rose-colored glasses here, if we can get those standards up, if the procurement does require better performance, I don't think that these software companies are going to bifurcate. Whatever they do to improve the federal government sale or SKU will likely be the same code base for the commercial. That's my hope.
Speaker 2
Well, you say that. But I mean, I recently had a conversation with someone from the US defense industrial base, where there's a whole bunch of requirements for DIB companies to exchange unclass information at a much higher level of security. And there's a full Microsoft suite of products that check off those compliances. And it costs like 10 times as much as the regular SKU, because it's a specialty product. And it's not available to everybody else. I don't know that I'm as rose-colored on that as you.
Speaker 3
I think that's a different situation. But you do make one good point, though. It's going to cost more. Whatever we do here, when we raise this level, cost is going to get baked in. And that's going to be how the companies come along. Because they say, like, OK, fine, we're going to spend more to secure the product. But you know, cybersecurity has to be an allowable cost in the contract. So there will be a price tag associated with whatever the outcome is.
Speaker 2
Alex, you got some thoughts on this?
Speaker 1
I want to go back to your question, which I think is: does anybody in the government understand the fragility here? And I'm not sure, because I think the fragility of the cloud providers in particular is based upon the incredible operating leverage they have, of the number of employees they have versus the number of machines. And that operating leverage is 10 to 100 times better than any government agency, right? You couldn't take the Pentagon down, even if you had the best SVR team, because everything's heterogeneous, right? Like, you've got your AS/400s next to your Windows 2008 machines next to your Linux boxes, whereas the cloud providers are completely homogeneous. And that is why you can have a one-to-100,000 ratio of DevOps engineers or system administrators to systems. And so I think that's part of our challenge here: one of the reasons that people love the cloud is it is cost effective. But the cost effectiveness also is why you have things like outages of all of us-east-1, or why you can have a global Google outage. You don't hear about that from other companies. Generally, with big Fortune 500s, the entire company doesn't go down. It's because everything's disconnected. And everything at Google is connected up to the same systems and the same code. And so I am concerned. The operating leverage we get as a society from cloud is wonderful and great in lots of ways. And it's also something the bad guys have figured out exists.
Speaker 3
I want to add, this just triggered something, by the way. When you asked, like, does the government understand this, I left out NIST and NTIA, I'm sorry. I'm sorry, Adam and Kevin. But look, NIST gets it, and NTIA, I think, gets it as well. But again, the question goes to what can they do about it? Yes. And a lot of these agencies are just in a place where they influence policy.
