AWS Morning Brief

Corey Quinn
Nov 25, 2019 • 16min

Improving Customers by Stuffing Them Into Containers

AWS Morning Brief for the week of November 25th, 2019.
Nov 21, 2019 • 18min

Networking in the Cloud Fundamentals, Part 4

About Corey Quinn

Over the course of my career, I've worn many different hats in the tech world: systems administrator, systems engineer, director of technical operations, and director of DevOps, to name a few. Today, I'm a cloud economist at The Duckbill Group, the author of the weekly Last Week in AWS newsletter, and the host of two podcasts: Screaming in the Cloud and, you guessed it, AWS Morning Brief, which you're about to listen to.

Transcript

An IPv6 packet walks into a bar. Nobody talks to it.

Welcome back to what we're calling Networking in the Cloud, a 12-week networking extravaganza sponsored by ThousandEyes. You can think of ThousandEyes as the Google Maps of the internet. Just like you wouldn't dare leave San Jose to drive to San Francisco without checking whether the 101 or the 280 was faster (yes, that's a localized traffic reference that means nothing to people outside the Bay Area), businesses rely on ThousandEyes to see the end-to-end paths their apps and services are taking. This enables companies to figure out where the slowdowns are happening, where the pileups are, and what's causing the issues. They use ThousandEyes to see what's breaking where, and importantly, they share that data directly with the offending service providers to hold them accountable in a blameless way and get them to fix the issue fast, ideally before it impacts their end users. Learn more at thousandeyes.com. And my thanks to them for sponsoring this ridiculous podcast mini-series.

This week we're talking about load balancers. They generally do one thing, and that's balancing load. But let's back up. Let's say that you, against all odds, have a website, and that website is generally built on a computer. You want to share that website with the world, so you put that computer on the internet. Computers are weak and frail and invariably fall over at the worst possible time. They're herd animals; they're much more comfortable together. And of course, we've heard of animals. We see some right over there.

So now you have a herd of computers working together to serve your website. The problem now, of course, is that you have a bunch of computers serving your website. No one is going to want to go to www6023.twitterforpets.com to view your site. They want a unified address that just gets them where they need to go. Exposing those implementation details to customers never goes well. Amusingly, if you go to Deloitte, the giant consultancy's website, the entire thing lives at www2.deloitte.com. Nothing says "we're having trouble with digital transformation" quite so succinctly. But I digress.

So you now have a special computer, or series of computers, that lives in front of the computers serving your website. That's where you point twitterforpets.com, or www.twitterforpets.com. Those computers are specialized, and they're called load balancers because that's exactly what they do: they balance load, it says so right there on the tin. They pass out incoming web traffic to the servers behind the load balancer, so those servers can handle your website while the load balancer just handles being the front door that traffic shows up through.

This unlocks a world of amazing possibilities. You can now, for example, update your website or patch the servers without taking your website down with a "back in five minutes" sign on the front of it. You can test new deployments with entirely separate fleets of servers. This is often called a blue/green deploy or a red/black deploy, but that's not the important part of the story. You can start bleeding off traffic to the new fleet and, "Oh my god, turn it off, turn it off, turn it off. We were terribly wrong. The upgrade breaks everything." But you can do that: turn traffic on, turn traffic off to certain versions, and see what happens.

Load balancers are simple in concept, but they're doing increasingly complicated things. For instance, you're a load balancer, sitting in front of 200 servers that all do the same thing because they run the same website and the same application code. How do you determine which one of them receives the next incoming request?

There are a few common patterns. The first, and maybe the simplest, is called round robin. You'll also see this referred to as next-in-loop. Let's say you have four web servers. Your first request goes to server one, your second request goes to server two, then server three and server four, and the fifth request goes back to server one. It just rotates through the servers in order and passes out requests as they come in. This can work super well for some use cases, but it does have some challenges. For example, if one of those servers gets stuck or overloaded, piling more traffic onto it is very rarely going to be the right call.

A modification of round robin is known as weighted round robin, which works more or less the same way, but it's smarter: certain servers can get different percentages of the traffic. Some servers, for example, can be larger than others and can consequently handle more load. Other servers are going to have a new version of your software or your website, and you only want to test that on 1% of your traffic to make sure nothing horrifying breaks, because you'd fundamentally rather break things for 1% of your users than 100% of your users. Ideally you'd like to break things for 0% of your users, but let's keep this shit semi-real, shall we?

You can also go with a least-loaded approach. Some smarter load balancers can query each backend server or service that they're talking to about its health and get back a metric of some kind. If you wire logic into your application that reports how ready it is to take additional traffic, load balancers can then make intelligent determinations about which server to drop traffic onto next.

Probably one of the worst methods you can use to pass out traffic is random, which does exactly what you'd think, because randomness isn't actually uniform. There are invariably going to be clusters and hotspots, and the entire reason you have a load balancer is to not have to deal with hotspots: one server's overloaded and screaming while the one next to it is bored, wondering what the point of all of this is.

There are other approaches, too, that offer more deterministic ways of sending traffic to specific servers. For example, taking the source IP address that a connection is coming from and hashing that. You can do the same type of thing with specific URLs, where the hash of a given URL winds up going to a specific backend service. Why would you want to do that? Well, in an ideal world, each of those servers is completely stateless, and each one can handle your request as well as any other. Here in the real world, things are seldom that clean.
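To make those selection strategies concrete before we get to the question of state, here's a minimal Python sketch of round robin, weighted selection, and least-loaded picking. The server names, weights, and load numbers are all made up for illustration, and the weighted version approximates weighted round robin with weighted random choice rather than a strict rotation:

```python
import itertools
import random

servers = ["web1", "web2", "web3", "web4"]

# Round robin / "next in loop": rotate through the fleet in order.
rr = itertools.cycle(servers)

def round_robin():
    return next(rr)

# Weighted selection: pretend web4 runs the new release, so it only
# gets ~1% of traffic while we make sure nothing horrifying breaks.
weights = {"web1": 33, "web2": 33, "web3": 33, "web4": 1}

def weighted_choice():
    names = list(weights)
    return random.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Least loaded: ask each backend how busy it is and pick the idlest.
# In real life this metric would come from a health-check endpoint.
load = {"web1": 0.72, "web2": 0.15, "web3": 0.40, "web4": 0.33}

def least_loaded():
    return min(load, key=load.get)

print([round_robin() for _ in range(6)])  # web1..web4, then wraps around
print(weighted_choice())                  # almost never web4
print(least_loaded())                     # web2
```

Real load balancers layer health checks and connection draining on top of this, but the core per-request decision is about this small.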
You'll very often find yourself with state living inside of your application. So if one backend server handles your first request and then your next request goes to a different backend server, you could be prompted to log in again, and that becomes really unpleasant for the end user exper...
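The episode cuts off here, but the deterministic hashing approach it just described is easy to sketch. This is a minimal illustration, with made-up server names and client IPs, of how hashing the source IP keeps a client pinned to one backend so that backend's session state stays usable:

```python
import hashlib

servers = ["web1", "web2", "web3", "web4"]

def pick_backend(source_ip: str) -> str:
    """Deterministically map a client IP to a backend.

    The same client always hashes to the same server, so any session
    state that server holds (like a login) keeps working.
    """
    digest = hashlib.sha256(source_ip.encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

# Repeated requests from one client land on one backend:
assert pick_backend("203.0.113.7") == pick_backend("203.0.113.7")
print(pick_backend("203.0.113.7"), pick_backend("198.51.100.9"))
```

The caveat with naive modulo hashing is that adding or removing a server reshuffles most clients onto new backends; consistent hashing is the usual fix for that.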
Nov 18, 2019 • 12min

A CloudFormation Feature of Great Import

AWS Morning Brief for the week of November 18th, 2019.
Nov 14, 2019 • 16min

Networking in the Cloud Fundamentals, Part 3

About Corey Quinn

Over the course of my career, I've worn many different hats in the tech world: systems administrator, systems engineer, director of technical operations, and director of DevOps, to name a few. Today, I'm a cloud economist at The Duckbill Group, the author of the weekly Last Week in AWS newsletter, and the host of two podcasts: Screaming in the Cloud and, you guessed it, AWS Morning Brief, which you're about to listen to.

Transcript

This episode of Networking in the Cloud is sponsored by ThousandEyes. Their 2019 Cloud Performance Benchmark Report is now live as of yesterday. Find out which clouds do what well: AWS, Azure, GCP, Alibaba, and IBM Cloud all have their networking capabilities raced against each other. Oracle was not invited, because we are talking about actual cloud providers here, not law firms. Get your copy of the report today at Snark.Cloud/realclouds. That's Snark.Cloud/realclouds, and it's completely free. Download it and let me know what you think; I'll be cribbing from it in future weeks. Now, for the third week of our AWS Morning Brief Screaming in the Network, or whatever we're calling it, mini-series on how computers talk to one another, let's talk about the larger internet.

Specifically, we begin with BGP, or Border Gateway Protocol. This matters because it's how different networks talk to one another. If you have a whole bunch of different computer networks gathered into a super network, or internet as some people like to call it, how do those networks know where each one lives? From a home user's perspective, or even in some enterprises, that seems like sort of a silly question, because it is: you have a network that lives on your end of things, you plug a single cable in, and every other network lives through that cable. But when you're talking about large, disparate networks, how do they find each other? More to the point, because of how the internet was built, it's designed so that any single failure of another network can be routed around. There are multiple paths to get to different places: some biased for cost, some biased for performance, some biased for consistency. And all of those decisions have to be made globally. BGP is the lingua franca of how those networks talk to one another. BGP is also a hot mess.

It's the routing protocol that runs the internet, and it's composed of different networks, in this parlance autonomous systems, or ASes. It was originally designed for a time before jerks ruled the internet, and that's jerks in terms of people causing grief for others, as well as shady corporate interests that are publicly traded on NASDAQ. There's no authentication tied to BGP; effectively, it is trusted to contain correct data. There is no real signing or authentication proving that someone who announces something through BGP is authorized to do it, and it's sort of amazing the whole thing works in the first place. What happens is that when a large network with other networks behind it makes an announcement, it says, "Oh, I have routes to these following networks," and it passes them on to its peers. They in turn pass those announcements on: "behind me, two hops this way, is this other series of networks," and so on and so forth.

Now, this can cause hilariously bad problems that occasionally make the front page of the newspaper when a bad announcement gets out. A few years ago there was an announcement from an ISP that said, "Oh, all of YouTube lives behind us." That announcement should never have gone out, and their upstream ISP should have quashed it, and they didn't. So suddenly a good swath of the internet was trying to reach YouTube through a relatively small link. As you can imagine, TCP terminated on the floor. Not every link can handle exabytes of traffic. Who knew?

That gets us to another interesting point: how do these large networks communicate with each other? You have this idea of one network talking to another network. Does money change hands? Well, in some cases, no. If traffic volumes are roughly equal and desirable on both sides, the two networks talk to one another and no money changes hands. This is commonly known as peering. At that point, everything is mostly grand, because as traffic continues to climb, you increase the links. Both parties generally wind up paying to operate infrastructure on their own side and in between, and traffic continues to grow. Other times it doesn't work that way: you have one network with a lot of traffic and another network that doesn't really have much of any, and people want to go from one end to the other. This is known as a transit agreement, and money changes hands, usually from the smaller network to the bigger network, but occasionally the other direction, depending on the specifics of the business model. At that point, every byte passing through is metered and generally charged for.

Usually this is handled by large ISPs and carriers and businesses behind the scenes, but occasionally it spills out into public view. Comcast and Netflix, for example, have had a fantastic public spat from time to time, and this manifests itself when there's congestion and you're on Comcast. If so, I'm sorry for you, and your Netflix stream starts degrading into lower picture quality; occasionally it skips or whatnot. And strangely, whenever Comcast and Netflix come to an agreement, of course under undisclosed terms, magically these problems go away almost instantly. Originally this sort of thing was frowned upon, and the FCC got heavily involved, but with the demise of network neutrality in the United States, it's suddenly okay to start preferring some traffic over others through a legalistic framework. This has led to a whole bunch of either malfeasant behavior or normal behavior that people believe is malfeasant, and that doesn't leave anyone in a terrifically good place.

I'm not here to talk about politics, but it does lead to an interesting place, because there's an existential problem with the business model of an awful lot of ISPs out there. Generally speaking, when you plug into your upstream provider, maybe it's Comcast, maybe it's AT&T, maybe it doesn't matter, you're generally trying to use them as a dumb pipe to the internet. The problem is, they don't want to be a dumb pipe. There's a finite number of dollars that everyone is going to pay for access to the internet, and that is a naturally self-limiting business model, so they're trying to add value with services that don't really tend to add much value at all. My wireless carrier, for example, wants to sell me free storage, and an email address, and a bunch of other things I just don't care about, because I already have an email solution that works out super well for me. The cloud storage I care about is either Dropbox, something in AWS, or other nonsense. I don't need Verizon's cloud storage, but they keep trying to find alternative business models. Some of these ways are useful and beneficial to everyone, and others are, well, to be honest, less so. Comcast, for example, isn't going to build you a search engine that is going to rival Google, which is kind of weird on some level because if you take a look from a customer service perspective, C...
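The transcript is truncated above, but the bad-announcement story earlier in the episode is worth making concrete. Here's a minimal, hedged Python sketch of two genuine BGP selection rules that make that kind of hijack work: routers prefer the most specific prefix, and among equal prefixes, the shortest AS path. The prefixes and AS numbers below are purely illustrative (the AS numbers come from the range reserved for documentation), and real BGP has a much longer tie-breaking process:

```python
import ipaddress

# Toy routing table: prefix -> candidate AS paths, accumulated hop by
# hop as announcements spread. AS64496-AS64511 are reserved for
# documentation, so none of this is real routing data.
routes = {
    "198.51.100.0/22": [
        ["AS64496", "AS64499", "AS64500"],  # the legitimate, longer path
    ],
    "198.51.100.0/24": [
        ["AS64501", "AS64502"],             # a bogus, more-specific announcement
    ],
}

def best_route(destination: str):
    dest = ipaddress.ip_address(destination)
    candidates = []
    for prefix, paths in routes.items():
        net = ipaddress.ip_network(prefix)
        if dest in net:
            for path in paths:
                # Sort key: most-specific prefix first, then shortest AS path.
                candidates.append((net.prefixlen, -len(path), prefix, path))
    candidates.sort(reverse=True)
    _, _, prefix, path = candidates[0]
    return prefix, path

# The /24 wins purely on specificity, which is why a single bad
# announcement can siphon off a good swath of the internet's traffic.
print(best_route("198.51.100.25"))  # ('198.51.100.0/24', ['AS64501', 'AS64502'])
```

Real BGP also weighs local preference, origin type, MED, and more before path length even matters; the sketch keeps only the parts relevant to the story.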
Nov 11, 2019 • 11min

EC2 Instances Now On Layaway

AWS Morning Brief for the week of November 11th, 2019.
Nov 7, 2019 • 16min

Networking in the Cloud Fundamentals, Part 2

About Corey Quinn

Over the course of my career, I've worn many different hats in the tech world: systems administrator, systems engineer, director of technical operations, and director of DevOps, to name a few. Today, I'm a cloud economist at The Duckbill Group, the author of the weekly Last Week in AWS newsletter, and the host of two podcasts: Screaming in the Cloud and, you guessed it, AWS Morning Brief, which you're about to listen to.

Transcript

An ancient haiku reads: "It's not DNS. There's no way it's DNS. It was DNS." Welcome to the Thursday episode of the AWS Morning Brief, which you can also think of as Networking in the Cloud. This episode is sponsored by ThousandEyes and their Cloud State Live event, Wednesday, November 13th, from 11:00 AM until noon, Central Time. They'll be live streaming from Austin, Texas: the live reveal of their latest cloud performance benchmark, where they pit AWS, Azure, GCP, IBM, and Alibaba Cloud against each other from a variety of networking perspectives. Oracle Cloud is pointedly not invited. If you'd like to follow along, visit snark.cloud/cloudstatelive. That's snark.cloud/cloudstatelive, and thanks to ThousandEyes for their sponsorship of this ridiculous yet educational podcast episode.

DNS, the domain name system: it's how computers translate names humans can understand into numbers, for those humans whose first language is not math. Put more succinctly, if I want to translate www.twitterforpets.com into an IP address of 1.2.3.4, I probably want a computer to do that, because humans find it easier to remember twitterforpets.com. Originally, this was done with a far more manual process: there was a file on every computer on the internet that was kept in sync across all of them. The internet was a smaller place back then, a friendlier time, and jerks trying to monetize everything at the expense of others were not yet lurking behind every shadow. So how does this service work?

Well, let's go back to the beginning. When you look at a typical domain name, let's call it www.twitterforpets.com, there's a hierarchy built in, and it goes from right to left. In fact, if you pick any domain you'd like that ends in .com, .net, .technology, .dev, .anything-else-you-care-about, there's another dot at the end of it. That's right: you can go to www.google.com., and it works just the same way you'd expect it to. That dot represents the root, and there are a number of root servers, run by various organizations so that no one entity controls them, scattered around the internet. They have an interesting job: their role is to answer who the authoritative DNS servers for the top-level domains are. That's all that the root servers do.

The top-level domains, in turn, have name servers that refer out to whoever is responsible for any given domain within that top-level domain, and so on and so forth. You can have subdomains running at your own company: you could have twitterforpets.com but delegate all of engineering.twitterforpets.com out to a separate name server, and so on. It can hit ludicrous lengths if you'd like.

Now, once upon a time this was relatively straightforward, because there were only so many top-level domains that existed: .com, .net, .org, .edu, .mil, and so on. Then the governing body, ICANN, decided, "You know what's great? Money," so they wound up selling additional top-level domains. You can grab .technology, .blog, .underpants for all I know; no one can keep them all in their head anymore. One that leaps to mind as an incredibly obnoxious purchase is Google's .dev. Now anything you want can exist as a .dev domain, because Google has taken responsibility for that top-level domain. Why is that obnoxious? Well, historically, for the longest time on the internet, there were a finite number of top-level domains that people had to worry about. So internally, when people were building out their own environments, they would come up with something that was guaranteed never to resolve, and .dev was a popular pick. You could point that at a local name server inside your firewall, or you could even hard-code it on your laptop itself, and it worked out super well. Now, anyone who registers whatever domain you picked has the potential to set up a listener on their end.

That is not just a theoretical concern. I worked at a company once that had their domain.com as their external domain and domain.net as their internal domain, which is reasonable, except for the part where they didn't own the .net version of their domain. Someone else did, and kept refusing offers to buy it. So periodically we would try to log into something internal while not being on the VPN, despite thinking that we were, type a credential into whatever listener was set up, and immediately have to reset our credentials. It was awful. Try not to do that. If you use a development domain, make sure you own it. It's $12; everyone will be happier.

Now, a common interview question that people love to ask sysadmins, SREs, DevOps, whatever we're calling them this week, is: when I punch www.google.com into my web browser and hit enter, how does that get translated into an IP address? There are a lot of directions you can take that, but by and large, the way it works is something like this. Oh, and a caveat they love to add, because otherwise this gets way more complicated: every server involved has a cold cache, and we'll get to what that means in a bit.

Your browser says, "Oh, who has www.google.com?" It passes that query to the system resolver on your computer, which goes through a series of different resolution techniques. It usually checks the /etc/hosts file if it's a Mac or a Linux-style box, and if there isn't anything hard-coded in there, which there isn't for purposes of this exercise, it queries the system's external resolver. This is usually provided by your ISP, but you can also use Google's public resolvers, 8.8.8.8 and 8.8.4.4, Cloudflare's 1.1.1.1, or OpenDNS's, which are really weird and no one can remember them off the top of their head; there are a lot of different options.

When that resolver gets queried, it looks at www.google.com, and because it has a cold cache, its first question is, "Great, who owns .com?" It queries the root name server. The root name server says, "Oh, .com is handled by the .com TLD authoritative servers," and returns who's authoritative for .com to the resolver. The resolver says, "Great," and then queries the authoritative name server for .com: "Who has www.google.com?" and it returns the authoritative name servers for google.com.

Now, something strange if you were to actually try this yourself is that the answer to that question is generally ns1.google.com, and that sets up the opportunity for an infinite loop: to find ns1.google.com, you'd have to ask .com, "Who has ns1.google.com?" Except that when the .com servers return that result, they include an IP address alongside it. That IP address is known as a glue record, and it breaks the circular dependency. Glue records are often one of those things that pop up in sysadmin-type interviews to prove the interviewer thinks they're smarter th...
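The transcript cuts off mid-sentence, but the cold-cache walk it just described maps neatly onto code. Here's a minimal, hedged Python model of the delegation chain: a root zone refers to .com, .com refers to google.com and ships a glue record so we never have to resolve ns1.google.com by asking ns1.google.com. Every IP address here comes from the documentation ranges; real resolvers speak the DNS wire protocol and cache heavily:

```python
# A toy model of the cold-cache resolution walk described above.
# Each zone maps names to either a referral (NS name + glue IP) or a
# final answer. All IPs are from documentation space, not real servers.

ROOT = {"com.": {"ns": "a.gtld-servers.net.", "glue": "192.0.2.1"}}
COM = {"google.com.": {"ns": "ns1.google.com.", "glue": "192.0.2.53"}}
GOOGLE = {"www.google.com.": "192.0.2.80"}

ZONES = {"192.0.2.0": ROOT, "192.0.2.1": COM, "192.0.2.53": GOOGLE}

def resolve(name: str) -> str:
    server = "192.0.2.0"  # start at a root server, cache cold
    while True:
        zone = ZONES[server]
        if name in zone:               # authoritative answer, we're done
            return zone[name]
        # Otherwise follow the referral for the closest enclosing zone.
        for suffix, ref in zone.items():
            if name.endswith(suffix):
                # The glue record is the IP that ships alongside the NS
                # name, which is what breaks the circular dependency of
                # resolving ns1.google.com by asking ns1.google.com.
                server = ref["glue"]
                break
        else:
            raise LookupError(name)

print(resolve("www.google.com."))  # walks root -> .com -> google.com
```

Strip away the dictionaries and that while loop is essentially what a recursive resolver does on a cache miss.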
Nov 4, 2019 • 11min

The Rain in Spain Falls Mainly on the Control Plane

AWS Morning Brief for the week of November 4th, 2019.
Oct 31, 2019 • 17min

Networking in the Cloud Fundamentals, Part 1

Links Referenced

ThousandEyes
The Duckbill Group
Last Week in AWS
Screaming in the Cloud
AWS Morning Brief

Transcript

UDP. I'd make a joke about it, but I'm not sure you'd get it.

This episode is sponsored by ThousandEyes. Think of ThousandEyes as the Google Maps of the internet. Just like you wouldn't dare leave San Jose to drive to San Francisco without checking whether 101 or 280 was faster (and yes, that's a very localized San Francisco Bay Area reference), businesses rely on ThousandEyes to see the end-to-end paths their apps and services are taking from their servers to their end users, to identify where the slowdowns are, where the pileups are hiding, and what's causing the issues. They use ThousandEyes to see what's breaking where, and importantly, they share that data directly with the offending service providers to hold them accountable and get them to fix the issue fast, ideally before it impacts end users. You'll be hearing a fair bit more about ThousandEyes over the next 12 weeks, because Thursdays are now devoted to networking in the cloud. It's like Screaming in the Cloud, only far angrier.

We begin today with the first of 12 episodes. Episode one: the fundamentals of cloud networking. You can consider this the AWS Morning Brief, networking edition. A common perception in the world of cloud today is that networking doesn't matter, and that perception is largely accurate. You don't have to be a network engineer the way any reasonable systems or operations person had to be even 10 years ago, because in the cloud, the network doesn't matter at all, until suddenly it does, at the worst possible time, and then everyone's left scratching their heads.

So let's begin with how networking works, because a computer in 2019 is pretty useless if it can't talk to other computers somehow, and for better or worse, Bluetooth isn't really enough to get the job done. Computers talk to one another over networks, basically by having a unique identifier. Generally we call those IP addresses, at least down the path the future actually took; in a different world we would've gone with token ring and a whole bunch of other addressing protocols, but we didn't. Instead we went with IP, the unimaginatively named internet protocol, specifically the current version, version 4. We're not talking about IPv6, because let's not kid ourselves: no one's really using that at scale, despite everyone claiming it's going to happen real soon now.

So there are roughly 4 billion IP addresses and change, and those are allocated throughout effectively the entire internet. When this stuff was built, back when it was just defense institutions and universities on the internet, 4 billion seemed like stupendous overkill. Now it turns out that some people have 4 billion objects on their person talking to the internet, all chirping and distracting them at the same time when you're attempting to have a conversation with them.

Those networks are broken down into subnetworks, or subnets for lack of a better term. They can range anywhere from a single IP address, which in CIDR (C-I-D-R) parlance is a /32, to all 4 billion and change, which is a /0. Some common ones tend to be a /24, which is 256 IP addresses, of which 254 are usable, and you can expand that into 512 with a /23, and so on and so forth. The specific math isn't particularly interesting or important, and it's super hard to describe without some kind of whiteboard, so smile, nod, and move past that. So then you have all these different subnets.
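If you'd rather poke at the subnet math than take it on faith, Python's standard-library ipaddress module makes it concrete. A quick sketch (the 10.0.0.0 blocks are arbitrary examples):

```python
import ipaddress

# A /24 holds 256 addresses; the network and broadcast addresses are
# reserved, which leaves the 254 usable hosts mentioned above.
net24 = ipaddress.ip_network("10.0.0.0/24")
print(net24.num_addresses)              # 256
print(len(list(net24.hosts())))         # 254

# Dropping the prefix to /23 doubles the block to 512 addresses.
net23 = ipaddress.ip_network("10.0.0.0/23")
print(net23.num_addresses)              # 512

# A /32 is a single address; a /0 is the whole 4-billion-ish space.
print(ipaddress.ip_network("0.0.0.0/0").num_addresses)  # 4294967296
```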
How do they talk to one another? The easy way to think of it is, "Oh, I have one network, I plug it directly into another network, and they can talk to each other." Well, sure, in theory. In practice it never works that way, because those two networks are often not adjacent. They have to talk to something else, going through different hops: from here to there, to somewhere else, to somewhere else again, until finally the destination they care about. And when you look at the internet as a network that spans the entire world, that turns into a super complicated problem. Remember, the internet was originally designed to withstand massive disruption, generally in terms of nuclear war, where effectively large percentages of the earth were no longer habitable. It had to be able to reroute around failures, and routing is more or less how that wound up working. The idea that you can have different paths to the same destination solves an awful lot; it's why the internet is as durable as it is, but it also explains why these things are terrible and why everyone is super quick to blame the network.

One last thing to consider is network address translation. There are private IP address ranges that are not reachable over the general internet: anything starting with a 10, for example (the entire 10/8 is considered private IP address space), same with 192.168, and anything from 172.16 through 172.31 (that one is the 172.16.0.0/12 block). Translating those private IP addresses into public IP addresses is known as network address translation, or NAT. We're not going to get into the specifics of that at the moment; just know that it exists.

Now, most of the traditional networking experience doesn't come from working in the cloud. It comes from working in data centers, a job that sucks, and some of the things you learn doing it are tremendously impactful. They completely change how you view how computers work, and in the cloud, that knowledge becomes invaluable. So let's talk a little bit about what networking looks like in the world of cloud, specifically AWS, because AWS had effectively five years of uninterrupted non-compete time when no one else was really playing with cloud. By the time everyone else woke up, the patterns AWS had established were more or less what other people were using. This is the legacy of Rip Van Winkling through five years of cloud. If you don't want me to talk about AWS and would rather I talk about a different company instead, that other company should have tried harder.

In an AWS context, they have something known as a virtual private cloud, or VPC, and planning out what your network looks like in those environments is relatively challenging, because people tend to make some of the same mistakes here as they did in data centers. For example, something that has changed: common wisdom in a data center was that anything larger than a /23, a subnet with 512 IP addresses in it, was a complete non-starter, because at that point the broadcast domain, everything being able to talk to everything, is large enough to completely screw over your switch. It would get overwhelmed. You'd wind up with massive challenges and things falling over constantly, so having small subnets was critical. Now, in the world of cloud, that's not true anymore, because broadcast storms aren't a thing that AWS and other reasonable cloud providers allow to happen. It winds up getting tamped down; there are rate limits; they do all kinds of interesting things that mean this isn't really an issue. So if you want to have a massive flat n...
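Since the transcript cuts off here, one small addendum on those private ranges: a minimal Python sketch, again using the standard ipaddress module, that checks whether an address falls in RFC 1918 private space (the sample addresses are arbitrary):

```python
import ipaddress

# The three RFC 1918 private ranges mentioned in the episode.
PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_rfc1918(addr: str) -> bool:
    ip = ipaddress.ip_address(addr)
    return any(ip in block for block in PRIVATE_BLOCKS)

print(is_rfc1918("10.1.2.3"))      # True
print(is_rfc1918("172.20.0.1"))    # True: inside 172.16.0.0/12
print(is_rfc1918("172.32.0.1"))    # False: just past the end of the /12
print(is_rfc1918("8.8.8.8"))       # False: that's Google's public resolver
# The standard library agrees: ipaddress.ip_address("10.1.2.3").is_private
```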
Oct 28, 2019 • 11min

Last of the JEDI

AWS Morning Brief for the week of October 28th, 2019.
Oct 21, 2019 • 10min

AWS CloudWatch Anomaly Wake-Up Calls

AWS Morning Brief for the week of October 21st, 2019.
