Software Sessions cover image

Software Sessions

Latest episodes

undefined
Mar 23, 2021 • 54min

Scott Hanselman on What's .NET?

Scott is the Community Program Manager for the .NET team at Microsoft and the host of the Hanselminutes podcast. This episode originally aired on Software Engineering Radio. Personal Links and Projects: @shanselmanHanselminutesBlogYouTube Channel Related Links: .NET (How to get started)What is .NET?Common Language Runtime (CLR) OverviewNuGet (package manager)Techempower Benchmarks (ASP.NET core)Mono runtime (Cleanroom .NET implementation)Unity (Write games using C#)Xamarin (.NET Mobile apps).NET nanoFramework (embedded devices / microcontrollers).NET foundation.NET interactiveBlazor (Client web apps with .NET)Single file deployment and executableAll About Span: Exploring a New .NET MainstayVirtual Desktop is what VR needsSE Radio 394 Chris McCord on Phoenix LiveView Transcript: You can help edit this transcript on GitHub. Jeremy: [00:00:00] Today I'm talking to Scott Hanselman. He's a partner program manager at Microsoft, the host of over 750 episodes of the Hanselminutes podcast. And he's got a YouTube channel. He's got a lot of interesting videos. one of the series I really enjoy is one called computer stuff they didn't teach you. You're all over the place. Scott, welcome to software engineering Radio. Scott: [00:00:22] Thank you for having me. Yeah. I was an adjunct professor for a while and I realized that in all of my career choices and all of my volunteerism and outreach, it's just me trying to get back to teaching. So I just started a TikTok. So I'm like the old man on TikTok now. But it's turned out to be really great. And I've got to engage with a bunch of people that I would never have found on any other social platform. So whether you find me on my podcast, my blog, which is almost 20 years old now, my YouTube or my TikTok or my Twitter, it's pretty much the same person. I'm a professional enthusiast, but that's all done in the spare time because my primary job is to own community for visual studio. Jeremy: [00:01:02] That experience of being a teacher in so many ways is one of the big reasons why I wanted to have you on today, because I think in the videos and things you put out, you do a really good job of explaining concepts and just making them approachable. So I thought today we could talk a little bit about .NET and what that ecosystem is because I think there's a lot of people who they know .NET is a thing, they know it exists, but they don't really know what that means. So, maybe you could start with defining what is .NET? Scott: [00:01:37] Sure, so Microsoft is historically not awesome at naming stuff. And, I joke about the idea that 20 years ago, Microsoft took over a top level domain, right? Thereby destroying my entire .org plan to take over the world. When they made the name .NET they were coming out of COM and C and C++ involved component object models. And they had made a new runtime while Java was doing its thing in a new language and a new family of languages. And then they gave it a blanket name. So while, Java was one language on one runtime, which allowed us to get very clear and fairly understandable acronyms like JRE, Java, runtime, environment, and JDK... Microsoft came out of the gate with here's .NET and C# and VB and you know, COBOL.NET and this and that. And this is a multilanguage runtime. In fact, they called it a CLR, a common language runtime. So rather than saying here's C#, it's a thing. They said here's .NET. And it can run any one of N number of languages now. The reality is that if you look over time, Java ended up being kind of a more generic runtime and people run all kinds of languages on the Java runtime. .NET now has C#, which I think is arguably first among equals as its language. Although we also have VB and F# as well as a number of niche languages, but the .NET larger envelope name as a marketing thing remains confusing. If they had just simply said, Hey, there's a thing called C# it probably would have cleared things up. So, yeah. Sorry. That was a bit of a long answer and it didn't actually get to the real answer. Jeremy: [00:03:24] So, I mean, you had mentioned how there was this runtime that C# runs on top of. Is that the same as the JVM? Like the Java virtual machine? Scott: [00:03:36] Right. So as with any VM, whether it be V8 or the JVM, when you could take C# and you compile it, you compile it to an intermediate language. We call it IL. Java folks would call it a byte code and it's basically a process or non-specific assembler. For lack of another word. If you looked at the code, you would see things like, you know, load, you know, load four byte string into, you know, pseudo register type of stuff. And, um, you would not be able to identify that this came from C#. It would just look like this kind of middle place between apples and apple juice. It's kind of apple sauce. It's pre chewed, but not all the way. And then what's interesting about C# is that when you compile it into a DLL, a dynamic link library or Linux which is like a shared object where you compile it into an executable. The executable contains that IL, but the executable doesn't know the processor it's going to run on. That's super interesting, because that means then I could go over to an ARM processor or an Intel x86 or an x64 or whatever, an iPhone theoretically. And then there's that last moment. And then that jitter that just-in-time compiler goes and takes it the last mile, that local runtime, in this case, the CLR, the common language runtime takes that IL. Chews it finally up into apple juice. And then does the processor specific optimizations that it needs to do. And this is an oversimplification because the, the pipeline is quite long and pretty exquisite. In fact, for example, if you're on varying levels of processor on varying versions of x64, or on an AMD versus on a, uh, an Intel machine. There's all kinds of optimizations you can do as well as switches that we can do as developers and config files to give hints about a particular machine. Is this a server machine? Is this a client machine? Does it, is it memory constrained? Uh, does it have 64 processors? Can we prep this stuff both with a naive JIT a naive just-in-time compilation, or then over time, learn, literally learning over the runtime cycle of the process and go, you know, I bet we could optimize this better then swap out in while it's running a new implementation so we can actually do a multi-layered JIT. So here's the, I want to start my application fast and run this for loop. You know, this for loop's been running for a couple of thousand milliseconds and it's kind of not awesome. I bet you, we can do better then we do a second pass. Swap out the function table, drop a new one in, and now we're going even faster. So our jitter, particularly a 64 bit jitter is really quite extraordinary in that respect. Jeremy: [00:06:29] That's pretty interesting because I think a question that a lot of people might have is what's, what's the reason for having this intermediary language and not compiling straight to machine code. And what it sounds like, what you're saying is that, it can actually at runtime figure out like how can I run things more efficiently based on what's being executed in the program. Scott: [00:06:50] Right. So when you are running a strongly typed language that has the benefits and the speed of drawing type language. and you also choose to compile it to a processor specific representation. We, we usually as developers and deployers think about things in terms of architectures like this is for an arm processor, but is this a raspberry PI, is it an Android device you really want to get down to the stepping individual versions of like an Intel? For example, I'm talking to you right now on an Intel. But it's a desktop class processor. That was the first one released many, many years ago. It doesn't have the same instruction set as a modern i9 or a Xeon or something like that. So we have to be able to know what instruction sets are available and do the right thing. And then if you add in things like, um, SIMD or doing, you know, data parallelization at the CPU level, If that's available, we might treat a vector of floats differently than we would on another thing. So how much information up front do we know about deployment and is it worth doing all that work on the developer's machine? And this is the important part, which may not necessarily reflect the reality of where it's going to get deployed. You know, my desktop does not represent a machine in Azure or Heroku or AWS. So getting up to the point where we know as much as we can. And then letting the, the runtime environment, adapt to the true representation of what's happening on the, on the, on the final machine. That doesn't mean that we can't go and do native compilation or what's called end gen or native image generation that if we know a whole lot, but, um, it provides a lot of value if we wait until the end. Jeremy: [00:08:34] Yeah, that's an interesting point too, in that if we are compiling down to an intermediate language on our own development machines, then we don't have to worry about like, having all these different targets right. Of saying I'm going to build for this specific type of ARM processor you know, all these different places that our application could possibly go because the runtime has taken care of that for us. Scott: [00:08:58] Right. And I think that's one of those things where it's taking care of it for us, unless we choose to, it reminds me of the time. Uh, if you remember, when we went from, you know, mostly manual shift cars to automatic shift cars, but people who bought fancy Audis and BMWs were like, you know, when I'm going to the shops to get groceries, I'm cool with automatic, but I really like the ability to downshift. And then we started to see these hybrid cars that had, you know, D and R for drive in reverse and then a little move left and move right. To take control. And like, I'm going to tell the automatic transmission on this car that I really want to run at higher rpms. .NET tries to do that. It gives you both the ability to just say, you know, do what you're gonna do. I trust the defaults, but then a ton of switches, if you choose to really know about, uh, that, that ending environment. Jeremy: [00:09:50] We've been talking about the common language runtime, which I believe you said is the equivalent of the Java virtual machine. You also mentioned how there are languages that are built to target the intermediate language that, that runtime uses like C# and VB and things like that. And you touched on this earlier before, but I feel like we haven't quite defined what .NET really is like, is it that runtime and so on? Scott: [00:10:18] Okay. so .NET as an all encompassing marketing term, I would say is the ecosystem. Just as Java is a language and also an ecosystem. So when someone says, I know .NET, I would assume that they are at the very least familiar with C#. They know how to compile code and deploy code. They might be able to be a website maker, or they might be an iPhone developer. That's where .NET starts getting complicated. Because it is an ecosystem. Someone could work. Let's say two people could work for 10 years in the ecosystem, and one could be entirely microservices and containers on the backend, in the cloud. And the other person could be Android and Unity and games and Xbox, and they are both .NET Developers, except they never touched each other's environment. One is entirely a game and mobile app developer and the other one is entirely a backend developer. That's where it's harder now. I think the same thing is true in the Java world. Although I would argue that you'd be surprised how many places that you can find .NET, whether it be, in a Python notebook, you can run .NET And C# and an IPYMB in a notebook and Jupiter um, on, you know, on the Google cloud or you can run it on a microcontroller, not a microprocessor, but literally on a tiny 256 K micro controller, all of that is, is .NET. Now .NET is also somewhat unique, uh, amongst ecosystems in that it has a very large, what we call BCL the base class library. One of the examples might be that we know with a Python or with a JavaScript. You say console.log, hello world. We got the same thing. And then it's like, Hey, I need a really highly optimized, a vector of floats with a complicated math processing. Well, you have to go and figure out which one of those libraries you want to pull in. Right? There's lots of options in the Python world. There's lots of options in the Java world, the JavaScript world, but then you have to go hunting for the right library. You have to start thinking. Microsoft, I think is somewhat more prescriptive in that our base class library, the stuff that's included by default has everything from regular expressions to math, to, you know, representations of memory all the way up to putting buttons on the screen. So you don't always have to go out to the third party until it's, uh, it's time to get into higher level abstractions. And that has made .NET historically attractive to enterprises because when you're picking a framework, And it has a whole ton of ecosystem things. Who do you throttle when it doesn't work? Right? There's an old joke that people picked IBM because they got, it gave them one throat to choke. Uh, people who are in enterprises typically will buy a Microsoft thing and then they'll buy a support license and then they can call Microsoft when it is. Um, now fast forward for 20 years, Microsoft has a balance between a strong base class library and enrich open-source environment. So it's a little bit more complicated, but the historical context behind why Microsoft ended up in enterprises is because of that base class library and all of those libraries that were available out of the box. Jeremy: [00:13:20] And that's sort of in direct contrast to something like say JavaScript where people are used to going to the node package manager and getting a whole bunch of different dependencies that have been developed by other people. It sounds like with .NET, the philosophy is more that Microsoft is going to give you most of what you'll need and you can sort of stay in that environment. Scott: [00:13:46] I would say that that was the original philosophy. I would say that that was pre open source Microsoft. So maybe 10 and 15 years ago. I would think that that is not a philosophy that is very conducive to a healthy, open source ecosystem. But it was early on when it was like, here's a thing you kind of buy, you didn't really buy it, but like it comes with windows and you buy visual studio. When I joined about 13-14 years ago, myself and others are really pushing open source because we came from the open source community. And as we started introducing things like Azure and services .NET is free, has been free. Is now open source visual studio code, visual studio community are all free. So the stuff I work on, I actually have nothing to sell. As such, um, it's better to have a healthy community. So we have things like NuGet so NuGet and N U G E T is the same as NPM is the same as Maven. It's our package manager environment. And that is, uh, allows you to pull both Microsoft libraries and third-party libraries. And we want to encourage an open source community that then, you know, pays open source framework, uh, creators for their work using things like github sponsors. People who are working in enterprises now need to understand that if you pick just the stuff that Microsoft gives you out of the box, you may be limited in the diversity of cool stuff that's happening. But if you go and start using open source technology, You might need to go and pay a support contract for those different tools or frameworks, depending on whether it's a, you know, a reimagination of ASP.NET or web framework. And you say, I don't like the one Microsoft has. I'm going to use this one or an OAuth identity server that Microsoft does not ship out of the box. It's a missing piece. Therefore use this third party identity server. Jeremy: [00:15:33] What are some things that you or Microsoft has done to encourage people to build open source stuff for .NET? Scott: [00:15:43] Well, early on Microsoft was, uh, you know, attempted to open source .NET By creating a thing called rotor, R O T O R, which was a way of saying, Hey, Java, everyone in, in, in academia is using Java. We have one too, but. It didn't exist. The only real open source licenses at the time were GPL, so Microsoft made a very awkward, this is about 20 years ago, very awkward overture, where it's like, here's a zip file with a weird license that Microsoft made called the MSPL, the Microsoft permissive license. And here's a kind of a neutered version of .NET That you could potentially use. And it wasn't source it wasn't open source. It was source opened. It's like, here's a zip file. Like, go, go play with this. There were no take backs. Fundamentally Microsoft was not a member of the open-source community. They were just like, Hey, look, we have one too. When we started creating ASP.NET MVC, the model view controller stuff, which is kind of the Ruby on Rails for .NET. The decision was made by a bunch of folks on the team that I worked on and encouraged by me and others who think like me. Just open source the thing and do takebacks it's it's not just, Hey there's Hey, you can watch us write code and do nothing about it, but the takebacks is so important. And the rise of GitHub and social coding enabled that even more. So now actually about 60% of .NET comes from outside Microsoft. It is truly collaborative. Now Microsoft used to patent a bunch of stuff and they'd patent stuff all over the place and everyone would be worried. I don't want to commit to this because it might be patented, but then they made the patent promise to say that any of the patents that covered .NET won't be things that will be enforced. And then they donated .NET to the .NET foundation, which is a third party foundation, much like many other open source foundations and .NET and all of the things within it, the logos and the code and the copyrights are all owned by that. So if Microsoft, God forbid were to disappear or, uh, stop caring about .NET, it is the communities that can be built and deployed and run entirely outside of .NET. So it's overtures like that to try to make sure that people understand that this isn't going anywhere, it's truly open source and then take backs is such a big thing. And this has been going on now for, for many, many years, most people who think .NET and probably the folks that are listening to this show are thinking the big giant enterprise monolith that only runs on Windows that came out in the early 2000s. Now it is a lightweight cross-platform open source framework that runs in containers and runs on Linuxes and raspberry pis. It's totally open and it's totally open source all the way down. The compilers, open source, all the libraries, everything. Jeremy: [00:18:21] To your point of people, picturing .NET being closed source running only on windows. I think an important component to that is the fact that there have been multiple .NET runtimes right? There's been the framework there's been core. There's been mono. I wonder if you could elaborate a little bit on what each of those are and what role they fit in today. Scott: [00:18:46] That is a very, very astute and important observation. So if I were to take Java byte code, Uh, I've taken Java source code and I've turned it into Java byte code. There are multiple Java runtimes, Java VMs that one could call upon it's from Oracle and there's Zeus and there's different open source, open JBMs and whatnot. There are even interpreters. You can run that through a whole series of them. There's X. You can imagine a pie chart of all the different Java VMs that are available. Now. Certainly there's the 80% case, but there are lots of other ones. The same thing applied in the .NET ecosystem. There is the .NET framework that shipped and ships continually still with windows. And the .NET framework and its associated CLR, I would say is a, and I'm putting my fingers in quotes here, a .NET. But to your point, there are multiple .NETs. Plural. Mono is a really interesting one because it is an open source, clean room reimplementation of the original Windows-based .NET and clean room in that they did not look at the source code. They looked at the interface. So if I said to you, Jeremy, um, here's the interface for `System.String`, write me an implementation of this interface here are the unit tests and you wrote yourself Jeremy specific `System.String`. And it, it worked, it passed all the tests. Now, you know, who knows under load, it might behave differently. But for the most part, you made a, a clean room reimplementation of that. Mono did that for all of .NET proper, but then that became a new .NET. So for those that are listening, they can't see us or see a screencast. But imagine that I go out to the command line and with a Microsoft implementation of .NET, I make a new console app. It says `Console.WriteLine`. Then I compile it into an executable. I can then run that with Mono. So I created it on one environment and I ran it with another. So I, I did the compilation and the first, the first bit of chewing, and then the finally the jitter would run through mono because the IL language is itself a standard that has been submitted to the standards organizations. And it is a thing. So if you and I were, uh, you know, advanced computer science students in university, that might be an interesting homework assignment, right? A IL you know, interpreter or write an IL jitter. And if you can successfully consume IL that has been created by one of these many .NETs you have passed the course. Then .NET core is a complete reimplementation opensource cross-platform with no specific ties to windows. And that's the .NET that has kind of, we're hoping will be rebranded just .NET. So five years from now, people won't think about .NET core .NET framework. They'll just think about .NET. It will run great on windows. It'll run great on Linux it'll run great right in the cloud, yada yada, yada. That is a kind of a third .NET. We're trying to unify all of those with the release of .NET 5, which just came out and .NET 5 we picked the number because it's greater than 4, which was the last version of .NET on windows. But we also skipped a number because .NET Core 3. Would have been confusing if we named it .NET Core 4, again, Microsoft not being good at naming stuff. You imagine two lines heading to the right .NET framework and .NET core. They then converge into one .NET. The idea being to have one CLR, one common language runtime, one set of compilers and one base class library. So that if someone were to learn that. If we harken back to the middle of the beginning of our conversation, when I said that there was a theoretical .NET developer out there, who's a gamer and there's another one who's doing backend work. They should be using the same libraries. They should be using the same languages. They should have a familiarity with everything up to, but not including the implementation specific details of their systems, so that if a unity person or a, uh, decided to become a backend developer or a backend developer decides to move over to make iPhone games, they're going to say, Oh, This is all the system, dot this and that stuff I'm familiar with. That's a `System.String`. I appreciate that. As opposed to picking an iPhone or an Android specific implementation of a library. Jeremy: [00:23:09] If .NET core is the open source reimplementation of the .NET Framework, and you've also got, you were talking about how mono is this clean room implementation of it? What role does, does mono play, currently? Scott: [00:23:25] Hmm. Good question. So mano was originally created by, uh, Miguel de Icaza and the folks that were trying to actually make an outlook competitor on Linux that then turned into the Xamarin, uh, company X, A M A R I N and Xamarin, um, then was a different reimplementation, which allowed them to do interesting stuff like, can we take that IL and send it over to I'm doing a thing called SGEN, uh, and generating, you know, uh code that would run on an iPhone or code that we're running on an Android and they created a whole mobile ecosystem and then Microsoft bought them. So then, because mono and .NET work on the same team we're getting together right now and we're actively merging those together. Now, originally mono was really well known that it was a just nice, solid, clean C code. You could basically clone the repo run configure run make, and it would build while the windows, implementation of .NET Core and .NET Core in general, wasn't super easy to build it. Wasn't just a clone configure and make. A lot of work has been done to make the best parts of mono be the best parts of .NET core, as opposed to like disassembling one and, you know, taking them apart and making this frankenbuild, we want to make it so.net core is, is a, uh, a re-imagining of both original .NET framework and windows and the best of a cross platform one. So that is currently in the process in that .NET 5 .NET 6 wave. So this year in, um, in November, we will release .NET 6, and that will also be an LTS or a long-term support release. So those of you who are familiar with Ubuntu's concept of LTS those long short term long-term support versions will have, you know, many years, three years plus of support. And by then the mono .NET merge will kind of be complete. And then you'll find that .NET core will basically run everywhere that mono would. It doesn't mean mono as an ecosystem goes away. It just becomes another, uh, not the supported version of .NET, uh, at a runtime level. Jeremy: [00:25:31] When we look at the Java virtual machine and we look at that ecosystem, there are languages like Scala and Clojure, things that were not made by the creator of the JVM. But are new languages that target that runtime. And I wonder, from your perspective, why there was never that sort of community, with the common language runtime? Scott: [00:25:56] I would think that because the interest from academia early on 20 years ago was primarily around Java. And Microsoft was slow on the uptake from an open source perspective that, um, slowed down possible implementations of other languages. And then .NET got so popular and C# evolved because C# is now on version nine, the C# that you wrote the idiomatic C# of 2001 is not the idiomatic C# with includes, you know, records and async and await and interesting things that have shown up on C# and then been borrowed by other languages, async await, being a great example, linq language, integrated query, being another thing that didn't exist, that language has evolved fast enough that people didn't feel that it was worth the effort. And it was really F# that picked an entirely different philosophy in the, in the functional way, by doing kind of an Ocaml style thing. And because C# and F# are so different from each other, they really fit nicely. They're as different as English and another language that isn't Latin-based, you know, And they meaning that they add flavor to the environment. That doesn't mean that there aren't going to continue to be niche languages. Uh, but there was also things like IronRuby and IronPython, but they weren't really needed because the Python and Ruby runtimes already worked quite well. Jeremy: [00:27:16] We've been talking about .NET and sort of the parts that, make up .NET, but I was wondering, other languages have niches, right? You have go maybe for more systems type programming or Python for machine learning. Are there niches for .NET? Scott: [00:27:33] Oh, yeah. I mean, so when I would, I would, I would generally push back on the idea of niche though, because I don't want to use niche as a diminutive right. I kind of like the term domain specific just in the sense of you know, when in Finland speak Finnish, that doesn't mean that we would say, Hey, you know, Finnish didn't win. Like we're not all speaking. It's not the, it's not the language. That one. But we don't want to think about things like that. There are great languages that exist for a reason because people want to do different stuff with them. So for example a really well thought of language in the .NET space is F# and F# is a functional language. So for people who are familiar with OCaml and Haskell and things like that, and people who want to go and create a language that is, you know, very correct, very maintainable, very clear that has a real focus on you managing the problem domain. As opposed to the details of, of imperative programming. Um, it has immutable data structures by default. There's a lot more type inference than there is in C#. Um, a lot of people use F# in the financial services industry. and it also has some interesting language features like records and discriminated unions, uh, that, uh, C# people are just now kind of getting hip to. Jeremy: [00:28:47] What I typically see or what I think people sort of believe when they think of .NET is they think of enterprise applications. And I'm wondering if you, have some examples of things that are sort of beyond that space. Scott: [00:29:03] Sure. So when I think of .NET, like in the past, I would think of like big windows apps with giant rows of tabs and text boxes over data. You know, if I go to my eye doctor and I see the giant weird application that he's running, that's multiple colors and lot of grids and stuff. And I think, Oh, that must be written in .NET. Those are the applications of 20 years ago. The other thing that I think people think about when they think about .NET is it's a thing that is tied to windows and built on windows and probably takes about gigabytes of space. .NET core is meant to be more like node and more like go than it is to be like .NET Framework. So for example, if I were to create a microservice using .NET core, I go out to the command line and I type `dotnet new`. Web or `dotnet new empty`, or `dotnet new console`. And then I can say `dotnet publish` and I can bring in a runtime. And we talked before about how you could keep things very non-specific until the last minute and let the runtime on the deployed machine decide. Or you could say, you know, this is going to a raspberry PI, it's going to be a microservice on a raspberry PI. I want it to be as small as possible. So there's multiple steps. We've gone to, we made this little microservice and we published it. Then we publish all of the support, DLLs, the dynamic link libraries, the shared objects. If it's going to Linux all of the native bindings that, that provide that binding between the runtime and, the native processor and the native operating system, all the system calls then that is right now, about 150, 180 megs on .NET. But then we look at the tree of function calls. And we say, you know, this is a micro service. It only does like six things. It looks at the weather, it calls like six native functions. Then we do what's called tree trimming or tree shaking and we shake, we give it a good shake. And then we basically delete all of the functions that are never called. And we know those things, right. We can, we can introspect the code and we can shake this thing and we can get microservices down to 30, 40 megs. Now again, you might say, well, compared to go or compared to writing that in C. Sure. But you're talking about still maintaining the jitter, the garbage collector, the strongly typed abilities of .NET. All of the great features of this cool environment shook, shaken, shaken, shaken down to, you know, 30, 40 megs. And then you can pack that tightly into a container, even better if the container layer above it. And if you imagine a multi-stage Docker container, doesn't include the compiler. Doesn't include the SDK and the container above. It includes the runtime itself. You might only deploy, you know, 10, 15 megs. You can really pack nicely together a .NET microservice and then a whole series of them. Thousands of them potentially into something like Kubernetes. That's a very fundamentally different thing than the giant .NET application that took two gigs on your hard drive. This is the cool part, though. A giant, we call it windows forms or that WPF, that windows presentation application that often we find in large enterprises moves slowly because the not, not run slowly, but move slowly from an, um, from a dev ops perspective because the enterprise is afraid to update it, or they need to wait for windows to get the version of .NET that they want because .NET has historically been tied to windows. Which sucks then that application and the whole team behind it gets depressed. What if we could take the cool benefits that I just described around containers back port it, so that, that giant application at my doctor's office ships its own local copy of .NET, which it can then maintain, manage, and, and deal with it's it's on its own. All in a local folder. Doesn't need to wait for windows. That's really cool. Then they could do that, that, that prejudging basically do a native image generation and even create a single executable where inside is .NET. So then I give you my doctor's office dot exe. You double click on it. You install nothing. It simply unfolds like a flower into memory. Including all of the benefits. Now it's not generated native code at this point, right? It's .NET in an envelope that then blooms and then runs with all native dependencies that it needs. That is pretty cool. And that actually really generates excitement in the enterprises so that they can bring those older applications to, to modern techniques like dev ops and, uh, as well as a huge, huge perf speed up like two X, three X per speed up. Jeremy: [00:33:47] When you talk about a perf speed up, why would including the, the .NET runtime with the application result in a perf speed up? Scott: [00:33:55] It's not that we're including the, the, the .NET runtime that caused the perf speedup. It's that 20 years of improvement in jitter, 20 years of understanding the garbage collector, rather than using the .NET that shipped with windows, they can use the one that understands. Again, a callback to the beginning instruction sets that are newer and modern as SIMD, SSE2, all the different things that one could potentially exploit when on a newer system. Those things are we get huge numbers and actually .NET itself .NET 5 is 2 or 3x over .NET core three, which is 2 or 3x over. So a huge perf push has been happening and actually all the benchmarks are public. So if you go to techempower, the tech empower benchmarks, you can see that we're having a wonderfully fun kind of competition, a bit of a thumb war with, you know, the gos and the Java's and the, you know, groovy on rails of the world. To the point where we're actually bumping up against the laws of physics. We can put out, you know, seven or 8 million requests a second. And then you notice that we all bundle at the top and .NET kind of moves within a couple of percentage of another one. And the reason is, is that we've saturated the network. So without a 10 gigabit network, there's literally nothing, no more bytes can be pushed. So .NET is no longer in the middle or to the far right of the, uh, of the long tail of the curve. We're right up against anything else. And sometimes as fast as 10x, faster than things like node. Jeremy: [00:35:23] So it sounds like going from the .net framework to .net core and rebuilding that, uh, you've been able to get these, giant speed improvements. You were giving the example of a windows desktop application and saying that, uh, you could take this that was built for the .net framework, um, and run it on .NET core. I'm wondering, is that a case where somebody can just take their existing project, take their existing code and just choose to run it? Or is, does there have to be some kind of conversion process? Scott: [00:35:56] So that's a great question. So it depends on if you're doing anything that is like super specific or weird. Like if you were. A weird application, then, then you might bump up against an edge case. The same thing, the same thinking applies. If you were trying to take an application cross-platform or example, I did a tiny mini virtual machine and an operating system for a advanced class in computer science. 15, 20 years ago, I was able. To take that application, which was written and compiled in .net 1.1 and move it to .NET core and run in a container and then run on a raspberry PI because it was completely abstract. It was using byte arrays and, and, and list of T and kind of pretty straightforward base class library stuff. It didn't call the registry. It didn't call any windows APIs. So I was able to take a 15 year old app and move it in about a day. Another example, my blog, which used to run on .NET to my blog has over 19 years old now has recently been ported from .NET 2 to 4, then .NET core. And now .NET 5 runs in the cloud in Linux, in a container. With 80% of the code, the same, the stuff that isn't the same that I had to actually consciously work around with the help of my friend Mark Downey was, well, we called the registry. The registry is a windows thing. It doesn't exist. So what's the, you know, what's the implementation, is it a JSON file now? Is it an XML file? You know, when it runs in a container, it needs to talk to certain mapped file system things while before it was assuming. Okay. Back slashes or a C:. So implementation details need to move over. Another thing that is interesting to point out though, is that there are primitives, there are language and runtime primitives that didn't exist before that exists now. And one of the most important one is a thing that's called span of T `Span<T>`. And this is the idea of a completely new way of thinking about. Arbitrary continuous bits of memory. Like you want to be at a high level language and then messing around in bite arrays. Or do you want to think about things at that higher level, but then also re avoid reallocation and copying of memory around. So something very simple, like some HTTP headers come in, you know, if you're doing 7 million of these a second, this can add up right. And it's like, well, usually we take a look at that chunk of memory and we copy it into an array and then we start using higher level constructs to deal with that. Now I have a list of char and then we for loop around the list of char that's kind of gross. And then you find out that you're in the garbage collector a lot because you're allocating memory and you're copying stuff around. You could take that older code, update it to you, use span of T and represent that contiguous region of arbitrary memory differently rather than an array, a span you can point to native memory or even memory managed on a staff back and then deal with stuff. For example, parsing or maneuvering within HTTP header with no allocations. So a really great example, you'd ask for an example of an application that you couldn't believe. VR desktop. Are you familiar with VR desktop as it relates to the Oculus quest? Okay. So the Oculus quest is the Facebook VR thing, right? And it's an Android phone on your face and it's, you can use it unplugged. And then people say, well, I got this Android phone on my face. I'd really like to plug it in with a USB cable and use the power of my desktop. But then I got to ship those bits across the USB and then have a client application that allows me to use it as a dumb client. Okay. You would expect an application that someone would write to do that would be written in C or C++ or something really low level VR desktop allows you to actually see your windows desktop and ship those bits across the wire, across the literal wire, the USB or wirelessly is written entirely in C#. And you can use really smart memory allocation techniques. And the trick is don't allocate any memory and C# is more than fast enough to have less than 20 millisecond jitter on frames so that it is buttery smooth, even over wireless. And I did a whole podcast on that with the Guy Godin the Canadian, a gentleman who wrote VR desktop. Jeremy: [00:40:23] That's really interesting because you're talking about areas where typically when you think of a managed language, like, C#, you're able to use it in more environments than you, you might normally think of. And, I also kind of wonder, you have the new Apple machines coming out. They have their arm processors instead of the x86 processors. what is the state of.net when targeting an arm environment? Can somebody simply take that IL that's on one machine, bring it on to the arm machine and it just works? Scott: [00:41:01] Yep. So if you are putting, if you're taking over the, the IL and I'm on a windows machine right now, I'm on an X64 machine, and I go, `dotnet new console`. And I go, `dotnet run` and it works on my X64 machine. And I say, `dotnet publish`. I can then take that, that library over to the raspberry PI or an arm machine say `dotnet run`. And because I'm saying `dotnet run` with the local .NET runtime, it will pick up that IL and it will just run. So I have run using my now 19-20 year old blog. As an example, I have run that entire blog on a raspberry PI unchanged. Once I had worked out the portability issues, like once I had worked out the portability issues like back slashes versus forward slashes and assumptions like that. So once the underlying syscalls line up, absolutely. And we actually announced in July of 2020 that we will support .NET core on the new Mac hardware. And there are requirements for .NET and .NET core on, you know, Apple like arm 64 and big sur, that would be required to get that runtime working. But yes, the goal would be to have native processes versus Rosetta to processes working. And that work is happening. Jeremy: [00:42:24] We were talking about like the typical enterprise application, you've got a, windows forms application is it possible for somebody with an application like that to, to target arm? Scott: [00:42:36] So right now there is there's multiple layers. So the compiler yes. The runtime. Yes. The application. Yes. At the point where, if I were going to run on windows arm, like a surface pro X or something like that, we would need support for, I believe windows forms, winforms and WPF. I don't know that those are working right now, but those apps still run because just like we have the, the Rosetta, you know, Emulator can emulate x86 stuff. So for example, paint.net is a great.net application that is basically Photoshop, but it's four bucks, um, that it runs entirely on .NET core and it runs just fine on a surface pro X in an emulated state. And I believe that the intent of course is to get everything running everywhere. So yeah, I would be sad if it didn't. That's a little outside my group, but the people that are working on that stuff. And by the way, the great thing about this go and look at the GitHub issues, right? So, you know, if you look on, I'm looking right now on a .net issue, 4879, where they're talking about the plan is supporting .NET core 3 and 5 on Rosetta 2 via emulation and .NET 6 supported in both native on ARM architecture and Rosetta two. And there's in fact, a .NET runtime. A tracking issue. That's called support Apple Silicon, and they walk through where they're at current status on both mono and .NET six. Jeremy: [00:44:00] very cool. Scott: [00:44:01] Yeah, it is cool. It's cool. Because you get to like, actually see what we're doing, right? Like before you had to know somebody at Microsoft and then call your friend and be like, Hey, what's going on? We are literally doing the work in, open on GitHub. So if the, if the tech journalist would pay more attention to the actual commits, they would get like scoops weeks before an a blog post came out because you're like, I just saw that, you know, the check-in for, you know, they had an Apple Silicon show up in the .NET but-da-da. Jeremy: [00:44:26] Might get a little bit technical for the average journalist though. One of the things that you've been talking about is how with .NET Core, you have a lot of cross-platform capabilities. I wonder from an adoption standpoint, what you've seen, like are there people actually running their services and Linux and Docker, Kubernetes, that sort of thing. Scott: [00:44:48] Yeah .NET. Um, one of the things that we try to make sure that it works everywhere. So we've seen huge numbers, not only on Azure, but AWS has a very burgeoning, uh, .NET Community, as well as their SDKs all work on .NET. Same with Google cloud. we're trying to be really friendly and supportive of anyone because I want it to run everywhere. So for example, at our recent .NET conf, which is our little virtual conference that we make, we had Kelsey Hightower from Google cloud, like their main Google cloud Kubernetes person. Learn.net in a weekend, and then deploy to Kubernetes on Google cloud. We're seeing a huge uptick on .NET 5. So that unified version where we took.net core and the.net framework and unified them, it's the most quickly adopted .NET ever. I mentioned before that like 60% of the, commits are coming from outside the community. We're seeing big numbers, you know, stack overflow is moving to dot has moved to .NET core. They're actually showing their perf. And most of the games that you run are already running on .NET. And, you know, if you see a unity game, there's .NET and C# running in that a lot of the mobile games that you're seeing, um, one of the other things that we're noticing is that more and more students and young people are getting involved. Um, so we're really trying to put a lot of work into that. One of the other things that's kind of cool that we should talk about before the end is a thing called blazor. Have you heard about that? Jeremy: [00:46:07] Yeah. Maybe you could explain it to our audience. Scott: [00:46:09] So just like, um, if we use Ruby on rails, as an example, I'd like to use other frameworks as an example, like builds cool stuff on top of it. So it's like you pick your Ruby, whether it be JRuby or, and Ruby or whatever. And then you've got your framework on top of that and then other frameworks, and then you layer on your layer and your layer. .NET has this thing called ASP.NET active server pages. .NET. It's a stupid old name. You can make, web APIs that are JSON and XML based API APIs. You can make a standard model view controller type environments. You can do a thing called razor pages where you can basically mix and match C# and HTML. But if you want to do those higher level, enterprise applications where you just don't want to learn Java script. We've actually taken the CLR and compiled it to web assembly and we can run the CLR inside V8. But then if you actually hit F12 tools on a blazor page, laser is bla Zed, O R hit F 12. You go, and you hit a blazor page. You can see the DLLs, the actual dynamic link libraries coming over the wire. And then the intermediate language that we've been talking about so much on this podcast gets interpreted on the client side, which allows a whole new family of people. That already knows C#. Like those individuals that we spoke to spoke about at the beginning, you've got someone with 10 years experience in unity and someone else got 10 years experience on the backend. I never got around to learning JavaScript. They can write C# they can target the browser. And then the browser itself runs.net and then blazor allows them to create spa applications, single page applications, much like, you know, Gmail and Twitter and in PWAs, progressive web apps. So one could go and create an app. And rather than an electron kind of an app, which is using JavaScript and a whole host of tools, they can make a blazor app, have it run natively anywhere they can then pin it to their mobile phone or their taskbar on windows and they're there suddenly a web developer, and it's a really great environment for those big enterprise applications or those more line of business type apps, because the data binding is just so clean and then that backend can be anything. So blazor is a really great environment and a pretty interesting technical achievement to get .NET itself running within the context of the browser. Jeremy: [00:48:33] Does that mean that somehow you've been able to fit the .NET core runtime, in web assembly? I'm trying to understand how that works. Scott: [00:48:40] Yep. That's exactly. That's exactly right. So remember how we mentioned before that this isn't the multi gigabyte thing that everyone is so afraid of, you basically bring down. I think it's like 15 or 20 megs, which is, you know, a little bit big, but if you think about the homepage of the verge, it's not that big, right. I've seen PNGs bigger than the .net, uh, runtime. Uh, once you bring that down, of course, if you put that on a, on a CDN. that's handled. I think it's actually even smaller than that. It's not that big of a deal because once it's down, it's cached and that is really the .NET core, you know, mono runtime running within the context of WebAssembly and then you're just calling local browser things. But here's the cool part. Let's say you don't like any of that, let's say I've described this whole thing. You like the application model. I want to write C#. But you don't want to download and like put a lot of work into getting the local CPU and the local V8 implementation to do that. You can actually not run the .NET framework and the .NET core rather, pardon me on the client side, but you can have it run on the server side and then you open a socket between the front end and the backend. So then the events that occur. On the client side, get shuttled over web sockets to a stateful server. So there's blazor, which is client side, and then there's blazor server, which means it can run anywhere on any device. So then it's more of a programming model thing. You still don't have to write any JavaScript because you'll be addressing the Dom the document object model within the browser from the server side. The messaging then goes magically over the web sockets. You basically, you have a persistent channel bus running between the front end and the back end. So you get to choose. Jeremy: [00:50:28] yeah, we did a show on Phoenix live view, and it sounds like this is a very similar concept, except that you get the choice of having that run in the server or run locally in the client. Scott: [00:50:40] Yeah. And I think the folks on Ruby on rails are doing similar kinds of things. It's this idea of, you know, bringing HTML back, maybe getting a little less JavaScript heavy, and really thinking about shoving around islands of marked markup. Right. If I have a pretty basic, you know, master detail page and I click on something and I just want to send back a card. Is that really need to be a full blown react view? Angular application is the idea is you, you make a call to get some data, you ship the HTML back and you swap out a div. So it's trying to find a balance between, too much JavaScript and, uh, and too many post backs. And I think it, it hits a nice balance. Blazor is seeing huge pickup, particularly in people who have found JavaScript to be somewhat daunting. Jeremy: [00:51:27] I think we've been talking about things that are a little more cutting edge in terms of what's happening in the .NET ecosystem. Things like web assembly. what else are Microsoft's priorities for the future of .NET? Are you hoping to move into some more ahead of time compilation environments, For example, where you need something really small, that can run on IoT type devices. What's the future of .NET from your perspective? Scott: [00:51:57] All of the above. If you take a look at the microsoft .NET IOT, um, repository on github, we've got bindings for dozens and dozens and dozens of sensors and screens and things like that. We have partnerships with folks like PI top. So if you're teaching .NET, you can go in inside of your browser in a Jupiter kind of an environment, uh, go and, um, communicate with actual physical hardware and call GPIO general purpose IO directly. Uh, we're also partnering with folks like wilderness labs, which has a wonderful implementation of .NET Based on mono that allows you to write straight C# on a tiny micro processor, not a micro controller, pardon me uh, rather than a microprocessor and create, you know, commercial grade applications like, uh, you know, uh, uh, thermostat running in .NET, which again, takes those different people that I've mentioned before, who have familiarity with the ecosystem, but maybe not familiarity with the deployment and turns them into IOT developers. So there's a huge group of people that are thinking about stuff like that. We've also seen 60 frames a second games being written in the browser using blazor. So we are finding that it is performant enough to go and do you know, we've seen, you know, gradients and different games like that. Being implemented using the canvas, uh, with blazor server running it's, uh, at its heart, the intent is for it to be kind of a great environment for everyone anywhere. Um, and then if you go to the .NET website, which D O T.net dot.net, we can actually run it in the browser. And write the code and hit run so you can learn .NET without downloading anything. And then there's also some excitement around .NET interactive have notebooks, which allows you to use F# and C# in Jupiter notebooks and mix and match it with different languages. You could have JavaScript, D3JS, some SVG visualizations, using C# all in the browser entirely in the, uh, in the Jupiter notebooks IPYMB ecosystem. Jeremy: [00:54:00] So when you talk about, .NET Everywhere. It's really increasing the number of domains that you can use .NET at, and like be the person who maybe you built, uh, enterprise windows applications before, but now you're able to take those same skills and like you said, make a game or something like that. Scott: [00:54:20] Exactly. It doesn't mean that you can't or shouldn't be a polyglot, but the idea is that there's this really great ecosystem that already exists. Why shouldn't those people be enabled to have fun anywhere? I think that's a great note to end on. So for people who want to check out what you're up to, or learn more about .net, where should they head? Scott: [00:54:42] Well, they can find me just by Googling for Hanselman, Google, with being, if it makes you happy. Um, everything I've got is linked off of hanselman.com. And then if they're interested in learning more about .NET, they can go to just Google for .NET and you'll find dotnet.microsoft.com. Uh, within the download page, you'll see that we work on Mac, Linux, Docker, windows, and, um, if you already have visual studio code, just download the .NET SDK, go to the command line type `dotnet new console`, and open it up. And visual studio code VS code will automatically get the extensions that you need and you can start playing around. Feel free to ask me any questions. We've got a whole community section of our website, whether it be stack overflow or Reddit or discord or code newbies or tick tock, we are there and we want to make you successful. Jeremy: [00:55:29] Very cool. Well, Scott, thank you so much for chatting with me today. Scott: [00:55:34] Thank you. I appreciate your very deep questions and you clearly did your research and I appreciate that.
undefined
10 snips
Feb 25, 2021 • 58min

Distributed Systems and Careers with Shubheksha Jalan

Shubheksha is a Software Engineer at Apple. She previously worked at Monzo and Microsoft.Personal Links and Projects:@scribblingonLessons learnt in year three as a software engineerPersonal SiteSystems RecipesComputers IllustratedOther interviews about getting into open source:Code NewbieSoftware Engineering DailyRelated Links:Monzo Service GraphGo contextAn Illustrated Proof of the CAP theoremApache CassandraDistributed System Building Block: Flake IdsProtocol BuffersGo rpcConsulEnvoyWhat is a Runbook?Error Budgets ExplainedKubernetesGo by Example: ErrorsMusic by Crystal Cola.Jeremy: This is Jeremy Jung and you're listening to software sessions. Today, I'm talking to Shubheksha Jalan about working with distributed systems, the go programming language and some of the lessons she's learned being a software engineer for the last three years. Previously, she worked as a backend engineer at Monzo Bank. And before that she was a software engineer at Microsoft. Shubheksha, thank you so much for joining me today. Shubheksha: Thank you so much for having me.Jeremy: You picked up a lot of experience with distributed systems at Monzo bank. Could you start with explaining what a distributed system is?Shubheksha: Yeah. So the main premise and the main difference, between like, a distributed and a non distributed system, is that a distributed system has more than one node. And when I say a node, I mean just computers, machines, servers. So it's basically a network of interconnected nodes that is being asked to behave as a singular entity and present itself as, as something that's not really disjoint.And that's kind of where the trouble starts. something that you hear a lot of people who work with distributed systems say is that just don't do it unless you really really need to because like things spiral out of control very, very quickly. Because like, when you have a single machine, you don't have to think about like communication over the network.You don't have to think about like distributing storage. You don't have to think about so many other things, whereas with like, when you have multiple machines, you have to think about coordination. You have to think about communication and like at what levels and like what sort of reliability you need and all of that.So on and so forth. So it just gets super tricky.Jeremy: Yeah. And when we talk about distributed systems, you mentioned about how it's having multiple nodes. And I think a more traditional type of application is where you would expect to have a load balancer and some web servers behind the load balancer and having those talk to a database and those that involves multiple nodes or multiple systems. But, I guess in that case, would you consider that to be a distributed system or not really?Shubheksha: I think, yeah, like if something is going over the network and like, you have to sort of make those trade offs, I think I would still call it a distributed system. And like right now, like it's not just the big companies that are building and deploying distributed systems. It's pretty much everyone, like even small startups, especially like, because the cloud has become so dominant now, like the cloud is essentially a giant distributed system that we sort of take parts off and deploy our stuff, which is isolated from all of the other stuff that's also deployed on it. But like from the perspective of a cloud provider, it is a big big distributed system. And like, even when you, as a customer are using a cloud provider, you are, your code is deployed. You don't even know where, and it's still to you like if you spin up like multiple EC2 instances, for example, that is a distributed system.Jeremy: It's almost not so much that the different nodes are doing different things I guess it's just the fact that you have to deal with the network at all. And that's where all of these additional troubles or, or challenges come in.Shubheksha: Yeah, I think, yeah, that's probably a more accurate way to put it. Like, I can't really think of a system that you can like reasonably fit on a single machine right now. especially with the way, like we provision things, in the cloud. So yeah, by that definition, pretty much everything that we are building is a distributed system at some level.Jeremy: Mmm you were saying earlier about how don't build one if you don't need to. But now it almost sounds like everything is one, right?Shubheksha: Yeah. To some level. Yeah. Everything is one. Yeah.Jeremy: When people first start getting started with having a system that involves the network, what are some common mistakes you see people make?Shubheksha: Like a lot of people misunderstand the CAP theorem. The CAP theorem, is basically one of the most popular theorems in distributed systems literature and it states that like CAP stands for consistency, availability and network partitions.And there's a lot of debate around like what those terms actually mean. So I don't want to go into that right now because that'll probably take up the rest of our time. CAP theorem states that you can only have two out of those three things at a given point of time.And a lot of people sort of take that to heart and take it way too literally whereas there's a great blog post about this, which we should definitely link in the show notes, but which argues that like network partitions are not really optional. There's no way you can have a network that's a hundred percent reliable, like that's not humanly possible.And there's also like some confusion around like what the terms, availability and consistency mean, And that's also like a huge source of debate and confusion in the community. And like a lot of people when they, when they sort of start working. They don't really understand what that means in what context, because those terms are very, very overloaded in tech in general.And like one thing means like five different things depending on the context. So yeah, it can be really hard to get started.Jeremy: Mm, and since it's a little difficult to pin down, what CAP means, I guess, for somebody who's, who's starting out. what would you suggest they they focus on to make sure that when they build something, that it works, I guess is one way to put it.Shubheksha: I think that's, that's a very hard question to answer in like a generalized manner. It, it depends on your requirements and like, what is your end goal? What sort of like availability and reliability characteristics that you're looking for, from the system, for example, if it's a, if it's a system that's like sorting some critical data in that case, like consistency would be more valuable compared to variability, like it's a trade off.Like everything else. So it really depends on the use case and yeah, like what you hope to achieve with the system.Jeremy: Do you have a example from your experience where you, you chose those trade offs and could kind of walk us through that process?Shubheksha: Uh, no, I don't think most people would have to deal with this day to day. And if you do, it's probably not a very good sign.Like at Monzo, we had like a pretty unified platform. We did not have to like, think about this, at like, that level and go into it like super deep.Jeremy: If I understand correctly there are engineers who build infrastructure or build tooling so that the software engineers or the backend software engineers don't have to worry about is this going to be eventually consistent or is this being partitioned instead you have some kind of API that's layered on top of that. Is that correct?Shubheksha: Yeah. So you don't have to worry about like partitioning or anything like that. Like that's just taken care of. you just write a database schema, you deploy it and like data gets populated. You're not super worried about how that's done.Jeremy: Hmm. And when you give the example of a database schema, is this like a commercial product or an open source product? Or is this something built in house at Monzo?Shubheksha: Oh no. This is Cassandra.Jeremy: Oh Okay. Then it's the Cassandra database developers. They are the ones who had to think about the CAP theory and work out the trade offs that they wanted to decide on.And you, as the engineer, you just have to know how to use Cassandra.Shubheksha: Yeah. And like we deploy it ourselves as well. So it can get a little tricky there too. You can fine tune it, it has a lot of knobs. So you have to give consideration to that and configure it accordingly basically.Jeremy: I don't know if this applies to Cassandra specifically, but sometimes when people talk about distributed systems they talk about things like eventual consistency of how in your database, the latest change may not always be there when you go out to read it as a, as a backend software engineer. Is that something that you have to to think about and work around?Shubheksha: So that also depends to an extent on the configuration and the type of data. if you know some data might not be present when you're reading it, like you have to guard for it. But like a very common pattern that's used is like, if you don't find something in the database, then you just assume that it's not present at all.So like you try to read by a particular ID and if it's not present, then you create a new one. we don't have to deal with eventual consistency at every step. I would say like, I'm sure Cassandra does something in the background to deal with that. But yeah, like usually the assumption we make is that like, yeah, if it's not found in the database, then you just go and create it.Jeremy: Mm, okay. I guess another thing I would ask about is you have all these different nodes you're referring to in a distributed system, and they all need some way of communicating with each other. what are some, some good protocols or ways for different nodes to communicate?Shubheksha: something that I have been using at Monzo, or like I used to use at Monzo is a protobufs. It is a project that has been released by Google and it basically specifies how different services will communicate over the wire. So that sort of standardizes it and you have to still take care of like, some of the marshaling and unmarshaling.But yeah, like it does most of the, the job on its own because when you're communicating on like, suppose you have two different services deployed on two different machines and machines, and you need a way to make them talk to each other, you know, you basically need some sort of a language that both of them understand.So like if one is speaking French, and if one is speaking German, they can't really talk. So like, protobufs are sort of like both of them are talking in English and they can understand what they're saying.Jeremy: And protobuf is the serialization format. Right. So that's a way that you can get a message from one system to another. But how is the, the transport handled? Like, are you using web services or some kind of message bus? Shubheksha: Oh, yeah. So, that's mostly done over HTTP. Like wire remote procedure calls. Like, that's the most common thing I've seen you basically have some sort of a wrapper around, uh, like go's HTTP primitives to suit the needs of your system.Jeremy: Hmm. Okay. you have HTTP endpoints in each service and you send that service, a protobuf message,Shubheksha: Via a remote procedure call yeah.Jeremy: If you're using HTTP, each node or each service has to know where the other machine is or how to reach the other machines, how is that managed in a in a system? Shubheksha: that's sort of done, by some sort of service discovery mechanism, something that's super commonly used is Consul by HashiCorp. The job of that piece of software is to basically find out what service is deployed where. That's the other challenge with distributed systems, because you have so many things all over the place, you need to sort of keep track and have an inventory of like what is where so that if you get a request for a particular endpoint, you know where to send it.you can use like a service discovery, uh, tool like Consul or you can use something like Envoy, which is a network proxy, and which sort of helps you uh do something similar.Jeremy: And from the, the back engine engineers perspective, if you are using Envoy, um, or Consul or something like that, when you're writing out your end points or which HTTP endpoint you're going to talk to, what does that look like? Is there some central repository you go to and you just put that URL in and Consul or Envoy handles the rest?Shubheksha: Oh, no. So like as soon as you deploy your service, basically, the platform will be aware of it. And all of the manifests that are associated with your service will get deployed and Envoy will pick the changes up.For example, a new service, you need to make your proxy aware of it. So there will be some amount of configuration involved either you can do that by hand, or you can do that in an automated way.Jeremy: So it's, it's almost like, adding your service to the system, making it aware that it exists. rather than having to know exactly who you're talking to, that's all handled by, some kind of infrastructure team or by the product.Shubheksha: Yeah. So all of that, like, all of the platform components are basically deployed by a separate team.Jeremy: Hmm. Okay. If you are outside of that team, a lot of this complexity--Shubheksha: Is hidden away from you, yeah. And I have mixed feelings about that.I feel like as a backend engineer, I would still want to know how things work. And like, part of that is just because I'm a curious person by nature. But like part of it is I, I genuinely think like developers should know where their code is running and how it's running and like the different characteristics of it, like making it too opaque is not really helpful. And like, I think it's needed to be like a holistic all around engineer basically.Jeremy: Yeah, because I wonder how that impacts when there's a problem and you need to debug the problem. How do you determine whether it's something happening on the infrastructure side versus a bug in the code? That sort of thing.Shubheksha: Yeah. So that can be a tricky one. Like you have to go through like multiple layers of investigation to figure out at what level the problem is like you usually start with the code and when, like, and also depending on like your monitoring and alerting, like, if it's something really clear that like a node is down, then you know that yes, it is an infrastructure problem.But if like the error rate is high. Then it is very likely to be a code problem, but yeah, like in some cases it can be very difficult to figure out like, what is actually going on and like, what is the symptom and what is the cause.Jeremy: And in the context of a distributed system, are there specific tools or strategies you would use to try and figure out what's going on?Shubheksha: So this really depends on the system and like how well it's monitored. Like. my main takeaway was that like observability and monitoring should be a first class citizen of the design process, rather than an afterthought that you just add in, after you're done building the entire system, like that goes a long, long way in helping people debug because like, as, as a team grows or as a company grows and as the system grows, it can get so unwieldy that it, it can be super hard to keep track of what's going on and what, what is being added where. That's one of my main, main takeaways that yes, you should always think about observability, right from the start rather than treating it like an afterthought.And in terms of like investigation, I'm not really sure if there are like specific tactics I would use. Like that's just something. Like you watch and learn other people do. And like it's so specific to the technologies and the kind of platform that you're dealing with. But yeah, like the best thing I have found about like incidents and like on call and all of that is that you just, you have to watch a lot of people who are really good at it. Do it again and again, and again, ask them questions and like, see what workflows and mental models they have and like the kind of patterns that they have built over time. And just go from there. It starts to makes sense after a while like, initially it just seems like magic you just keep staring and you're like, how did this person know that, you know, this is where they had to look. And like that's where the bug was. I like to think about it as like, you just have to form mental models. Like the more familiar you are with something the stronger your mental models are. And like you have to update them. with time with changes, all sorts of things and they just stick after a while.Jeremy: Do you have any specific examples of a time you were trying to debug something and some of the steps that you took and trying to resolve it?Shubheksha: Oh yeah, after like I was, I used to be on the, on call rotation at Monzo. So like I was involved in lots of incidents and like, usually we had pretty decent run books, about like, what to do when a spe specific alert went off.So we used to just usually follow the steps and if we were like completely out of our depth, then try a bunch of things. Throw darts randomly and see what sticks. Like sometimes you just don't know what else you can do. Like every outage is not like super straightforward, super simple. You have to like, look around, hit at like five different things and like see where it hurts basically. So yeah.Jeremy: And in terms of the run books you were mentioning, would that be a case where there was a problem previously and then somebody, documented how they fixed the problems, then you would address that the same way next time?Shubheksha: Usually yes, usually yes. Or something like if like something was being tested and there's a very clear process to like, test whether it's working or not. Something like that would be documented as a runbook as well.Jeremy: Can you give us a little bit of a picture of specifics in terms of, are you SSH going into machines or are you looking at a bunch of log files? What, what is, uh, that troubleshooting process often look like?Shubheksha: I'm not sure if I'm supposed to talk about this, especially because I'm not at Monzo anymore, so yeah, I'd be a little cautious talking about that right now.Jeremy: Okay. That's fair. You had mentioned observability being really important before. what are some examples of things that you can add to your services or add to your infrastructure to make this debugging process a lot easier?Shubheksha: So one of the main things that, uh, I think work great in practice is error budgets. Having some sort of definition of this is what is acceptable in terms of a particular service or a particular system returning errors. And like, if it crosses that threshold, then it's like, yes, this is our priority. We need to fix it. Like a lot of the SRE (Site Reliability Engineering) work can not be postponed indefinitely, which is what happens in a lot of companies. That are trying to do SRE but badly. So, yeah, so like that's something that I really like as a philosophy. in terms of like monitoring, I think what I'm trying to get at, when I say that like monitoring should be a first class citizen, is that you should be very clear about the output that you expect from the system or what does healthy, not healthy look like for your system, because if you have that, then you can derive metrics from it and vice versa. If you're not able to come up with good metrics for your system, then have you looked closely enough at like what the system is trying to achieve?Jeremy: Yeah. I think that idea of error budgets is really interesting because I think a common problem people have is they'll log everything and there will be errors that happen, but people just expect it right? They go, Oh, it's flashing warnings, but don't worry about it. It normalizes it, but nobody really knows when it's an actual problem.Shubheksha: Yup. Yeah. Like, yeah. So basically yeah if you're logging everything and there are errors that are expected errors then yeah how do you differentiate whether an error is an expected error or it's an unexpected error like that's just a slippery slope that you don't really want to tread on.And people get used to it. Like if a service keeps spewing errors and nobody really cares then if there's an actual outage, it's very easy to dismiss that yeah. that's what this service does and we have seen it before and it wasn't a problem. So yeah. It's probably fine.Jeremy: Like you were saying, it's observing what you expect the system to do. So I guess, would that just be looking at. a normal day and looking at your error rate and seeing if people are able to successfully complete whatever they're trying to do on the system. and then if you actually start receiving, I guess, trouble tickets or you receive complaints from customers, that's when you can determine oh this error rate being this percentage and people complaining at this time means that this is probably a real issue. Something like that?Shubheksha: Yeah, that's a very tricky question to answer because like, yeah, it's, it's hard to know what the right error budget is. Like, that's a much harder question to answer then, you know, just having an error budget. So yeah, like initially, like it it's just playing around and seeing like what sort of throughput the system has, what's the expected load and like all of those things and coming up with some metric and like tweaking it over time as you learn more about the system.Jeremy: And with these systems, I'm assuming there are a lot of different nodes, a lot of different services. How do you debug or kind of work with this sort of system on your own local machine? Like, are you spinning up all these different services and having them talk or do you have some kind of separate testing environment? What does that look like?Shubheksha: Yeah. Usually there is a separate testing or staging environment, which tries to mimic the production environment. And on your local computer, you can usually spin up a bunch of containers. Obviously it's nowhere close to like having an actual copy of the production environment, but for simple enough changes it can work.But usually there are like testing environment, which will give you most of the same capabilities as a production environment, where you can test changes, but it's like the other problem is keeping them in sync. That's also a really, really hard problem. So a lot of times like staging environments exist, but like even when you're deploying to production after testing and staging, it's like fingers crossed.I don't really know what's going to happen. We'll just deploy and see what breaks. Because yeah, the environments can divert quite a bit.Jeremy: In terms of the staging environment, would that be where you have a bunch of fake data that's similar to production and then you have tooling to have transactions that would normally occur in production and staging?Shubheksha: So like mimicking stuff as much as possible to whatever extent. Yeah.Jeremy: How would engineers coordinate if you have this staging environment, I don't imagine everybody has their own.Shubheksha: Yeah. That can get that can get tricky as well, because like people can override changes and like if they they're testing at the same time and they don't know, and like they're deploying on the same, the same service, so that's one benefit of microservices that like, if your service is small enough that multiple people will not be working on it at the same time, then you sort of reduce that contention.But yeah, like if there are multiple people who are working on the same service, then they have to coordinate somehow to just figure out who's using the environment when and like testing stuff. So the other thing is like having some, a small subset of the staging environment given to like every single engineer, but like, yeah, that's, that's obviously not very simple to achieve, especially if you have like an engineering organization that is growing quite a bit.Jeremy: How large was the engineering team at Monzo?Shubheksha: I think it was about 150 engineers.Jeremy: Hmm, 150. Okay. So in your experience, when you were working on a service, were you able to work on it in isolation where you would work out what the contract should be between services. So you didn't actually need to talk to the whole system for a lot of the time? Or what did that look like?Shubheksha: I think most of the time it was like a single person working on a single service. And yeah, if you do, need to work on the same service with someone else, and usually it was, it was never more than two people in my experience then yeah you just, you coordinate with them and just give them a heads up that yes, you're doing this and you'll be deploying your changes. Jeremy: And your service most likely is talking to other services. So would that be a part of this design process where you work out what the contract should be and that sort of thing?Shubheksha: Ah, so usually that was pretty straightforward. Like it was a simple RPC. So you do not have to think about it too much. Like you just know that service A will talk to service B to fetch some data XYZ. Jeremy: So I guess before you started working on the service, you might provide to the other team, or whoever owns that other service here is the protobuf message that I intend to receive and that sort of thing.Shubheksha: Yes. So mostly like if like a service A talking to service B. Service B will have its own handlers and like its own, proto files and all of that. And I just reuse that in my service to talk to service A.Jeremy: And then likewise, if somebody was talking to your service, you would provide them your proto files or schema.Shubheksha: Yeah.Jeremy: Another thing about having all these different services is I would imagine that they are all being updated at different times.What are some of the challenges or things that you have to look out for when you're updating services in a distributed system?Shubheksha: The biggest problem is dependencies what depends on what, like the. Dependency graph can be pretty difficult to sort of map out, especially if you have like a really large sprawling system with like dependencies all over the place. If you're deploying, like 4 services, let's say, and like A depends on B, which depends on C. That's a problem in any distributed system. Like if you have like especially something modular, like just tracking the dependencies in that entire system can be a huge task. And like, it can inform a lot of other decisions, like in what order do you deploy things?And like, if you want to remove something, like, it just leads to this rabbit hole because like, yeah, if you want to delete a service, you need to first figured out, like if it's if there's anything else that's depending on it, otherwise a bunch of things will break and you won't really realize that.Jeremy: And how do you, keep track of that. Is there some kind of chart or some kind of document? Like how do people know what they can change?Shubheksha: A lot of it is just in people's heads. Like you talk to people who have worked on that part that you're working on. And like, they'll usually be pretty familiar with like the sort of dependencies that exist. I think. We tried to like map out all of the dependencies at once and like someone posted about this on Twitter as well.But like, they actually tried to create a dependency graph of like all the services that we had and yeah, it was not a pretty picture,Jeremy: Hmm. Because how many services are we talking about here? Is it like hundredsShubheksha: 1500Jeremy: 1500 plus. Wow. Shubheksha: Yep. Jeremy: That's pretty wild because basically you're relying on maybe going into email or chat and going like, Hey, I'm going to update this service. Uh, which ones might this break?Shubheksha: Yeah. So that can get tricky and that does make like onboarding people a little bit harder. Like there are there are trade offs, both good and bad ones when you have a system like that. It's not all bad. It's not all good.Jeremy: 1500 is-- it seems like nobody at the company could even really know exactly what's happening at that scale. I wonder from your perspective, how do you decide when something should be split off you know into an individual service versus having a service take care of more things?Shubheksha: Yeah, so that was usually pretty scoped, We try to let a service do one thing and do it well, rather than trying to like, put a bunch of functionality into the same thing. So that, that was usually the rule of thumb that we followed for every new service that we designed.Jeremy: Hm. Cause it seems like a bit of a trade off in that. If there's a problem with that one service, then you have a very localized place on where to work on. But on the other hand, like you said, if you have this dependency graph, you might need to step through 10, 20, 30 services to find the root problem.Shubheksha: Yeah, there were definitely code paths that can get that big and that long. So yeah.Jeremy: In traditional systems, we think of concepts like transactions with a relational database. For example, you may want to update a bunch of different records and have that all complete as one action. When, when you're talking about a distributed system and you need to talk to 5, 10, 20, however many services, how do transactions work in distributed systems?Shubheksha: I'm very tempted to say they don't (laughs). Uh, so yeah, so this is, a really really interesting field within itself, like within distributed systems itself and transactions like distributed transactions are really really hard, like transactions themselves are super hard, even on a single machine. And then you like add all of this complexity and you're just like, yeah galaxy brain (laughs) . I'm not super familiar with this topic, but I've read a bit because it's just like super fascinating. And like, usually you try to offload all of that. One way you can do it is via a managed service. You don't have to care like where your database is running, how it's running, you just use APIs, you throw stuff at it and you know, it's going to be stored.There's a bunch of fine tuning and configuration you can do yes. But yeah, you don't know how to think about like the nitty gritty of it. Distributed transactions, like there's also different definitions for it. Like what do you mean? I mean, when you say a distributed transaction, like, a transaction, that's executing on multiple machines or transaction, that's like gathering data from multiple machines before returning it to you? A transaction that's I don't know, halted to be executed on different machines? So yeah, it can get really complicated.Jeremy: Yeah, I'm thinking from the perspective of Monzo is a bank, right? So, you might pull money out of an account and transfer it to somebody else like a very basic example. And maybe as a part of that process, you need to talk to 10 different services and let's say on the sixth service call, it fails, but those previous five have already executed whatever they're going to execute.Shubheksha: Ah, right. I see what you mean. Do we roll all of that back? Or like, do we, yeah, so like we did not roll stuff back. But yeah, like you have to account for that in your error handling logic that what happens if it fails at this step and like XYZ has already happened?Jeremy: Could you give like an example of one way you might handle something like that?Shubheksha: Uh, let me think. I'm not sure. I possibly dealt with a situation like that because like all of the tables were scoped to the service that you were working on. So like nobody else, like no other service can access the tables that you had within your service. So that's simplified a lot of things. And usually there was a lot of correlation by IDs.So like if you're generating flake IDs in one service and it fails and it doesn't generate an id, then it will not be found. And you know something has gone wrong. So you basically just like log and bail and stop proceeding. But obviously around the parts that move money we had a lot more robust error handling around that. Yeah.Jeremy: Yeah. So I guess that's one thing that could help you track, as you were saying, having an ID that passes from service to service. So,Shubheksha: So that's usually accomplished by go's context package. So you have some sort of a trace ID that's like passed through the lifetime of a request all the way down. Uh, so you know, that yes, like all of those calls were part of the same request.Jeremy: Hmm. Okay. And then that would mean that you could go to some kind of logging or monitoring service and actually see, like, these are the five services that we're trying to handle it, and this was the results. I guess there's like two parts there's there's figuring out which services and which calls were part of the same transaction.And then like you were saying earlier if something goes wrong, how do we correct for it? And yeah, that sounds like a very, very challenging problem (laughs).Shubheksha: Yeah. Yeah. Like one of the main problems with distributed systems is when things go wrong, you don't know what's going to break where. Like a lot of things break and then you just have to like start searching for, okay, what has gone wrong, where, and like something completely different in a completely different part of the system might just be like breaking something else in a completely different part of the system. And like, as your system grows, it's impossible to keep track of all of it in your head, especially like a single person cannot do that.So it can just be really hard to like, have a big picture view of the system and figure out what's going on, whereJeremy: Mm, and it sounded like in your experience that that knowledge of what was going, where and how things interacted. A lot of that was sort of tribal in a way. Like there were people who, who knew how the pieces fit together. Um, but it's tough to really have this document or this source of truth.Shubheksha: YeahJeremy: Yeah, I can imagine that being like you were saying earlier, very difficult in terms of onboarding. Yeah.Shubheksha: Yep.Jeremy: I guess another thing about when you have distributed systems and you have all these different services. Presumably one of the benefits is so that you can scale up certain parts of the system.I wonder from your perspective, how do you kind of figure out which parts of the systems need to be scaled up and what are some strategies for, if you're really hitting barriers, a performance in a certain part of the system, how you deal with that?Shubheksha: Usually metrics. If you're constantly getting alerts that yes, like a particular node or like a particular part on a particular node is constantly hitting its memory and CPU usage, or, you know, you're going to be performing some sort of like heavy workload on a particular service and you scale it up preemptively or like an alert fires, and then you scale it up.So that's like a very manual way to go about it. Uh, and in terms of performance constraints. So a lot of that was handled by our infrastructure team and I personally did not work on like fine tuning uh performance as such on any of the parts of the system that I worked on. But like what I saw other people doing, was that like, you have to go on chasing, profiling, and figuring out where that bottleneck is.And like, sometimes it can just be your bug. That's like, like you're leaking goroutines or something like that, or like you're leaking memory somewhere else. So stuff like that, like usually you just, you have to keep profiling till you figure what's going on and then fix it.Jeremy: And for you as the person who's building the services and handing them off to the infrastructure team, was there any contract or specification written up where you said like, Hey, my service is going to do this. And I think it's going to be able to process this many messages or this many transactions a second.And then that's how the infrastructure team can figure out how much to scale or how did that work?Shubheksha: Yeah. So there was some amount of load testing that was done for like services that were really, really high throughput with the infrastructure team so that they like were in the loop and like they knew what was going on and they can scale up preemptively.Jeremy: In your experience as a backend engineer, we've talked about the infrastructure team. Are there any other teams that you would find yourself working on a regular basis?Shubheksha: Not really, no, it was mostly backend and and infrastructure. And in some cases security as well. If you needed like a particular like their input on something particular that we were working with.Jeremy: Would they provide some kind of audit to tell you, like, this is where potential issues might be or?Shubheksha: Yeah. Like something like that, or like, if you just need their input on like a particular security strategy, like if you wanted to like store some, critical data or something like that, how do we do that? Or like, what would be the best way to go about it so stuff like that.Jeremy: During your time at Monzo, you were writing systems in, in Go What do you think makes Go a good choice for distributed systems or for the type of work you were doing in general?Shubheksha: One of the main things that I like about it is that it's super easy to pick up. This was my first job with Go like first, full time job. I was learning Go a little before that, but I was able to pick up very, very quickly. I think what attracted people to Go like, this was definitely a huge part of it.Then the fact that it was backed by Google and, and it wasn't going to like just vanish overnight was also really helpful. And I think, it really started from docker and docker sort of shot into the limelight and it sort of made it the defacto language of cloud native infrastructure and it just kept catching on.And then Kubernetes came along and then because Kubernetes was written in Go, everything else started to be written in Go. It has a lot of neat features that I think make it uh really really, easy to sort of start building stuff with it. One of them is definitely like how good the standard libraries are and how easy it is to sort of build on top of them.And like first-class support for concurrency, which can be a huge, huge pain to sort of implement on its own and like a lot of languages don't support those primitives out of the box. Like it's really good for those sorts of use cases as well. And yeah, it's just easy and fun to learn.Jeremy: So basically it was easy for you to step in as a new employee, and start learning how services work just because the language was easy to pick up. um, and yeah, that, that built in support for concurrency. I wonder, are there any things compared to your experience with other languages where you, that you didn't like about go.Shubheksha: Mmm, I think, yeah, this is, this is like a very common and controversial opinion, uh, in the go community, but like sometimes it can feel very repetitive like, especially the error handling bits where you're like constantly checking for errors. I can appreciate explicit error handlings but like, yeah, sometimes it can just feel like you're writing the same block of code, like 50 times in the same file.And that can be a little bit annoying. It's super verbose. Like it just, verbose, not in the Java sense, but more in the, you should know what you're doing sense, like, it tries to be as simple as possible rather than being as clever as possible and making things harder to understand Jeremy: And the the error example you were giving, I'm assuming that's because go doesn't have exceptions. Is that right?Shubheksha: Yes, Go does not have a concept of exceptions at all. you're supposed to do explicitly check for errors every single step and then decide what you're going to do. So in a way, it sort of makes you think harder about like the ways in which your code fails and what you're supposed to do after that, which is a good thing.Jeremy: Right in some cases you have a very deeply nested call stack, And you might be accustomed to throwing an exception deep and at the, outer level, just handling that. Um, but with Go at every single level of the stack. You would need to check the error.What do I want to return or, yeah okay.Shubheksha: So you have to bubble the error up. You need to make sure you're bubbling the right error up. You have to augment it with all of the information and the metadata that you need, which will help you debug the problem no matter at what layer it occurred at so that can definitely be sometimes tricky.Jeremy: Yeah, I can definitely see, um, I guess why it'd be good and bad. Maybe we should move on to you recently wrote a post about the lessons you learned in your third year as being a software engineer. I thought we could go through a few of those points and maybe, get into where you're coming from with those. I think the first point you had mentioned was just doing things like working on projects and doing the work isn't enough. You have to know how to, how to market it as well.Shubheksha: Yeah.Jeremy: You had talked about marketing to the right people. are you talking about people who are internal at your company or outside and, and how do you find what you consider to be the right people?Shubheksha: Oh, that's a really good point. I think it's a bit of both like internally in terms of like progression and like the stakeholders of your project and all of the other people that you work on on a day to day basis. And externally just to build a network and like, let people know what you're working on and what you're learning.Like I'm a huge fan of like learning in public in general. That's just something that brings a lot of good things your way if you do it right.Jeremy: Things like blog posts or conference talks, things like that. Another thing you mentioned was how, how titles matter. And I know that there's a lot of discussion where people say like, Oh, it doesn't really matter. But, um, you were saying like, they really do matter. And. from your perspective, how dod the roles of, of titles change from, from company to company?Shubheksha: I think it. It really depends on the size of the company as well. Uh, that's one of the main factors, like how well established the company is, how many engineers do they have? Like how much do they like place weight on the title internally? Like there are companies where like there are meetings, which are exclusively for people who have a certain level or a certain title.So in that case, like, it's very obvious that if you don't get promoted then you get you're stuck, you're losing out because you're literally not allowed to attend. A certain set of meetings that you would probably benefit from. And in, in like in other ways it can also like halt your progression. And when I say titles matter, I don't say that.Like, just because you're a senior engineer, you know, more I don't believe in like, just number of, years of experience, it's about the quality of the work that you have done more than like just the sheer amount of time that you've put in. But. In terms of like how people perceive you, what they expect from you and what they assume about you definitely shifts when you have like a title attached to your name.Jeremy: And in your career or just in people's careers in general, how much, weight or priority should people put into making sure that they, they have like a. impressive title or a high title. especially when you think about, you were saying how it changes between the size of the organization.Um, somebody who's a CTO at a three person company it's not the same as at a 5,000 person company. I wonder what your thoughts are on that.Shubheksha: Uh, yeah, that's a really good question. I think like what I mentioned in the next point is that it, it's very easy to get caught up in that, in the climb. Like when it comes to career ladders, but that can also be very detrimental and that relates directly to the fact that it's not just years of experience, it's actually the quality of those years of experience that actually counts towards like how good you are.If you, if you run after two months, it can just be like a second full time job. It can completely drain you. You'll stop learning, you stop actually doing what you enjoy. And you'll just blindly chase a title. Like that's not a good position to be in because like, you have to balance it out. At the same time if you feel like you're just getting stuck, you're not getting promoted. You're not getting to the level you want to be at. It might just be time to like, change. And see what's out there, and if there's someone who will treat you better, but like, it really depends on what you value more because a title change can bring you better job prospects, especially if it's a change from like just software engineer to senior software engineer, then you just have to much wider pool of roles available to you. So it really depends on the stage of the career you're at. And what do you value personally?Jeremy: In your posts, you had talked about chasing promotions, What are some of the things that you might find yourself doing that are more going for the promotion rather than growing yourself? technically or, you know, skill wise?Shubheksha: That depends on what your company values and what do they base promotions on? So like if they have a very rigid structure where it's like a bunch of check boxes that you have to tick, then a lot of people will try to game that system. And they'll just try to do things maybe at the expense of their growth, as well as the expense of what's good for the company just to get ahead and get promoted.So that's not beneficial for the employee or the company. But yeah, like a lot of, frameworks that are like of that fashion, where you have to check a bunch of boxes in order to get promoted often like end up in people doing things that they are being incentivized to, but in a very wrong way,Jeremy: Right. So I guess it's looking at what your organization's process is and figuring out is it, you know, some of it is not the stuff you want to do, but you just kind of get it done and you can move on or whether the process it's so, difficult.And, and so, um, um broken, yeah, basically, uh, that it's worth just moving on to another, another organization. Your, your next point, I think was about, uh, how sponsors are so important. And I wonder if you could explain a little bit about what you mean by a sponsor versus, a mentor or somebody else?Shubheksha: Yeah. So the difference between uh sponsors and mentors is that I think mentors, the way I think about it at least is that mentors are very passive in the way they help you vs sponsors are very active. Like mentors will be there for you. They will help you. You can go ask them questions, whereas with sponsors, they will actively put your name in the hat for opportunities help you connect with people and like send opportunities that they think are a good fit your way, and they will be your advocate.Rather than you having to ask your mentor to be your advocate. So I think that's, that's the main difference. And like, there's also the question of sort of like the sponsor's reputation being at stake in a certain sense that they, they basically take a chance on you and they trust you enough to put their reputation on the line in order to do right and good by you, which is not the case with mentors.Jeremy: It sounds like a sponsor would be someone who is vouching for you, right. Or is giving you recommendations. Shubheksha: Yes, essentially. Yeah. Someone who watches for you in like rooms either you're not in, is familiar with your work, actively advocates for it.Jeremy: Hmm. And that could be someone within your organization, for example, saying, I think that you would be a great fit for this project and telling other managers about that, or it could even be someone outside of your work, just saying if you're looking for a position or something, they might say, Oh, I know somebody who would be a great fit for this conference talk or this job, that, that sort of thing.Shubheksha: Precisely. Yeah.Jeremy: Do you have any specific advice for how people should.. I don't know if seek out is the right word but to build up a network of sponsors,Shubheksha: I get that a lot. And I have a blog post in the works for it, which I'm trying to think about, but like, it's really hard to put in words, like it's just like a lot of networking and like reaching out to people and like slowly and steadily sort of surrounding yourself with people that you can admire and look up to and who will be like willing to vouch for you.So like there's no direct, straight method to do it. I can't prescribe like if you follow step one, two, three, that, yes, you'll definitely have it. I think it's just, it just takes time on a lot of effort and patience, because I remember at like, when I started, like, I was desperately looking for mentors and I was just like reaching out to people and I just felt like I could do so much if I just needed the guidance and it was really really hard to find.So yeah, like I completely empathize with like people who are struggling with it at the moment. And it's, it's really hard. A lot of people are like super busy. They don't have time. And so it's it can just feel like bad to sort of ask for their time as well. So, yeah, that's also something I definitely struggle with, but like don't have concrete steps, at the moment, will, hopefully have something to say and publish it in a blog post.Jeremy: Yeah. I mean, I wonder from your personal experience, I remember a few years ago you went on a few podcasts to talk about your experience, getting into open source and things like that.You know, looking back on the process you went through, is that a path that you would recommend to other people, or do you have thoughts on how you might have approached that differently now?Shubheksha: I think I got very lucky at like some steps. Like I did not set out I did not plan it out uh that way it just sort of happened. And yeah. So like, I don't think I would take a different path. Like I like along the way, especially via open source like I met so many great people who have been amazing mentors and sponsors for me, but like I did not seek them out.The, the other thing that I realized was like, you can't meet people and be like, Oh, you're my mentor. Like, that's a relationship. That's built over time as you learn from each other. And the other thing is, yeah, it's a two way relationship. It's not just you leeching off the other person and like trying to, you know, like all, like take away all of their knowledge and like sort of embedded it in your brain.That's not how it works. Like even if you have less knowledge compared to the other person, they can still learn something from you and like, it's always good to have that mindset rather than, you know, the hero mindset where you're like, Oh my God, this person is, is God. And like, I want to learn everything from them like that doesn't help.Like placing people on a pedestal, basically. Like it eases the pressure on them as well. And it helps you be more comfortable too. So like a lot of my mentor, like sponsor relationships are like that where we have a bunch of like mutual interests and we talk about it and we learn from each other. So like a lot of them are like, not even like formalized, like we never, like had a conversation where we were like, Oh, like, we are like a mentor and a mentee now, like a sponsor and a sponsee now, like yeah, you just, you sort of build those kind of bonds with people and where they feel comfortable watching for you.And you just take it from there.Jeremy: That makes a lot of sense just because relationships in general are very much, it's not you meet somebody and you go, we are going to be friends, right? Shubheksha: It just takes time. Yeah.Jeremy: Yeah. So I can see how it'd be too difficult to pin down what are the steps you should take? I'm looking forward to the blog posts though.Shubheksha: Yeah. Thank you.Jeremy: I think the last thing you had mentioned in the post was you were talking about how programming got easier over time. And I wonder, were there any specific milestones you remember of things that were really hard and then it, all of a sudden, it just clicked over the yearsShubheksha: I think uh one of the main things I remember was like, just on designing containers and how they work. Like it was literally like a light bulb moment with where like I just, I just felt like, yes, I finally understand what it is. And like, there have been times over the years where I have felt like that without like actively trying to something that I've experienced very frequently is like, I learn about something.Like say right now and I'll understand maybe like 40% of it. And then I'll come back and revisit it, like from another topic or like an adjacent topic. And I finally understand it and maybe I've been like working with it or around it in the intervening months, but like, it's not like I sat down and I tried to like, learn about it or anything like that.But like when you're understanding builds up. During the intervening time, you sort of realize that you've not only learned the thing that you were actually learning, but a lot of adjacent concepts also makes sense. And it can be very hard to have this perspective when you're starting out because you're just like, none of this makes sense, like what is going on.But yeah, it, like your understanding sort of grows and compounds in a similar way to money, I would say. And it's very hard to remember that when you're just starting out.Jeremy: Yeah, I can definitely relate to that where I think it's, it's very difficult. And I think a lot of people want to be able to sit down and read a book or go through a course and go like, okay, I now know this thing. Uh, but I have the same experience with you where, uh, it's just working on problems that are not even hitting.Always the main thing, but it's related to it and all of a sudden it might suddenly click. Yeah.Shubheksha: Yeah.Jeremy: Cool. Well, I think that's a good place to wrap up, but are there any other things you, uh, you'd like to mention or think we should have talked about.Shubheksha: No, I think we covered a lot of ground and it was really fun chat thank you so much.Jeremy: Cool. where can people follow you online to see what you're up to and what you're you're going to be doing next?Shubheksha: they can follow me on Twitter. Uh, that's where I'm most active. Uh, my handle is scribblingon, I have a website which is shubheksha dot com and that's where I post all of my writing.Jeremy: Cool. Shubheksha, thank you so much for joining me today.Shubheksha: Thank you so much.
undefined
Dec 17, 2020 • 1h 9min

React Authentication with Ryan Chenkie

Ryan is the author of Advanced React Security Patterns and Securing Angular Applications. He picked up his authentication expertise working at Auth0. Currently, he's a GraphQL developer advocate at Prisma. Related Links: @ryanchenkieReact SecurityStop using JWTs for SessionsWhat are cookies and sessions?Learn how HTTP Cookies workJSON Web Token IntroductionRefresh Tokens: When to Use Them and How They Interact with JWTsCritical Vulnerabilities in JSON Web Token LibrariesRS256 vs HS256: What's the difference? Music by Crystal Cola. Transcript You can help edit this transcript on GitHub. Jeremy: [00:00:00] Today, I'm talking to Ryan Chenkie. He used to work for Auth0 and he has a new course out called advanced react security patterns. I think authentication is something that often doesn't get included in tutorials, or if it's in there, there's a disclaimer that says, this isn't how you do it in production. So I thought it would be good to have Ryan on and maybe walk us through what are the different options for authentication and, how do we do it right? Thanks for joining me today, Ryan. Ryan: [00:00:30] Yeah, glad to be here. Thanks for inviting me. Jeremy: [00:00:33] When I first started doing development, my experience was with server rendered applications. like building an app with rails or something like that. Ryan: [00:00:42] Yep. Jeremy: [00:00:42] And the type of authentication I was familiar with there was based on tokens and cookies and sessions. And I wonder if for people who aren't familiar, if you could just walk us through at sort of a basic level, how that approach works. Ryan: [00:01:01] Sure. Yeah. so for those who have used the internet for, for a long time, the web, I should say for a long time, And who might be familiar with like the typical round trip application, which, which many of these still exist. A round trip application is one where every move you make through the application or the website is a trip to the server that's responsible for rendering the HTML for a page for you to see and on the server things are done. Data is looked up and HTML is constructed which is then sent back to the clients to display on the page. That approach is in contrast to what we see quite often these days, which is like the single page application experience, where if we're writing a react or an angular or Vue application it's not always the case that we're doing full single page, but in a lot of cases, we're doing full single page applications where all of the JavaScript and HTML and CSS needed for the thing gets downloaded initially, or maybe in chunks as movements are made. But let's say initially is downloaded and then interacting with data from the server is happening via XHR requests that that go to the server and you get some data back. So it's different than the typical round trip approach, In the roundtrip approach. And what's historically done, what is still very common to do and it's still a very legit way to do auth is you'll have your user login. So username and password go into the box. They hit submit. And if everything checks out, you'll have a cookie be sent back to the browser, which lines up via some kind of ID with a session that gets created on the server. And a session is an in memory kind of piece of data. It could be in memory on the server. It can be, in some kind of like store, like a Redis store, for example, some kind of key value store, could be any of those things. And the session just points to a user record, or it holds some user data or something. And it's purpose is that when subsequent requests go to the server, maybe for new data or for a new page or whatever, that cookie that was sent back when the user initially logged in is automatically going to go to the server. That's just by virtue of how browsers work cookies go to the place from which they came automatically. That is how the browser behaves. The cookie will end up at the server. It will try to line itself up with a session that exists on the server. And if that is there then the user can be considered authenticated and they can get to the page they're looking for or get the data, the data that they're looking for. That's the typical setup and it's still very common to do. It's a very valid approach. Even with single page applications. That's still a super valid approach and some people recommend that that's what you do, but there are other approaches too these days. There are other ways to accomplish this kind of authentication thing that we need to do, which I suspect maybe you want to get into next, but you tell me if we want to, if we want to go there. Jeremy: [00:04:03] You've mentioned how a lot of sites still use this cookie approach, this session approach, and yet I've noticed when you try to look up tutorials for working on react applications or for SPAs and things like that. I've actually found it uncommon, at least, it seems like usually people are talking about JSON web tokens that are not the cookie and session approach. I wonder, if you had some thoughts on, on why that was. Ryan: [00:04:37] Yeah, it's an interesting question. And I've thought a lot about this. I think what it comes down to is that especially for front end developers who might not be super interested in the backends, or maybe they're not concerned with it. Maybe they're not working on it, but they need to get some interaction with a backend going. It's a little bit of a showstopper, perhaps, maybe not a showstopper. It's a hindrance. There are road blocks put in place if you start to introduce the concept of cookies and sessions because it then necessitates that they delve a little bit deep into the server. I think for front end developers who want to make UIs and want to make things look nice on the clients but they need some kind of way to do authentication. If you go with the JSON web token path, it can be a lot easier to get done what you need to get done. If you use JSON web tokens maybe you're using some third party API or something. And they've got a way for you to get a token, to get access to the resources there. It becomes really simple for you to retrieve that token based on some kind of authentication flow, and then send it back to that API on requests that you make from your application and all that's really needed there is for you to modify the way that requests go out. Maybe you're using fetch or axios or whatever you just need to modify those requests such that you've got a specific header going out in those requests. So it's not that bad of a time, but if you're dealing with cookies and sessions, then you've got to get into session management on the server. You've got to really get into the server concepts in some way if you're doing cookies and sessions. I think it's just more attractive to use something like JSON web tokens, even though when I think when you get into, like, if you look at companies that have apps in production and, and like, especially like with organizations that have lots of applications, maybe internally, for example. And they're probably doing single sign-on. Maybe they've got tons of React applications, but if they're doing single sign-on, especially there's going to be some kind of cookie and session story going on there. So yeah, I think all that to say ease of use is probably a good reason why in tutorials, you don't see cookies and sessions, all that often. Jeremy: [00:06:51] It's interesting that you bring up ease of use because, when you look at tutorials, a lot of times, it just makes this assumption that you're going to get this JSON web token. You're going to send it to the API to be authenticated. But there's a lot of other pieces missing. I feel like to to make an actual application that's secure and going through your course for example, there's the concept of things like, the fact that with a JWT, a JSON web token, you can't, invalidate a token. Like you can't force this token to no longer be used because it got stolen or something like that. Or, there's strategies that you discuss. Like you have this concept of a refresh token alongside the JSON web token and going through the course and, and I'm looking at this and I'm going like, wow, this, this actually seems like pretty complicated. This seems like I have to worry about more things than, just holding onto a cookie. Ryan: [00:07:50] Totally. Yup. Yeah, I think that's, that's exactly right. I think that's one of the selling features of JSON web tokens is especially if you're coming from maybe like maybe if you're newer to the concept, let's say one of the compelling reasons to use them is that they're pretty easy to get started with for, for all the reasons that I went into just a second ago. But once you get to the point where you've got to tighten up security for your application, you need to make sure that users aren't going to have these long lived tokens and be able to access your APIs indefinitely. And you want to put other guards in place. It requires that you do a bit of gymnastics to create all these things to protect yourself, that cookies and sessions can just do for you if you use this battle-tested well trotted method of authentication and authorization and that's one of the big arguments against JSON web tokens. There's this really commonly cited article, which if I'm doing training, I'll drop, in any session that I'm doing, I can probably get you the link, but it's joepie. I think I'm going to Google that joepie, JWT. That's what I always Google when I need to find this. The title is Stop using JWTs for Sessions. So this guy. Sven Sven Slootweg, I think is his name. He, he's got a great article, a great series of articles actually about why it is not a great idea to use JSON web tokens. And I think there's a lot of validity to what he's saying. It essentially boils down to this-- that by the time you get to the point where you've got a very secure system with JSON web tokens. you will have reinvented the wheel with cookies and sessions and without a whole lot of benefit. I think he's making the case for this in situations where let's say you've got like a single page react application and you've got an API that is controlled by you that is your own first party API. That's just responsible for surfacing data for your application in those situations. You might think you need JSON web tokens, but you would be able to do a lot with cookies and sessions just fine in those situations. Where JSON web tokens have value I think is when you have different services, in different spots, on different domains that you can't send your cookies to because they're on different domains. Then JSON web tokens can make sense. At that point, you're also introducing the fact that you need to share your secrets for your tokens amongst those different services. Other third-party perhaps services or domains that you don't control would need to know something about your tokens. I don't know if I've ever seen that actually in practice where you're sharing your secret, but in any case you would need to, to be able to validate your tokens at those APIs. And so that becomes a case for why JWTs makes sense because cross domain requests, you won't be able to use cookies for that. So there's trade-offs but ultimately if you've got an application front end app and your own API, I think cookies and sessions make a lot of sense. Jeremy: [00:10:50] I think that for myself and probably for a lot of people, there is no clear I guess you could say decision tree or how do you choose which of the two to use? And I wonder if you had some thoughts on, on how you make that decision. Ryan: [00:11:05] Yeah. Yeah. It's interesting. It's certainly a whole plethora of trade offs that you would need to calculate in some way. It's one of the things that I see people frustrated with the most, I think, especially when you see articles like this, where they say stop using JWTs. The most common refrain that I've seen in response to these articles is that it is never explained *what* we should do then. Like there's always these articles that tell you not to do something but they never offer any guidance about what you should do. And this article that I pointed out by Sven he says that you should just use cookies and sessions. So he does offer something else you can use. But I think what it comes down to is you need to, I think first ask yourself, what's my architecture look like. Do I have one or two or a couple or whatever, front end applications backed by an API? Do I have mobile applications in the mix? Maybe communicating with that same API and where's my API hosted is it under the same domain or different domain? Another thing is like, do I want to use a third party provider? If you're rolling your own authentication, then sure. Cookies and sessions would be sensible, but if you're going to go with something like, Auth0 or Okta or another one of these identity providers that lets you plug in authentication to your application, they almost all rely on OAuth, which means you are going to get a JSON web token or a different kind of access token. It doesn't necessarily need to be a JSON web token, but you're going to get some kind of token and instead of a cookie and session, because they are providing you an authentication artifact that is generated at their domain. So they can't share a domain with you, meaning they can't do cookies and sessions with you. So the decisions then come down to like, whether you want to use third party services, whether you want to have support multiple domains to, to access data from you know, it would be things like, are you comfortable with, your sessions potentially being ephemeral? Meaning like, if you're going to go with cookies and sessions and you are deployed to something that might wipe out your sessions on like a redeploy or something, I would never recommend that you do that. I would always recommend that you use like a redis key value store or something similar to keep your sessions. You would never want to have in-memory sessions on your server because these days with cloud deployments, if you redeploy your site, all your sessions will be wiped. Your users will need to log in again, but it comes down to how comfortable are you with, with that fact, and then on the flip side, it's like, how comfortable are you with having to do some gymnastics with, with your JSON web tokens, to invalidate them potentially. One thing that comes up a lot when I'm with the applications I've seen that are using JSON web tokens is somebody leaves the company. Or somebody storms out in the middle of the day cause they're upset about something and they've been fired or whatever, but they have a JSON web token that is still valid. The expiry hasn't expired yet. So theoretically, they could go home or something. If they have access to the same thing on their home computer, or maybe they're using a laptop of their own. Anyway, if they still had access to that token, they could still do some damage if they're pissed off right. Something like that. And, unless you put those measures in place ahead of time, you can't invalidate that token. You could change up the secret key that validates that token on your server, but now all your users are logged out, everyone has to log back in. So yeah, man, it just comes down to like a giant set of trade-offs and the tricky part about it... I think a lot of the reason that people get frustrated is just that all of these little details are hard to know ahead of time. And that's one of the reasons that I wanted to put this course together is to help enlighten people as to the minutia here. Cause it's a lot of minutiae, like there's a lot of different details to consider when making these kinds of decisions. Jeremy: [00:15:13] I was listening to some of the interviews, and an example of something that you often see is people go like, okay, let's say I'm going to use JWTs but I don't know where to store it. And then in your course, you talk about it's probably best to store it in a cookie and make it HTTP only so that the JavaScript can't get to it and so on and so forth. And then you're talking to, I think it was Ben (Awad). And he's like, just put it in local storage what's the big deal? In general in tutorials that you see on the internet, but in this course, it's like, you're getting these mixed messages of like, is it okay to put it here? Or is it not? Ryan: [00:15:55] Yeah, it's interesting, right? Because I mean, maybe it's just like this common tack that bloggers and tech writers have, where, when they write something, I find this often and I like to go contrary to this, but what I find often, a lot of people are like, do this and don't do this. I'm giving you the prescription about what to do. Whereas I like to introduce options and develop a conversation around trade-offs and stuff like that. But I think where we see a lot of this notion that you're not supposed to put them in local storage or you're supposed to put them in a cookie or whatever, it's because we see these articles and these posts and whatever about how it's so dangerous to put it in local storage and you should never do it. And there's this pronouncement that it should be off limits. And the reality is that a lot of applications use local storage to store their tokens. And if we dig a little bit beneath the surface of this this blanket statement that local storage is bad because it's susceptible to cross-site scripting... If we dig a little bit deeper there what we find is that yes, that is true. Local storage can be (cross-site) scripted. So it's potentially not a good place to keep your tokens. But if you have cross-site scripting vulnerabilities in your application, you've arguably got bigger problems anyway. Because let's say you do have some cross site scripting vulnerability that same vulnerability could be exploited to allow somebody to get some JavaScript on your page, which just makes requests to your API on your behalf and then sends the results to some third party location. So yeah, maybe your tokens aren't being taken out of local storage, but if your users are on the page and there's some malicious script in there, it can be running requests to your API, your browser thinks it's you because it's running in the browser in that session. And you're susceptible that way anyway. So yeah, the argument there to say local storage is not a big deal is because yeah, cross-site scripting bad, but I mean, you're not going to take your tokens out of local storage and be like, ah, I don't have to worry about cross-site scripting now. I'm good. You still have to manage that. So and that's the other thing too, like there's so many different opinions on this topic about what's right and what's wrong. Ultimately, it comes down to comfort level and your appetite for putting in measures that are hopefully more secure things like putting it in a cookie or keeping it in browser state. But maybe with diminishing returns. There's maybe some point at which putting all that extra effort into your app to take your tokens out of local storage might not be worth it because if you don't have that vulnerability of cross-site scripting anyway then maybe you are okay. Jeremy: [00:18:40] And just to make sure I understand correctly. So with a cross site scripting attack, that would be where somebody is able to execute JavaScript on your website, usually because somebody submitted a form that had JavaScript code in it. And what you're saying is when you have your token stored in a cookie, the JavaScript can't access it directly, which is supposed to be the benefit. But it can still make requests to your API. Get information that is private in the JavaScript, and then start making public requests that don't need the token to other URLs? Ryan: [00:19:20] Pretty much. Yeah. So if you put your token in a cookie and it's got to be an HTTP only cookie, you need to have that flag set, then JavaScript can't access it. Your browser won't let you script to that cookie, cause like you can with regular cookies, you can do `document.cookie` and you can get a list of cookies that are stored in that browser session, I guess you would say, or for that domain or whatever. So yeah, you're right. You got it right. If someone is able though, to get some cross-site scripting, let's say you are keeping your token in a cookie. Someone gets some malicious script onto your page. They can start scripting things such that it makes your browser request your API on behalf of the user, without them clicking buttons and doing whatever, just says, make a post request, make a get request, whatever to the API... results come back. And then they just send them off to some storage location of theirs. I've never seen that in practice. In theory that can happen. It might be interesting to, to make a proof of concept of that. I mean, in theory all of that flow checks out but I've never actually seen it be done but theoretically possible. I'd be willing to bet a lot that it's happened in practice in some fashion. Jeremy: [00:20:39] So following the path of your course, it starts with cookies, it goes to JWTs. And then at the end of the advanced course, it goes back to cookies and sessions. And, so it sounds like if you're going to be implementing something yourself and not using a third party service, maybe implicitly you're saying that implementing cookies and sessions probably has a lesser chance of you messing things up if you go that route. Would you agree with that? Ryan: [00:21:12] Yeah, I think so. I think if you, especially like, if you follow the guidance, put forth by the library authors of these things, like maybe you got an express API. If you use something like `express-session`, which is a library for managing sessions. and you follow the guidance. You've got a pretty good chance of being really secure. I think, especially if you implement something like cross-site request forgery. If you have a CSRF token, that is, we can get into CSRF if you like, but if you have that protection in place, then you're in pretty good shape. You're using a mechanism of authentication that has been around for a long time. And it has had a lot of the kinks worked out. JSON web tokens have been around for a while, I suppose in technology years, but it's still fairly new. Like it's not something that's been around for a long, long, long time, maybe 2014 or something like that I want to say. And really just into popular usage man in the last like four years probably is how long that's been going on. So not as much time to be battle-tested. I've heard opinions before about like, man, we're going to see in a few years' time... So this plethora of single page apps, that's using JSON web tokens, we're going to see all sorts of fallout from that. I've heard this opinion be expressed where people who are not fans of JWTs are able to see their potential weak spots. They say like, people are gonna figure out how to hack these things really easily and, and just, you know, ex expose all sorts of applications that way. So, yeah. TBD on that. I suppose, but it's certainly there's potential for that. Jeremy: [00:22:46] And so I wonder how things change when, cause I think we're, we're talking about the case where you've got an application hosted on the same domains. So, let's say you've got your react app and you've got your backend, both hosted behind the same URLs basically, or same host name. Once you start bringing in, let's say you have a third party that also needs to be able to talk to the same API. How does that change your thinking or does it change it at all? Ryan: [00:23:22] Well, I suppose it depends how you're running things. I mean, Anytime that I've had to access third party APIs to do something in my applications. It's generally a call from my server to that third-party server. So in that flow, what it looks like is a request to come from my react application to my API that I control and the user would be validated as the request enters my application. And if it passes on through, if it's valid, then a request can be made to those third parties. For example, if you're using third parties directly from the browser, I can't think of too many scenarios where you would be calling your API and third parties using the same JSON web token. I don't know that that would be a thing. In fact, that's probably a little bit dangerous because then you're sharing your secret key for your token with other third parties, which is not a great idea. When third parties are in the mix, generally, it's a server to server thing for me. But you know, maybe there's a way that you validate users both at your API and at the third party using separate mechanisms, separate tokens or something like that. I don't know. I haven't employed that set up strictly from the browser to multiple APIs. For me, it's always through my own server, which is arguably, probably the way you want to do it anyway, because you're probably going to need some data from your server that you need to send to that third party and you're probably going to want to store data in your server that you get back from that third party in some scenarios. So I don't know if you go from your own server, you're probably better off I think. Jeremy: [00:25:02] And let's say you're going from your own server and you need to make requests to our API. Does that mean that from your server side code you're receiving the cookie and then you send requests from then on using that cookie even though you're a server application? Ryan: [00:25:23] Well in those scenarios. So this is interesting. This is where, people argue that JSON web tokens are a good fit for this scenario when you have servers to server communication. That article that I mentioned, about how you should just use cookies and sessions and stop using JSON web tokens. The argument boils down to the fact that JSON web tokens are not meant for sessions. They're not meant to be a replacement to a session. And that's how people often use them is like a direct one for one replacement to sessions. But that's not what they are. Tokens are an artifact that is produced at a point in time that you have proven your identity, that they're basically like a receipt that you proved your identity. And it's like, I don't know how to think about this. Maybe like when you go to an event or something and you get a stamp on your hands so that you can leave and then come back in, it's kind of like that. And then by the time that stamp washes off. It's expired. So you can't get back in that kind of thing right? So it's a point in time kind of thing. Whereas like a session would be more so like if you wanted to leave and come back from that event and every time you came back, it's like, Hey I was here just a second ago. Can you go to the back and just check with your records and make sure that I'm still valid or whatever. And then they do that and then they let you in, I don't know, maybe a bad analogy, whatever, but the point is that cookies and sessions versus tokens. It's a different world. It's not, they're not interchangeable, but people use them as if they are. So this article argues that a good use case for JSON web tokens is when you have two servers needing to communicate with one another needing to prove between one another, that they should have access to each other, and maybe they need to pass messages back and forth between one another. That's the sort of the promise of JSON web tokens is that you can encode some information in a token easily, send it over the wire and have it be proved on either end that the message hasn't been tampered with. That's really the crux of it with JWTs. So getting back to your question there, yeah, that's a good use of that. It's a good use case for, for tokens is if you need to communicate with a third party API, you wouldn't be able to do that necessarily with cookies and sessions. I mean, I don't know of a way to send a cookie between two servers. Maybe there's something there. I don't know of it. But yeah, JSON web tokens easily done because you just send it in the request header. So they go easily to these other APIs. And that, I mean, if you look at integrating various third party APIs, it's generally like, they're going to give you an access token. It might not be a JWT, but they're going to give you an access token and that's what you use to get access to those APIs. So yeah, that's how I generally put it together myself. Jeremy: [00:28:23] It sounds like in that case you would prefer having almost parallel types of authentication. You would have the cookie and session set up for your own single page application talking to your API. But then if you are a server application that needs to talk to that same API... Maybe you use, like you were saying some kind of API key that you just give to that third party, similar to how when you log into to GitHub or something and you need to be able to talk to it from a server, you can request a key. And so that that's authenticating differently than you do when you just log in the browser. Ryan: [00:29:07] Yeah I think that's right I think the way to think about it is like if you've got a client involved. Client being a web application or a mobile application, that's a good opportunity for cookies and sessions to be your mechanism there. But then yeah, if it's server to server communication, It's probably going to be a token. I mean, I think that's generally these days anyway, the only way you'd see, see it be done. I think like most of the third party services that I plug into, but the access tokens, they give you aren't JSON web tokens, from what I've seen, but if you're running your own third party services, for example, and you want to talk from your main API, perhaps to other services that you have control of then a JSON web token might be a good fit there, especially if you want to pass data encoded into the token to your other services. Yeah, that's what I've seen before. Jeremy: [00:29:59] And I wonder in your use of JSON web tokens, what are some common things that you see people doing that they should probably reconsider? Ryan: [00:30:11] Yeah. Well, One is that it's super easy to get started with JSON web tokens. If you use the HS 256 signing algorithm, which is kind of the default one that you see quite often, that is what we call a synchronous way of doing validation for the token, because you sign your token with a shared secret. So that shared secret needs to be known by by your server. And whichever other server you may be talking to so that you can validate the tokens between one another. If your API is both producing the token, so signing it with that secret and also then validating it to let your users get access to data. Super common thing to do, maybe not such a big deal because you're not sharing that secret around, but as soon as you have to share that secret around between different services, things start to open up for potentially that secret getting out and then whoever has that secret is able to sign new tokens on behalf of your users and use them at your API, and be able to get into your API. So the recommendation there is to instead use the RS-256 signing algorithm or some other kind of signing algorithm, which is more robust. It's an asynchronous way of doing this meaning that there's then like a public private key pair at play. And, it's kind of like an SSH key. You've got your public side and your private side. And so, if you want to be able to push stuff to GitHub, for example, you provide your public side of your key, your private side lives on your computer, and then you can just talk to one another. That is the better way of doing signing, but it's a bit more complex. You've got to manage a JSON web key set. There's various complexities to it. You've got to have a rotating set of them, hopefully, various things to manage there. So yeah, if you're going to go down the road of JSON, web tokens, I'd recommend using RS 256 it's more work to implement, but it's worthwhile. One thing that's interesting is that for a long time, eh, I can't remember if this is something that's like in the spec or not, but there's this big issue that surfaced about JSON web tokens, years back a vulnerability, because it was found out that if you, I think if you said the algorithm, so there's this header portion of the token, it's like an object that describes things about the token. And if you set the algorithm to none I believe it is, most implementations of JSON web token validation would just like accept any token I think that's how it worked is like you could set the algorithm to none and then whatever token you sent would be validated, something like that. And so that issue has been fixed at the library level in most cases. But I think it's probably still out in the wild, in some respects, like there's probably still some service out there that is accepting tokens with an algorithm set to none. Not great. Right? Like you, you don't want that to be your situation. yeah. So watch out for that. Other things that I can think about would be like, Don't store things in the token payload that are sensitive because anyone who has the token can read the payload, they can't modify it. They can't modify the payload and expect to, to make use of the modified payload, but they can read what's in it. So you don't want secret info to be in there. If you're going to do a shared secret, make sure it's like a super long strong key, not some simple password. I think because JSON web tokens are not subject to, so I guess I'll back up a little bit. If you think about trying to crack a password and trying to brute force your way into a website, by guessing someone's password, how do you do that? Well, you have a bot that is going to send every iteration, every variation of password I can think of until it succeeds. If you've got your password verification and things set up properly, you're going to use something like Bcrypt which inherently is slow at being able to decode or being able to verify passwords because you want to not allow for the opportunity that bots would be trying to guess passwords the slower you make it within reason so that a human doesn't have a bad time with it. The more secure it is because a bot isn't going to be able to crack it. But if somebody has a JSON web token, they can run a script against it that does thousands and thousands of iterations per second, arguably, or however fast their computer is. And this is something that I've seen in real life in a demo is somebody who is able to crack adjacent web token with a reasonably strong secret key in something like 20 minutes, because they had lots of compute power to throw against it. And they were just guessing, trying to guess various iterations of secret keys and it was successful because, you know, there's no limitation there on how fast you can try to crack it. So, and as soon as you do that, as soon as you crack it and you have that secret key, a whole world of possibilities for a hacker opens up because then they can start producing new JSON web tokens for their, for this application's users potentially, and get into their systems, very, sneakily. So. Yeah, make sure you use a long, strong key or use RS 256 as your algorithm. That's my recommendation. I think. Jeremy: [00:35:25] I think something that's, that's interesting about, this discussion of authentication and JSON web tokens is that in the server rendered space. When you talk about Rails or Django or Spring, things like that, there is often support for authentication built into the framework. Like they, they take care of, I try to go to this website and I'm not logged in. So I get taken to the login page and it takes care of, Creating the session cookie and putting it, putting it in the store and it takes care of, the cross site request, forgery, issues with, with a token for that as well. And, and so in the past, I think people could, could build websites without necessarily fully understanding what was happening because the framework could kind of take care of everything. And why do we not have that for say react or, or for Vue things like that? Ryan: [00:36:25] Yeah. Well, I think ultimately it comes down to the facts that these, you know, these libraries like react and Vue are, are client side focused. I think we're seeing a bit of a shift when we think about things like next JS, because there's an opportunity now. if you have a full kind of fully featured framework, like next, where you get a backend portion, you get your API routes. There's an opportunity for next, I hope they do this at some point in the future. I hope they take a library like next there's this library called next auth. very nice for integrating authentication into a next application, which is a bit tricky to do. It's it's surprisingly a little bit more tricky than you would think, to do. If they were to take that and integrate it directly into the framework, then that's great. They can provide their own prescriptive way of doing authentication. And we don't need to worry about it as much, maybe, you know, a plugin system for other services, like Auth0 would be great. but the reason that we don't see it as much with react, vue, angular, et cetera, is because of the fact that they don't assume anything about a backend. If you're going to talk about having something inbuilt for authentication, you need to involve a backend. that's gotta be a component. So yeah, that's, that's that's I think that's why. Jeremy: [00:37:35] I think that, yeah, hopefully like you said, as you have solutions like, like next that are trying to. hold both parts, in the same place that we, we do start to see more of these, things built in, because I think there's so much, uncertainty, I guess, in terms of when somebody is building an application, not knowing exactly what they're supposed to do and what are all the things they have to account for. so, so hopefully, hopefully that becomes the case. Ryan: [00:38:06] I hope so, you know, next is doing some cool things with like I just saw yesterday. I think it was that there's an RFC in place for them to have like first-class tailwind support, which means like you wouldn't have to install and configure it separately. Tailwind CSS would just, come along for the ride with next JS. So you can just start using tailwind classes, which I think would be great. Like, that'd be, I it's one of the least favorite things about, starting up a next project for me is configuring tailwind. So if they do, if they extend that to be something with off. Man. That'd be fun. That'd be good, Jeremy: [00:38:37] at least within the JavaScript ecosystem, it felt like, people were people, well, I guess they still are excited about building more minimal libraries and frameworks and kind of piecing things together. and, and getting away from something that's all inclusive, like say a rails or a spring or a Django. And it almost feels like we're maybe moving back into that direction. I don't know if that. Ryan: [00:39:02] Yeah. Yeah, I get that sense for sure. I think that there's a lot of people that are interested in building full stack applications. you know, it feels like there was this. Not split. That's not the right way to put it, but there's, there was this trend, four or five years ago to start kind of breaking off into like being a front end developer or a backend developer, as opposed to like somebody who writes a server rendered app. In which case you have to be, you have to be both just by the very nature of that. so there's specialization that's been taking place, and I think a lot of people now. You know, there's people like me who instead of specializing in one or the other, I just did both. So I call myself a full stack developer. and now there's this maybe trend to go to, to, to, to go back to like being a full stack developer, but using both sides of the stack as they've been sort of pieced out now into like a single page app plus, some kind of API, but taking that experience and then melding it back into the experience of a full-stack developer. I think we're seeing that more and more so. Yeah. I don't know. I think also like if you're. If you're wanting to build applications independently, or at least if you want to have like your head around how the whole thing works cohesively, you kind of need to, to be, you know, you need to have your head in, in, in all those areas. So I think there's a need for it. And I think the easier it's made while we're still able to use things like react, which everybody loves you know, all the better Jeremy: [00:40:29] Something else about react is something I liked about the course is, more so than just learning authentication. I feel like you, you, in some ways, teach people how to use react in a lot of ways. Some examples I can think of is, is how you use, react's context API to, to give, components access to requests that, that have the token included, or using your authentication context, that state to be able to know anywhere in your app, whether, you should be logged in or whether you're, still loading things like that. I wonder if there's like anything else within react specifically that you found, particularly useful or helpful when you're trying to build, an application with authentication. Ryan: [00:41:20] Yeah. In React specifically. so I come from mostly doing angular work, years back to focusing more on react in the last few years. It's interesting because angular has more, I would say I would argue angular brings more out of the box that is helpful for doing authentication stuff or helping with authentication related things than something like react. And, and maybe that's not a surprise because react is this maybe, maybe, maybe you think of reacting to the library as opposed to a framework. And angular is a framework that gives you lots of. Lots of features and is prescriptive about them. in any case, angular has things built in like route guards so that you can easily say this route should not be accessed unless some condition is met. For example, your user is authenticated. It has its own HTTP library, which is to send requests to a server X XHR request. And it gives you an interceptors layer for them so that you can put a token onto a header to send that. React is far less prescriptive. It gives you the option to do whatever you want. but some things that it does do nicely, I think which help for authentication would be like the react dot lazy function to lazily load routes in, in your application or at least lazily load components of any time. That's nice because one of the things that you hopefully want to do in a single page app, to help both with performance and with, with security is you don't want to send all of the stuff that a user might need, until they need it. So an example of this would be like, if you've got, let's say you've got like a marketing page. And you've got your application, maybe under the same domain, maybe under different sub domains or something like that. and you've got like a login screen. There's no reason that you should send the code for your application to the user's browser before they actually log in. And in fact, you probably shouldn't even put your login page in your react application, if you can help it. if you do have it in your application, you, at the very least you shouldn't ship all of the code for everything else in the application until the user has proved their identity. So to play this out a little bit, if the user did get all of the code for the application before they log in, what's the danger of that? Well, there's hopefully not too much. That's dangerous because hopefully your API is well-protected and that's where the value is. That's where your, you know, your data is it's protected behind a, an API that's guarded, but the user then. Still has access to all your front end code. Maybe they see some business logic or they see they're able to piece together things about your, your application. And, and a good example of this is do you follow, Jane Wong on, on Twitter? She finds hidden features in apps and websites through looking at the source code. I don't know how exactly how she does it. I'd love to talk to her about how she does it. I, I suspect that she'll like pull down the source code that gets shipped to the browser. She'll maybe look through it and then maybe she'll, she'll run that code in her own environment and, and change things in it to coax it, to, to show various, to render various things that normally wouldn't render. So she's able to do that. And see secrets about how about those applications. So imagine that playing out for something, some situation where like, You in your front end code have included things that are like secret to the organization that really shouldn't be known by anybody definitely recommend you don't do that. That instead you, you hold secret info behind an API, but the potential is still there. If it, if not that, then at least the user is figuring out how your application works. They're really getting a sense of how your application is constructed. So before you even ship that code to the browser, the user users should be authenticated. before you send things that an admin should see versus like maybe a regular user, you should make sure they're an admin and reacts lazy loading feature helps with that. It allows you to split your application up into chunks and then just load those chunks when, when they're needed. so I'd like that feature aside from that, it's kind of like, you know, one of the downsides of react I suppose, is like, for other things like guarding routes, for attachment tokens, this kind of thing, you kind of have to do it on your own. So yeah, react's lazy loading feature, super helpful for, you know, allowing you to split up your application. But aside from that other features, you know, it's kind of lacking, I would say in react, there's not, not a ton there to, to help you. You kinda have to know those things on your own. Once you do them a couple of times and figure it out, it becomes not so bad, but still some work for you to do. Jeremy: [00:46:04] Yeah. Talking about the hidden features and things like that. that, that's a really good point in that. I, I feel like a lot of applications, you go to the website and you get this giant, bundle of JavaScript. And, in a lot of cases, a lot of companies are using things like feature flags and things like that, where there'll be code for something, but the user just never gets to see it. And that's, that's probably how you can just dig dig through and find things you're not supposed to see. Ryan: [00:46:19] Yep. Exactly. You just pull down the source code that ends up in the browser and then flip on those feature flags manually, right? That's probably how she does it, I would assume. And it's very accessible to anybody because, as I like to try to, hammer home in my courses is that the browser is the public client. You can never trust it to hold secrets. you really need to to take care of not to expose things in the browser that shouldn't be exposed. Jeremy: [00:46:44] I wonder, like you were saying how in react, you have to build, more things yourself. You gave the example of how angular has a built-in way to, to add, tokens, for example, when you're, you're making HTTP requests, I wonder if you, if you think there's there's room for some kind of library or standard components set that, you know, people could use to, to have authentication or authorization in a react or in a vue, or if that's just kind of goes against the whole ecosystem. Ryan: [00:47:18] Yeah, I think there's room for it, I wouldn't be surprised actually, if you know, some of the things that I show in my course, how to build yourself. If that exists in a, an, an independent library. I think where it gets tricky is that it's often reliance on the context. And even on the backend, that's, that's being used to make those things really work appropriately. So. I guess a good example is like the Auth0 library that's allows you to build off into a react application. It's very nice because it gives you some hooks that, are useful for communicating, with Auth0 to, for instance, get a token and then to be able to tell if your user is currently authenticated or not. So they give you these hooks, which are great, but, but that's reliant on like a specific service. A specific backend, a session being in place Auth0 stores, a session for your users. That's how they're able to tell if they should give you a new token. I mean, I love to see that kind of stuff. Like if there's, you know, things that are pre-built to help you, but I just. I don't know if there's like a good way to do it in a, in a sort of general purpose way because it is quite context dependent, I think. But I, I might be wrong. Maybe there's value. In having something that, I mean, I suppose if you created something that you could have like a plug-in system for, so something may be generic enough that it gives you these useful things. Like whether the user is currently authenticated or, if a route should be guarded or something like this. If you had that with the ability to plug in your context, like, I'm using Auth0 or I'm using Okta, or I'm using my own auth. Whatever, maybe that maybe, maybe there's opportunity for that. that might be interesting. Jeremy: [00:49:03] Would definitely save people a lot of time. Ryan: [00:49:05] Yeah. Yeah, for sure. And that's why I love these prebuilt hooks by Auth0, for sure. When I'm using Auth0 in my applications, it certainly saves me time. Jeremy: [00:49:14] The next thing I'd like to ask you about is you've got a podcast, the entrepreneurial coder and, sort of in parallel to that, you've been working a quote unquote, normal job, but also been doing a lot of different side things. And I wonder, whether that's the angular security book or recording courses for egghead like how are you picking and choosing these things and, and what's your thought process there? Ryan: [00:49:43] Yeah, that's a good, good question. I have always been curious about business and entrepreneurial things and, and, and selling stuff, I suppose. I would say now I haven't been in business for a super long time, but I, I have been in business for a while now. and what I mean by that is, Yeah, you can take it back to like 2015 when I started at Auth0 at the same time that I was working there, I was also doing some side consulting, building applications for clients. And so since that point, I, you know, I I've been doing some side stuff. it turned into full time stuff back in 2017. I left Auth0, late 2017 to focus on, on consulting. Then 2020, mid 2020, I realized that the consulting was good and everything like that. And I mean, I still do some of it, but it's, it's difficult when you're on your own as a consultant to be learning. And, and that's something that I really missed is that I found that my learning was really stagnating. I learned a ton at Auth0 because I was sort of. Pushed to do so it was necessity of my job. As a, as a consultants building applications. I mean, sure. I needed to learn various things here and there, but I, I really felt that my learning had stagnated. And I think it's because I wasn't surrounded by other engineers who were doing things and learning from them and sort of being pushed to, to learn new stuff. So yeah, an opportunity came up. With, with Prisma, who I'm now, now, working at I'm working at Prisma and it was a, it started everything kind of fell into place to make sense for it. For me. I mean, I, I don't think I would take just, just any job. It was really a great opportunity at Prisma that presented itself. And I said, you know, I, I do miss this whole learning thing, and I do want to get some more of that. So, that's, that's probably the biggest reason that I decided to join up with a company again. So all that being said, the, the interest in like business yeah. And interest in doing product work, like a course, this react security course, or my angular book, that came out of just some kind of like, I don't know, man, like some kind of desire. I don't know when exactly it sort of sprouted, but this desire to create a product, to productize something, productize my knowledge and to create something out of that and offer it to the world and offer it, in a way where, you know, I could be, be earning some, some income from it. And you know, there there's various people I could point to that have inspired me along that journey, the, the most prominent one is probably Nathan Barry. I don't know if you're familiar with Nathan Barry. He's a CEO of convert kit and I've followed his stuff for a long time. And he was very, always very open about, his, eBooks and his products that he creates. And that that really inspired me to want to do something similar. And so, in trying to think of what I could offer and what I could create, authentication and, identity stuff kind of made sense because I acquired so much knowledge about it at Auth0. And I was like, you know, this probably makes sense for me to to bundle up into something that is consumable by people and I, and hopefully valuable. I sense that it's been valuable as certainly the response to the products has told me that it has been valuable. So yeah, I think that's the genesis there. And then the. interest in business stuff I think I've just, I don't know. I've always been curious about, especially about like online business, maybe, maybe less so with like traditional brick and mortar, kind of business, but, but, but online business in particular, it just the there's something alluring to it because like, if you can create something and offer it for sale, there's no, there, I'm sure there are better feelings, but I was about to say there, there's no better feeling than to wake up to an email saying that you've made money when you're you've been sleeping right. And that's, it's a cool feeling for sure. so. Yeah, I think that's, that's probably where that comes from. Jeremy: [00:53:34] And when you first started out, you were making courses for egghead and frontend masters, does that feel like very distinctly different than when you go out and make this react security course? Ryan: [00:53:49] Yeah. Yeah, definitely. Yeah. For a few reasons. I think that the biggest and probably the overarching thing is that when you do it on your own, you create your own course, the sense of ownership that you have over it is makes it a different thing. I'm in tune quite deeply with the stuff I've created for like react security. And even though I've done what three, I think front-end masters courses, you know, I've been out to Minneapolis a bunch of times. I know the people there, quite well it's. There, isn't the same sense of like, ownership over that material, because it's been produced for another organization when it's your own thing. It's, it's like your baby, and you care more deeply about it. I think so. So yeah, that's been, it's a different experience for sure. For that reason. I think too, there's the fact just the, you know, getting down to the, the technicalities of it. There's, there's the fact that you are responsible for all of the elements of production. You know, you might look at a course and be like, okay, it's just a series of videos, but there's tons of stuff that goes into it. In addition to that, right. There's like all of the write-ups that go along with each video. There's the marketing aspects like of. You know, putting the website together and all, and not only that, like the there's, the lead-up, the interest building lead up over the course of time through building an email list that, and you collect email addresses through blog posts and social engagement. It was building up interest on social over the course of time, like tweeting out valuable tidbits. there's just this whole complex web of things that go into putting your product together. Which means that you need to be responsible for a lot of those things, if you want it to go well, that you don't really need to think about if you're putting it on someone else's platform. So there's, trade-offs either way definitely trade-offs. But, I mean, it depends what I want to do, but I think. If I were to do a product with some size to it, it would be I'd want to do my own, my own thing for sure. Jeremy: [00:55:48] Are there specific parts of working on it on your own what are the parts (that) are just the least fun? Ryan: [00:55:57] Yeah, the parts that are the least fun. I suppose for me, it's like, probably editing, editing the videos is the least fun. I've got this process down now where it doesn't take super, super long, but it's still, It's a necessity to, to the process that is always there, editing videos and, and the way that I. I dunno, I'd probably do it in a I I probably put this on myself, but I, the way that I record my videos, I editing takes quite a while. yeah, I, I think that is not such a great part about it. what else? I guess there's like the whole aspect of like, if you, if you record a video and you've messed something up and the flow of things doesn't work out, then you have to like rethink the flow. There's any anytime I've got to do anytime I've got a backtrack, I suppose that that takes the wind out of my sails for a little while. Yeah, those are the two that stand out the most. I think. Jeremy: [00:56:51] Yeah. I noticed that during some of the videos you would, make a small mistake and correct it like you would say, Oh, we needed these curly braces here. I was curious if that was a mistake you made accidentally and you decided, Oh, you know, it's fine to just leave it in. Or if those were intentional. Ryan: [00:57:10] Definitely no intentional mistakes. I'd love to post a video sometime on a raw video that isn't edited because you'd laugh at it. I'm sure it's like, inevitably in every video, there's what it looks like. If you, if you were to see it unedited is a bunch of like starts and stops for a section till I finally get through and then a pause for awhile and then a bunch of starts and stops to get to the, to the next section. So like, "In this section we're going to look at", and I don't like the way that I said it. So I start again "in this section, we're gonna look at" and then there's like five of those until I finally get it. And so my trick with editing is that I will edit from the, the end of the video to the start so that I don't accidentally edit pieces of the video that I'm just going to scrap. Anyway, that's what I did for a long time. When I first started, it's like, I'd spend like 10 minutes editing a section, and then I get further along and I'm like, Oh, I, I, I did this section in a better way later and now I have to throw it, all that editing that I just did. So if I edit from the back to the to the front, it, it means that I can find the final take first, chop it when I start to take and then just get rid of everything before it right. So yeah, there's, there's definitely no intentional mistakes, but sometimes like, if there's. If there's something small, like it's, Oh, I I mistyped a variable or something like that. I'll often leave that in. Sometimes it's a result of like, if I do that early on in the video, and then I don't realize it until the code runs like a few minutes later. I'm just like internally. I'm like arggh! But I don't want to go and redo the video. Right. So I'm just like, okay, I'll, I'll leave that in whatever. And some of, some of those are good. I think in fact, I arguably, I probably have my videos in a state where they're, they're a bit too clean, like, cause I've heard a lot before where people don't mind seeing mistakes. They actually kind of like when they see you make mistakes and figuring out how to. How to troubleshoot it, right? Jeremy: [00:59:02] The, Projects that you've done on your own have been learning courses, right. Or they've been books or courses. And I wonder if that's because you, specifically enjoy making, teaching material, or if this is more of a testing ground for like, I'll, I'll try this first. And then I'll, I'll try out, a SAAS or something like that later. Ryan: [00:59:22] Right. Yeah. I definitely enjoy teaching. Would say that I, am passionate about teaching. maybe I'm not the most passionate teacher you'd come across. I'm sure there are many others that are more passionate than I am, but I enjoy it quite a bit. I mean, and I think that's why I like the kind of work that I do. The devrel work that I do at Prisma. You know, I enjoy being able to offer my experience and have others learn from it. So at this point, like if I'm making courses or books or whatever, and I don't know if I'll do another book. I mean, writing, writing a book is a different thing than, than making a a video-based course. And, and I don't know if I would do another one but in, in any case, I don't think it's a proving ground. I'll do other courses regardless of what other sort of things I make. But speaking of SAAS, that is an area that I'd like to get into, for sure. As time wears on here, I'd love to start and I've got a few side projects kind of simmering right now that that would be in that direction. So yeah, I think that that's, that's an ideal of mine is getting to like a SAS where you started hopefully generating recurring revenue more, you know, more. More predictable recurring revenue, as opposed to, you know, of course it's a great, because you build up to a launch, hopefully you have a good launch. It's a pretty good amount of money that comes in at launch time, but then it goes away and maybe you have some sales and you generate revenue as time goes on. But, less predictably than perhaps you would in a SAAS if you have one. So yeah, that's kind of where my head's at with that right now. Jeremy: [01:00:51] I'm curious you, you mentioned how you probably wouldn't do another book again. I'm curious why that is and, and how you feel that is so different than a video course. Ryan: [01:01:02] Yeah, I think, I think it's because I enjoy creating video based courses more than I, I enjoy writing. I like both, but writing less so. And I also think that there's this perception when you go to sell something, but if it's a book it's worth a certain price range, and if it's a video course, it's worth a different price range that is higher. I don't think that those perceptions are always valid because I think the value that can come from a book, if it's on a technical topic that is going to teach you something, you know, if you pay $200, $250, $300 for that book, which is in our world, that's a PDF. Right. I think it's perfectly valid. I think that, people have this blocker though, when it comes to that internally, they're like, there's no way I'm going to pay that much for a PDF, but the same information in a video course, people are often quite, amenable to that. So there's that. I happen to enjoy creating video courses more than I do books. I think that's the reason that I probably wouldn't do a book again. And a book is a book is harder to do than a video course. I think writing a book is, is a difficult, is a difficult thing. For whatever reason. I, I couldn't tell you like technically why or specifically why, but I sense, I've heard this from a lot of people that making a book is harder than making a video course. Yeah. Jeremy: [01:02:28] Yeah, I have noticed that it feels like everybody who tries to write a book, they always say it was way harder than they thought it was going to be. And it took away longer. Ryan: [01:02:36] Yeah, for sure. Jeremy: [01:02:37] I do agree with the value perception. I don't know what it is, but maybe it's just the nature of the fact that, there's so much, you can just go online and see in blog posts and stuff that maybe for some reason, people don't value it as much when it's all assembled and, writing form. But like the pitch, even your course on the landing pages, you say like, yeah, you can technically find all this information online, but you know, just think about how much time you're going to spend looking for it and not knowing whether or not it's, it's valid or not. Ryan: [01:03:14] Yep. Absolutely. That's the big, the big sell I think, with, with, online education, rather than having you assemble all that knowledge yourself over the course of time, the value proposition is I will give it to you. I will show you my lessons learned and give it to you succinctly in a nicely. Kind of a, you know, collated fashion. Jeremy: [01:03:34] Another thing that, I enjoyed about the course is that it was very to the point, sometimes you'll see an advertisement for a video course and they'll say like, Oh, it's got 20 hours of content, but it's like, the fact that it's long doesn't mean you learn more. If anything, you would think you would want to spend less time to learn the same amount of material. So, yeah. I definitely appreciated that as well as the fact that you've got the transcripts underneath the video. Ryan: [01:04:05] Oh, that's good to hear. That's awesome. Yeah. Jeremy: [01:04:07] Cause it's like, I could watch a video and then get to the end and like, maybe I'll remember some things, but other things I won't. And then to, to pick it up, I have to go watch the video again. whereas when you've got that, that transcript, that's pretty helpful. Ryan: [01:04:23] That's great to hear. Yeah, I was, I wanted to make a point of including transcripts, you know, for accessibility reasons, for sure. But, but also because you know, something that I have realized as time has gone on is that people do appreciate the, the written form. I always thought that maybe it's a little bit weird to read something that is like, the narration of a video, because it doesn't translate perhaps super well if it's in, in, in print, but I think, everything that I've heard about the transcripts has been positive. Like, like you've said, so I'm glad to hear that, people seem to like them. Jeremy: [01:04:54] The last thing I'll ask is. I guess when authoring courses, or even just when you're releasing a product on your own, what are like some things that you think are really good ideas and what are some things that, you've realized like, Oh, maybe, maybe not so much. Ryan: [01:05:09] Yeah. So, what I'd offer there is. it's a good idea to try to validate what you, want to create as a product. If you, if you're going for, if you're going for a sale, like right. If you want to, if you want to sell the thing, if you want to make a decent amount of money, it's a really good idea to try to validate it early on and you can do that without much effort. I think by doing smaller things ahead of time, it's things like creating couple of videos for YouTube see what the response is there. offer tidbits on places like Twitter and see what people, how people respond to them. One of the ways that I validated these courses was by doing just that, like tweeting out kind of small, valuable pieces of content, writing blog posts, and seeing how people responded to them. And then the, like the biggest thing, though, for me for this particular course was I created the free, intro course, the react security fundamentals course. And after I saw, you know, the number of people signing up for it, I was like, okay, I think there's enough volume here. A volume of interest to tell me that if I was able to offer a sale to even a small percentage of that, that list that it would be worthwhile. The other thing that I, would recommend is that you, that you build up an email list, you find some way to get in touch with your. audience and keep in touch with them. And that's, that's often through email. if you can build up an email list and offer valuable things to them over the course of time, they will have you in mind, they will feel some need for reciprocity when it comes time to, for you to offer something for sale, do all those things. well before you go to release your course, the last thing I want for anyone is to release the course and nobody is there to, to buy it. And, you know, unfortunately I think a lot of people. have the assumption that if they just, you know, create it, put it up on the internet, it'll sell somehow. But the reality is you almost certainly need to have develop an audience beforehand. At least if you do, that's the way that, you know, you can make a meaningful sort of, I guess, a impact, but also be. financial results from going down that road. So build that audience, I think is, is the key there. Jeremy: [01:07:18] Cool, that's a good place to, to start wrapping up, if people want to check out your course or see what you're up to, where should they had? Ryan: [01:07:26] Sure. Yeah. Thanks so much that you can go to react security.io, and I'll get you some links, to put in the show notes or whatever, but react security.io. There's a link there for. The advanced course. you can check me out on Twitter. I'm at @ryanchenkie on Twitter and, where else I've got a YouTube channel that's my name? Ryan Chenkie has my name on YouTube. I've got some sort of free stuff there. Yeah. Check those places out, you know, kind of find your way around. Jeremy: [01:07:50] Cool. Well, Ryan, thank you so much for chatting with me today. Ryan: [01:07:54] Thanks for having me. This has been great. I was happy to be able to do this. Jeremy: [01:07:58] Thanks again to Ryan Chenkie for coming on the show. I'll be taking a short break and there will be new episodes again next month. i hope you all have a great new year and I'll see you then.
undefined
Dec 2, 2020 • 50min

iOS Development with Timirah James

Timirah is an iOS developer, developer advocate, founder of the TechniGal LA meetup group, and instructor for O'Reilly and Coursera. Related Links @timirahjSwift OptionalsStanford CS193p - Developing Apps for iOSHacking with SwiftRay WenderlichSwiftObjective CXcodeFlutterDart Music by Crystal Cola. Transcript You can help edit this transcript on GitHub. Jeremy: [00:00:03] Today I'm talking to Tamirah James. She's been an iOS developer for... how many years has it been? Timirah: [00:00:09] Oh my goodness. Since I first started iOS, I want to say ooh (whispering) oh my gosh seven years. Jeremy: [00:00:15] Seven years. Where did that time go? Timirah: [00:00:17] My goodness. Oh yeah. Seven years. Wow. Jeremy: [00:00:21] I think a lot of people listening have written software before, but maybe not for iOS. So I'm interested to hear what your experience is and how you think people should get into it. Timirah: [00:00:31] Absolutely. So, yeah, and excuse me, we, we talked about my, before, before we, cut the podcast on, we were talking about how I have this 8:00 AM rasp in my voice (laughs) so you'll have to excuse the rasp. But, yeah, I, like we said, you know, seven years with iOS development. The great thing, and the crazy thing about, you know, my journey in iOS is I really went in with purpose and, intention, right. I went into mobile development knowing, that it was going to be something that was valuable not only into my career, but valuable to, society as a whole right. Valuable to the industry as a whole. So it, it goes all the way back to like, when I first started coding and I was like, I want to say I was 17 years old. and I started to get into web development and, I got bamboozled into getting into a, like a high school, like internship thing, to learn web design. and I loved it and that was my first introduction to coding. And I was like, Oh my goodness. Like, this is amazing, like web development, web design. And then once I started to think about college and the route that I wanted to go, I was like, okay. Hm. There's so many other avenues. And so many other doors in this industry, like. You know what? I know that I want to pursue computer science but what, what is going to be my path? So I started thinking about my career pathway immediately. I'm like, okay, going to college then what? And at that time, I think like we were just making the transition from like flip phones. And like the cameras and the, Oh my gosh. If you had the video phone, you call them the video phones, video phones. And then the transition into like, Oh, now we're at, iPod touches. Now we're at iPhones now or at we're like, okay. Oh, wow. This is like the birth of the smartphone, the birth of the iPhone, and really seeing the transition. The phone being something that's way beyond just a communication device right now it's media, now it's entertainment, now it's education, it's finance, it's, all of these things. I always say like, you can leave your laptop at home, but if you leave your phone, You're like, Oh my gosh like, how am I going to function throughout the day? If I can build something for that, I can make a huge, huge impact so I really went into it with all, like, it was very much, something that I went with all intentions, all intentional purposes. So I said, you know, after college I'm going to go right into mobile development. I really think that I will make the biggest impact there and build things that will stick with people and people will spend time with, and people enjoy, and will live with users. I fell in love with the idea of the evolution of the phone and mobile development. So went to college, ended up dropping out in that first year, computer science, for a lot of reasons, one of them being that the university I went to wasn't a, tech school, so to speak, computer science was not one of their pillars. You know, one of their like, Oh, this is something that we're focusing and we have resources and we have a gateways for mentorship and internships. Like they were just like, Hey, we got computer science is here, you know? and yeah, so I went and I took that opportunity. And it just, wasn't what I thought it would be. I didn't have any support coming in, you know, just learning, computer science and, you know, there were so many different aspects to it. It was culture, there were, cultural aspects. There was just the support in terms of what was being taught. and I just felt like the more I was in it, the more behind I felt, I felt like it was being pushed out. and yeah, so I left school and I was like, okay, where do I, where do I kind of go from here? And again, I knew that I still wanted to do mobile development. So I continued that journey, online there were, there were, maybe some coding, boot camps, that were really underground and just getting started or getting their feet wet and trying to figure that out but none that were really visible like that and really like, Oh, this is an option. So didn't have any like bootcamp or coding school, situation. But I had so many resources online. I can remember my primary source of education at that time, along with like, everything like Ray Wenderlich, and, tree house, coursera, which is crazy cause I'm teaching courses for Coursera now. And, in addition to that, my primary source was Stanford's iOS development course, which they were offering all the lectures for free. And I was upset cause I was like, wait a minute, the school I was going to. It was like this private university, you know, and here I could have a Stanford education, you know, in which I kind of ended up with it, so, okay. I have a Stanford education in iOS development (laughs) . so yeah, I kind of continued on from there and really went cold turkey. I know that like, it's hard for a lot of people to adapt to like being self-taught in really the meaning of being self-taught right, where I don't really have a physical community or any source of like mentorship around me saying, Hey, you should learn this. Or you should learn that. I didn't even know where to look for that kind of guidance. I didn't really have, a boot camp that was promising me a roadmap, a career ladder. what the important tech stacks for mobile development were. all that groundwork. Like I really did it myself and, I had the support of my mom. I didn't work another job. Like I was living with my mom. and she's always been single parent so one paycheck. And, It was really, maybe less than a year before I, started into my professional, career and got my first role landed my first role as an iOS engineer. So, anything you want to learn, you want to make sure that you do the work right? Outside of, even if you're at a coding bootcamp, you have to want to learn it for yourself. and then you have to look at the benefits, what, what are the benefits of learning? Something like this. iOS development. Okay. You know, that it has a very strong Apple developer community around it, and then Apple, the entity itself, Apple, really prides itself on making sure that they give an abundance of resources to developers, you know, via, you know, th the WWDC, you know, which is their yearly conference that they do. And they make sure that they are. flooding developers with new things and, new information and, you know, hands-on training, in the best ways that they can. So there is a lot of resources out there. and just in terms of iOS development itself, the benefit of iOS. again, it, iOS can be very niche and that's something that probably turns people away. They're like, Why iOS development, you know, Android you have a plethora of different devices. you have, I guess, more scalability in terms of who's able to access, your applications, right? You think about people in other countries like the iOS device is probably the number one device in the United States. But outside of here, like, you know, I've been to Europe, I've been to a couple other places and. They honey, they are Android down. They are just like it's all Android over there. you know, and that has a lot to do with just pricing and all these things a lot of people are turned away just because it is such a niche and I call it like this King Apple theory where, you have to use an Apple language, and build on an Apple IDE. Right on an Apple machine, for Apple products, you know, so they do a great job of making sure that ecosystem is tight. and that, that compatibility and the conversions are really just like very much limited outside of that ecosystem. So yeah, like I think it's definitely worth it. you know, because there is so much of a focus, you know, there's a lot of benefit in that as well. So yeah. Jeremy: [00:09:18] What would you say are the biggest reasons or benefits for choosing to develop for iOS versus Android? Timirah: [00:09:24] I think it really just comes down to, when it comes to the benefits of iOS itself, again, it comes down to, for one stability, right? Sometimes it is harder when you have so many devices to build for, right? So you have this like ecosystem of Apple products we have like, for specifically for iOS development, it's just the iPhone. We have different versions of the iPhone. of course they're multiple versions that are, that, you know, keep upgrading as far as the operating systems like iOS versions. And th that's basically the end of your worries when it comes to like support and compatibility. but there's so many different like operating systems when it comes to end devices and device sizes and so on and so forth when you're dealing with, Android development. Right. And so something that might work or look good for, six or seven out of 10 devices Android devices, you may have some users out there, it, the functionality isn't as, optimal, you know, it isn't as reactive. Just because, there's a slight difference in the operating system. There's a slight difference in, the device, the model. That wide range, it can be a bit of a hindrance there. iPhones keep continuing to like upgrade and the iOS versions can continue to go up. Like you have less and less of a variety of devices that you have to like really worry about. Right. And so there's a lot more, consistency and again, stability, when you're dealing with iOS development. And some people see it as a blessing and a curse that there's so many different languages that you can utilize for Android development, as opposed to with iOS development. Again, you have to use an Apple language, not unless you're doing like cross platform situation. you know, which is like a handful of frameworks out there. But if you want a native rich experience, you're going to go, native Swift or are you going to go native, you know, objective C. So, yeah, so there's this like, this, this aspect where it benefits you to remain in that consistency and that, and that stability and because. Apple knows that everything, all of the development is weighted on those languages is weighted on, that IDE, which is Xcode. they put all of their focus into making sure that those things are as stable as possible. As opposed to, Android, where again, Yeah, there's a blessing and a curse in how much that variety is. So I think that's probably the main thing when it comes to that big difference and the benefit of, building for iOS. Jeremy: [00:12:24] It, it almost sounds like for an independent developer or for a smaller company, it might be easier to take on iOS development, because as you said, there, there aren't as many devices you have to make sure that your software works on. With iOS there is just Apple's operating system. There isn't a version by Samsung and by Huawei and by Google and so on. And then in the tools themselves, you said there's a lot of different options for languages and IDEs for Android. Whereas for iOS, everything's more. centralized and Apple is that's really the main choice, right? So they're forced to push all their efforts into that. It's almost like different philosophies, right? You get a bunch of options, or you have more limited choices, but maybe there's less for you to have to worry about less for you to have to learn and maybe have some more focus there. Timirah: [00:13:22] Absolutely. You know, Apple is doing everything to make sure, try and make sure that they are a global, they're definitely one for global dominance, right? (laughs) I don't know why that sounded scary in a way, but, you know, Apple wants to be number one, you know, they want. Yeah, of course they want the iPhone to be the number one device in the world. Right. but then there's this issue of like, you know, pricing of course, like, okay. A thousand dollar phone is probably not popular. Like everyone, like America has this, this thing too. Like, America has a culture of thinking differently, right? Like even down to, like you ever had a friend who's like, they text you their number and you're like, Oh, a green bubble came up. I've seen like, like this among my friends. Like if you text me and if I see a green bubble, I'm going to give you a side eye, you know? (laughs) Like, you know, so it's like, okay, so it's even down to that. Right? So beyond just developers, like down to users, like, Oh, you have a green bubble, you know, there's this, there's this thing, which is stupid and it's hilarious. But it is the thing, like, you know, if you want the top phone, you want top performance, you want the top camera, even if, you know, some of the features are very much copy-paste from Android. If you see some of the features, sometimes when Apple is doing the rollouts and they're like, Android, this... this phone, the Galaxy has always had this feature, you know, now Apple is rolling it out like it's so new and it's so innovative. But because it's on the iPhone now it is a new trend. Right? Just because it, it has set that status quo in the culture. There's this thing where like, centralization is just a big thing, when it comes to iOS development. Jeremy: [00:15:14] Earlier you were talking about how you might build your application with Swift, or you might also have Objective C could you kind of go into why there's these two languages and when you would use each? Timirah: [00:15:29] Yeah. So, Back back back in the day, Objective C was created over 30 years ago. and that was the sole language for these past like three decades, for development in the Apple environment when it comes to, building applications. there was just one language at this, this whole time. Before 2014 and that's when Swift was born and it kinda hit everybody by storm because it was just like, okay, there was just no inkling. Like we had no inkling that, that Apple would even do something like that or produce something like that, like a brand new language. so the Apple developer community was like, Oh, what what's happening here? They were just like shaken up. the differences in the languages are very, very drastic. Objective C is known for being a very ugly language. (laughs) Very verbose, very, I mean, double down on like boiler plating, there's a lot of redundancy, right. And the benefit in a language like that is, you know, yes, it's ugly and hairy for a beginner. I had to make lemonade in my mind in order to learn it. You know, all of this, like, you know, redundancy is kind of teaching me syntax and teaching me logic, right. If I have to keep making a reference to this, if I have to keep doing this, if I have to keep doing that, if I have to add the boiler plate, kind of code in, it's embedding this logic into me, you know, as a beginner, like, Oh, this, you know what this actually means every single time, once you're an expert in that, like it's just annoying. Okay. so people were tired of it. it was an ugly language, but it was the only language at the time. So 2014 Swift was a breath of fresh air, very simplified, straightforward, very clean, very legible readable language. when you compare it to objective C, so something you might do in objective C with like 30 lines of code, you might do with a Swift with like eight or nine lines of code. so it was like, it was definitely like, Whoa, this is huge, huge thing. So, and then Swift. Has gone through these six years, going through this huge rapid, climb just in popularity and in evolution. So it went from like, now there's another option for, you know, building, iOS applications. Right. you can use this new language then like the very next year, like Swift becomes open source now, like, Oh my goodness, what does this mean? Like, Oh, we talked about like Apple being very much like closed off and very like strict on their programming language in their platforms. And, you know, everything is being built by Apple. So like this new layer of transparency saying, Hey, we open source this language. you know, so you can see like everything and not only just see everything, but also like contribute and like make it what you want it to be. So open it up to like, the community to build up, and create the language of their liking. And it was a real, a huge deal. And it made a huge statement in the Apple developer community, and it also attracted other people to the language cause they were curious. Right. and once it became open source, like the, the, the flood gates kind of open it transitioned Swift as a language transitioned into this, like, okay, now we see all these benefits of the Swift compiler and, you know, runtime. can we use this in other areas, like beyond iOS development and now next thing you know, you're seeing Swift on the server, you know, server side Swift frameworks, you're seeing, you know, people building web applications with Swift, and it, it just kinda grew a pair of wings and it's all in, it's been since its birth. It's been like this top 10 language to learn. if you've seen like basically any tech publication or blogs, like you say, top 10 languages to learn for 20, 20, or 2019 or whatever. And Swift has always been on the list. And a lot of people are like, okay, I understand why this language or that language, but why Swift, why are they suggesting Swift? Are they, if I'm being recommended Swift as a top 10 language to learn, am I being encouraged to learn iOS development? Is that where this is going instead of acknowledging Swift the programming language right in itself and the benefits of the language and what you can do. So, yeah, it kind of offered up this opportunity for Swift to, like I said, grow a pair of wings, and now you can, you can actually be a full stack developer using purely Swift. You know, you can build your native front end with Swift and you can build a backend with Swift, build a whole like rest API. So. Yeah, like there's this whole whole situation. So now you have objective C and you have Swift, you know, now objective C. there's this talk that I like to give it's called, Swift for Objective C OGs. and basically it's encouraging people who are still in that objective C bubble to take the jump, learning Swift because every year, you know, during WWDC, you don't see Apple coming out and saying, Hey, here's some new updates with objective C here's a new framework for objective C like no, there's no, new, there's there, there are no new updates when it comes to objective C and I alluded to the fact that that may be due to Objective C becoming even more of a legacy language and maybe in the, you know, far in the, I don't know how near or far that future, that, that, that destiny is, but, you know, maybe one day become deprecated. Objective C is something that a lot of enterprise companies still use just because, you know, if you, if you're Google or someone your legacy code is built in objective C. Now there's this new language. Right? So now you know, you see all the benefits of Swift and the compiler and you see all of this. So now there's this thing where now like, okay, you either hire developers who know Swift to help make the transition, or you train your developers that are working on the iOS products to learn Swift. So they can help make the transition, but that, that process is, you know, it takes two years, to properly make that migration. Even if you have to mix the two, right. It, it takes a while. So, yeah. So objective C is more of the, I don't know if you'd call it grandfather, but definitely the father. And then you, you know, Swift is the, the new baby that is like taking over the world right now. Jeremy: [00:22:38] So if you're making a new iOS application you could do it entirely in swift you don't need to even look at Objective c Timirah: [00:22:45] Absolutely. Yeah, you can make it just Swift. If you open XCode and then it'll ask you, if you, you know, if you create a new project that asks you, you know, what language are you doing this in, are you doing Swift or objective C? You can do it either or, and also you can use, if you want it to, let's say You're building application in Swift, but there's this like, third-party like library, that's doing some cool. I don't know, animation or something, but it hasn't been converted to Swift. And it's a, it's in objective C you can use a bridging header to basically, pull that library in and, take advantage of that functionality from that third party library and continue to, you know, build their application in Swift. So yeah, there's a way you can, you can use both. You can use Swift. They are definitely their individual languages. but yeah, absolutely. And Swift has like this huge, interoperability, factor. And I talked about this, like in my, O'Reilly course as well, like. It's a huge deal. That, that, that component, is a huge deal because like you have other developers, not only just people coming from the Apple community who are using objective C, but you have developers who are coming, you know, now trying to learn Swift, but they're coming from a Python background or, you know, they're using it for different reasons. And that, that component, that element of Swift, it helps to bridge that gap. Right. so if you're learning Swift and again, you're still trying to use some Python, like libraries or integrate some like Python modules. You can, utilize that with Swift. yeah. So I think it's like supported with a with python, like C++ or C. And that, that is a huge benefit when it comes to Swift, it's just like, okay. And that that's a big help. when someone not only is building from mobile, but like coming from, like I said, other backgrounds and trying to build something with Swift that may or may not have something to do with iOS development per se. Jeremy: [00:24:51] I wanna go a little bit into Swift the language itself. If, if you're someone who's familiar with, like you said, Python or JavaScript or Java, how does Swift compare what what's similar and what's different? Timirah: [00:25:07] At first glance, there's so many similarities in syntax alone. when it comes to JavaScript, which is really friendly for like people who are coming from JavaScript, or languages that JavaScript maybe similar to as well. Right. even down to like the var keywords, and you know, one thing about Swift though, is it's, it's very statically typed. and there's this element of like type safety, Swift kind of prides itself on being a very safe language and that contributes to its performance. Right. It's very much, it's the speed is probably what makes it so popular. amongst developers, it's just like, Oh wow. This compiler is like, okay. Yeah (laughs) and that's due to it being such a. safe language, right? So like the type safety and generics kind of, it makes it, that safe language and it prevents like the user from, basically passing in like incorrect types. And, you know, it's just very, like, it will yell at you. It will definitely (laughs) yell at you, because it just prides itself on, preserving memory and, just being very, very safe. There is one thing that I always talk about. Which took me by surprise. I was saying I was shook af I was shook af, when I, the main thing was, this thing called optionals. And I don't know if you've heard about the Swift optionals, where they're probably the most annoying thing you will come across when you're dealing with Swift. and basically you identify optionals with these, question marks and exclamation points, and I'm like, Oh my goodness, more punctuation. What does this mean? I've never put a question mark in a line of code in my life. Okay. Exclamation points, you know, and we have a different reference for like exclamation points. Right. So anytime I was coding before Swift, you know, anytime I use an exclamation point, it was to say, like, not something, right. So not equal something. but no, it didn't mean that at all. Basically these optionals are attributed to type safety, and talking about protecting values, and checking for them before, before everything kind of compiles. so that means, you know, if you have the value that may or may not be nil, Swift offers an opportunity for you to wrap that value, and say, Hey, This may or may not be needed. We're going to check for that. But whether it's nil or not, this function is not going to break and the app is not going to crash right. So we're not even taking up that space. And then it you the option to unwrap that value and say, Hey, I think it's safe. This value is pretty much not going to be nil. and yeah, but it forces you to declare that and it forces you to, constantly be thinking about the memory and thinking about type safety at all times, you know? We talked about objective C giving you that mentality of redundancy or of what was important right. and what's important for Swift is performance and it's, like, you know what, I'm not even (hits the mic) going to give you the opportunity to, (laughs) slow me down or make your app. You know, make this app application crash, based off of, you know, some value that, you maybe forgot about, or it just became nil, because you forgot to do X, Y, and Z. I'm just going to wrap this thing, for you and I'll let you know what comes of it. Yeah, so th that's a huge thing that, like, It's not something that's unconventional when it comes to other programming languages. but the approach and the syntax of it is a little bit different and it might throw some people coming from other languages off. So I think that's a huge difference, you know, when you see the syntax, but yeah, this is familiar whaaat a question mark? that's that's one of the one of the big differences, as well. Jeremy: [00:29:18] So to see if I understand correctly, this, this optional type, the real problem it's trying to solve is when you work with languages like JavaScript, and I'm sure this is the case with Python as well. There can be cases where you have a function and it takes in parameters. But one of those parameters could be nil or null. And when you try to use that function, you'll get an error at runtime where it'll say something like undefined is not a function or just no reference exception or something like that. It'll just blow up. Yeah. And it sounds like what the optional type is attempting to do is to have you tell the compiler through your code, that this value, it could have the value in it, or it could be nil. And, the compiler then knows if you've handled that case of the fact that it could be null, or, or it could have something in it. Did I kind of get that right? Timirah: [00:30:18] Absolutely. Absolutely. You know, and that's inside or outside of a function. Like it, even if it's not within a function, You know, you just have this, this value that you have when the view first loads, you know, you want this to like, okay, no, I need to know, like, I, I need to protect this and I need to know, what is this going to, What does this output? And I need, and I like, I need to know it it's going to check for it. making sure that the app does not, crash based off of, you know, this, this one value. One of the things, one of the examples I give is like, like an Instagram situation, right. Where, if you are on Instagram, then nine times out of 10, you have an Instagram account. but you might have zero followers. so the number of followers, you know, you might have a situation where it might be zero or it might be nil. And, but that doesn't mean that you're, you know, Instagram should not crash because you don't have any followers or you don't have any, pictures. Okay. that's, you know, IE the bots, you know, the bots live because (laughs) that that situation can exist. You don't have to have any pictures on Instagram. You don't have to have any followers. Right. So it can be zero. It can be nil. Instagram should not crash because of that. The page, your profile should also, load as well. So those are things that should be protected. Those values, should always be protected. It may, they may or may not be there, but that page does exist. So yeah, that's the example that I always like to give as well. Jeremy: [00:31:55] When someone is, learning Swift and figuring out how to build iOS applications and you go onto Google and you start searching for tutorials and things like that. Are there certain features or certain ways of building applications that they're going to find that they should avoid? Because there's, there's newer ways of doing it now. Timirah: [00:32:20] That is a great question, huh... Things that are old, maybe outdated best practices when it comes to Swift... Hmm... one thing I will say about Swift is Swift is constantly, like, I probably said this already, but constantly evolving, and a lot. Yeah. And that is due to Swift being open source, the community, of maintainers and contributors around Swift. They're very passionate. They're very progressive and aggressive when it comes to. and when I say aggressive in the, in the dearest way, And so every so often, there's like every couple of months you see we're on Swift. So Swift is six years old. we're on Swift 5.3. Okay. so let's say that could be what a version a year? but then like you have versions in between that, right? So like, Oh, this is 1.2. This is 2.5. we're constantly being, upgraded and updated. So let's say if you're looking into Swift development, you're like, okay, I want to do iOS development. Okay. Here's the Swift tutorial. If it's from 2015 and this is like Swift 3, it's no good. It's no good at all. And you know, Apple does a great job of keeping up, right? So, let's say, okay. XCode that the latest version out in the right now of XCode, is XCode 12, XCode 12. you have to use. I believe no later, I don't think it's compatible with a version later than Swift 4. but certainly nothing under that. And then even like the iOS version, like, okay. iOS, 13 and up, you know, so, and we're on iOS 14. So everything is constantly like bumping you up in 4. If you come across like some books, literally an entire book of Swift. In 2014? No good. 2015, no good. 2016. No good. You want to look for resources that are the most updated, as you can possibly find, no later than 2019. So if you come across a tutorial, a course, a book. Whatever, look at that date because that's probably the best, the best thing you know, when it comes to being current, making sure that you're using a version of Swift that's actually compatible with this version of XCode. and if not XCode will scream at you, (laughs) it will not compile. And yeah, you want to stay, stay current, stay forward. and stay progressive when it comes to Swift, because in six years Swift has come like it's grown so fast risen, so fast, and it's grown so much in terms of like the community and, what the language is. So if you want to learn more. I would say the best resources would be, a good friend of mine, Paul Hudson. He does a great, I mean, he's, he's excellent. When it comes to, updating, Swift content, keeping updated Swift content. He's written so many books on Swift and he's, he's like a machine when it comes to keeping things updated, you know, he doesn't let any really any of his content go, outdated. so he continues to update his books, update his courses, update his, website. He has a website called Hacking with Swift that's the brand. so you can go to hackingwithswift.com and you can check out all of this like cool stuff, for iOS development, whatever you want to accomplish with iOS development. And learning Swift in general, like hacking with swift is like Swift world. Okay. And you want to stay in tune with swift.org as well, because that's where you're going to get all the updates from the community, on Swift the language and the progress itself. , and what's new. I believe the latest thing I've seen on there is the update that, Swift is now available on windows. So you can, install Swift on your Windows machine and, you know, basically build applications with Swift. So, and again, that's attributed to the community behind that. So you can go to swift.org to learn more about that. Learn about Swift progress, learn how to get started. Another resource is raywenderlich.com Ray Wenderlich was such a powerful resource to me coming up in my early days, learning iOS development with this very universally like relatable, fun content, blog posts, books, and then they just moved into video content as well. Great stuff it's always updated, always updated, you know, whenever there's a new Swift framework or a Swift, whatever, iOS, tooling, whatever, like it's on raywenderlich.com. And of course me, you know (laughs) huge, huge advocate. I've been working, recently with, O'Reilly, to really provide more Swift presence and content on the platform on their online learning platform. I've had a few, live trainings on their platform and they're really fun. if you want to spend a couple of hours, you know, with a live hands-on training on getting started with Swift, and getting your hands dirty with iOS development in itself, you know, finding out all the nooks and crannies of XCode and you know, all the cool things that you can do beyond iOS development with Swift. I, I do a training with them every so often, and we're working on doing more things with them, hopefully, doing more things with Swift UI, you know, yeah. So there's a lot of, there's more content out there than there, there has been ever before, But you want to make sure that you check that date on those, on those things so that you're learning the right things and, you're, you're on the right path. Jeremy: [00:38:41] That seems really, different for Swift than other languages where you could bring up an article about Ruby that's four or five years old. And a lot of it is still going to apply, but it sounds like Swift it's like every year it's like, Oh, this isn't, we don't do it this way anymore. Timirah: [00:38:58] Yeah. Yeah. Absolutely. Because Swift is still like, you know, finding its groove, right. You know, Swift, despite, you know, the, the stability and all this stuff, I'll, you know, I'm hyping it up, but Swift is still young. Like, you know, six years old is not a long time. And for it to be as progressive as it is, it's weird in itself. I, you know, I would have expected it to be, I don't know, anywhere between 6 to 10 years in, for iOS development before it moved on to like, Oh, now we're doing this and we're on Windows and use Swift with Linux, like, Yeah. Like before it spread its wings like that, but it just kinda took off. And the popularity was just like a wildfire. So, yeah, it's a really young language. So it's still finding its groove and that is one of the cons too, just like people like, Oh, it's constantly changing it's exhausting to keep up, but it keeps you on your toes. Like any language, like you should always be, you know, on your toes, but there are a lot more like, You know, more, the widely used, traditionally accepted languages, Javascript you only have to worry about new frameworks, you know, not so much changes to the language itself (laughs). So yeah, that is like a downside. Some people are like, (sigh) You know, Oh, we don't do it like this anymore. but yeah, like XCode and in the Swift compiler yourself, it does a great job of reminding you, Hey, it doesn't just tell you, like, this is wrong. The, the warnings let you know, what the alternative to whatever that is that you're, you're doing. So say, Hey, that, that was done in, in Swift 3, you, you probably meant X, Y, and Z. So it's, it's very good at inference when it comes to that. Jeremy: [00:40:45] So I guess that means if you have, a application you've been building over the past few years, as Swift is making these changes, XCode could actually tell you, here's all the things that you need to update for the new version of Swift. And I'm not sure is it where it'll make the changes for you or it just kind of tells you, and then you, you still go ahead? Timirah: [00:41:05] No, it tells it, it lets you know, it was like, no, excuse me. (laughs) No, _you_ do it. So those are things that it'll say, Hey, it'll let you know. It'll let you know before, before you run it. say, Hey, like, no, like you need to change this. if you have a, an application you built, like with Swift 3 and you, you open it in the new version of XCode and XCode will say, Oh, okay. All right, it'll give you, like, it'll say, not necessarily, errors that it won't run, but it, some things will be buggy. Right. So it'll allow you to run it because it is valid code, right? But it will say like this is buggy, this is buggy. This, it has been deprecated so that that's no longer there. So, and a lot of things that are deprecated is there because now, a lot of those things are built in, so it's not, it doesn't have to be, programmed to say: Hey, certain functions are now like built in to the compiler itself. So it's like, you know, you don't have to worry about this. You don't have to worry about that. Like, That's automatic now, when it comes to, you know, Swift 5, so you don't have to worry about these things. so yeah, they, it, it does a great job of letting you know that. So it'll run, it'll probably be, you probably have some missing loops here and there. But it will definitely give you those, yellow warnings and suggestions like, Hey, you should change this, this, this, this, this, this, and I do believe XCode has an option. Where you can, convert your code to it. It'll say it might pick it up, pick the whole thing up and just say, Hey, they, give you an alert and say, Hey, this is Swift 3. We noticed. And would you like to convert this to Swift 5? And, there's an aspect where it can convert. Like I've seen. up to like 90, at least 90% of your code, can, can be converted over. so yeah, like sometimes it'll pop up, it'll just pop up and I'll ask you, like, should you convert this? Okay. If you should do like, press this button, if you want, if you want, want this to be converted to the latest version of Swift, then you have to go back and kind of fix some things right. Yeah. So, you know, like I said, XCode does a great job of, being that supervisor and then like, the compiler itself is like the principal, like, you know, so, so yeah. Yeah. Jeremy: [00:43:29] Yeah, I think Swift is, is really an, an, a unique place where you were talking about how you think it's finding its groove and it's getting to make all these changes hopefully for the better. Right. And I, I feel like. Usually when a language is, is doing that. It's when not a lot of people are using it yet. So you're making all these changes, but it's okay because, there aren't people with apps that hundreds of millions of people are using. but Swift is in this unique space where, they're making all these changes, but because that is the language for iOS. You have, you know, millions of developers that are, you know, hitting the language hard and shipping apps. And, so it's just a really unique position I think, for Swift to be in. Timirah: [00:44:16] Yes, a thousand percent, a thousand percent. Jeremy: [00:44:20] Cool. So I think that's a good place to start wrapping up. Is there, anything else you wanted to mention or want to talk about where people can check out what you're up to? Timirah: [00:44:30] I, first of all, I'm so glad that we, finally got this podcast in, for those of you listening, we've been trying to do this for like three months, trying to get this scheduled for three months. My life has been super busy, but I wanted to make sure that we, you know, got this chance to really talk and, and chat. You know, Jeremy you've been amazing I love the podcast. so it was so important to me that, I came on here and, you know, I had this awesome conversation with you. about me, let's see. you can find me on Twitter at @TamirahJ and like, Oh, what's, what's my Twitter. Oh, my name, TamirahJ you can find me on Twitter. Follow me on Twitter. mention me, I'll say, hi, I'll follow back. I do all of that. If you have any questions about, you know, getting started in iOS development, if you're starting your journey, if you want to learn more about, getting a job in iOS development, what the interview process is like. If you wanna be involved in some of the things that I'm doing in terms of teaching iOS development or teaching with Swift, please feel free to reach out, I do have a Coursera course that is coming out very, very soon. And yeah, I did this thing, back when I first announced there where I was just like, Oh, can you guess what language I'm going to do the course on because it's not Swift. But it's been a while. Coursera has, they've changed the, the release date a couple of times. So I said, you know what, I'm just going to tell everyone, you know, it is actually on flutter. Okay. and Flutter, is a a cross platform framework, that utilizes Dart, which is a language that is, that was created by Google. that is something that I've gotten into within the past, like year and a half Flutter is like super fun. You know, and it took me away from like my native, basically my native patterns and, got into cross-platform. And it's really fun. It's really great for, people who are, beginners and beginning in the cross-platform realm. And when we talk about how performance like fast Swift is, of course there's some loss when it comes to cross-platform, in terms of, performance and in terms of, tooling, native tooling. But flutter is probably the closest that I've seen to getting that full on experience, getting that reactive experience and getting a lot of the freedom when it comes to, even down to like animations or things like that. Just a lot of the cool things that you would be able to take advantage of with, native mobile development. Flutter, really makes up for a lot of that with, cross-platform development in the cross-platform, space. So flutter is definitely the best bet. I'm going to be doing a Coursera course or a series of coursera courses on Flutter. So look out for that. Shout out to Technigal LA, which is my meetup turned nonprofit for women of all ages who want to thrive in STEM, anything from like educational opportunities, employment opportunities, networking opportunities, we're doing it all virtually, but, it's all free. So if you want to learn more about that, or if you want to get involved, do you want to come, you know talk to some women, give, you know, teach a workshop or if you want to know where some other talented women are in STEM, you know, feel free to reach out. We are Technigal, so that's technical, but with a G. So, T E C H N I G A L and then LA, because I am based in Los Angeles. So, yeah, so you can check us out on, Instagram and or Twitter as well. or you can check us out on, meetup. So yeah. Thank you so much. Jeremy, like this is, this is a lot of fun and I'm glad that we finally, finally, got a chance to, to get it in. Yeah, absolutely. Jeremy: [00:48:26] Yeah, well, that means things are, things are happening. That's good. Timirah: [00:48:29] Yeah. Yeah. It's a good thing to be busy. Yeah. Jeremy: [00:48:35] I hope you enjoyed the conversation with Tamirah. If you want to learn more about iOS development, we've got notes to all the resources we talked about in the show notes. The music in this episode was by Crystal Cola. Thanks again for listening.
undefined
Nov 18, 2020 • 1h 2min

Object Relational Mappers with Julie Lerman

Related Links:@julielermanThe Data Farm (Julie's blog)Entity FrameworkDapperAutomapperPluralsight CoursesThis episode was originally posted on Software Engineering Radio.TranscriptYou can help edit this transcript on GitHub.Jeremy: [00:00:00] Today I'm talking to Julie Lerman. She's a frequent conference speaker, an independent consultant and a Pluralsight course author. Today we're going to talk about her experience working on ORMs and Entity Framework. Julie, welcome to software engineering radio.Julie: [00:00:16] Hey, Jeremy. Thanks so much for having me.Jeremy: [00:00:18] For those who aren't familiar with ORMs what are they and why do we need them?Julie: [00:00:23] The first thing is the definition so ORM stands for object relational mapper. It's a software idea and implementation in many aspects for taking the concepts that we describe as classes and transforming them into the structure of a relational database. So you can get the data in and out easily.So the ORM takes care of that mapping and the transformation. So you don't have to worry about writing SQL. You don't have to worry about getting structured results from a database that are in rows and columns and then transforming them back into object instances. So the ORMs do that for you.Jeremy: [00:01:15] Some of the most popular languages used are object oriented. Why do we continue to store our data in a relational database when it's so different from our application code?Julie: [00:01:26] There's certainly a lot of reasons for relational databases compared to document databases, object databases. Relational databases for their structure and normalization and tuning, and it's different kinds of storage needs I think than when you're storing documents. Sometimes I think more about getting the data back out. Finding data. It's one thing to be able to store it in a way that makes sense. But then when you need to pull data together and create relationships that aren't naturally in those documents unless you have prepared and designed something like a document database for every possible scenario.There's also the whole concept of event sourcing where if you're just storing the events then you can actually re-persist that data in different structures so that it's easier for the various ways you might access it. Also, relational databases have been around for a long time and it's interesting that even with the popularity and extreme power of these other kinds of nonrelational databases, the NoSQL databases, relational databases are still, I can't say the source off the top of my head but I've used them in conference slides. They update their data monthly, but I can see these graphs where it shows that it was still about 70% of systems out there are using relational databases and in a lot of cases, so much legacy data, too, right. That's all going to be in relational databases.Jeremy: [00:03:18] Relational databases, they give us the ability to request just the data we need that matches a certain condition or we can combine different parts of our data and get just a specific thing that we need.If we were dealing with a bunch of arrays of objects or documents it might be harder to query for a specific thing does that sort of match...?Julie: [00:03:45] Yeah, especially that unplanned data access and. Another interesting thing. And it's one of those things that I think you have to like, stop and think about it, to realize the truth in it, which is that most systems, most of the work you're doing the database is reading the database much more than creating and storing data.That's the bigger problem. Getting the data back out and having the flexibility to do weird things whereas something like a document database depending on how you split it up and keys and everything, it's just more designed. A relational database is a little easier to get at that weird stuff.Jeremy: [00:04:37] You mentioned ad hoc queries. So with a relational database you can store all your information and then you can decide later what queries or information you want to get out, whereas you had hinted at with a document database you have to think upfront about what queries you want to make. Could you elaborate a little bit on that?Julie: [00:05:02] I don't want to be generalizing too much, because there's all kinds of other factors that might come in especially when talking about huge amounts of data for like machine learning and things like that. There's pros and cons to both of them.And like I mentioned with event sourcing ways to have your cake and eat it too. When I look for example at documentation for Azure CosmosDB, that's the document database that I have the most familiarity with. One of the most important documents is that upfront design and that design bleeds into how you're modeling your data in your application so that you can store it fluidly into the document database. So it's really, really important. Now I don't want to conflate this with the beauty of a system like document database, how that aligns so much more nicely for persistence.And most of your getting data in and getting data out when we're talking about domain driven design. Where you don't want to worry about your persistence, just focus on the domain and then we can figure out how to get the data in and out right? We've got patterns for that.But it is interesting if something like a document database more naturally aligns with your objects right? So that flows nicely in. However, it's still really important to be considerate of how you're breaking up that data in the way you store it. What your graphs are going to be. Maybe it's easy to get it in this way. But then what about when you're getting it out? What about different perspectives that you have? I want to be really careful about, comparing and contrasting the document DB versus relational DB. I want to stop at some level of that because there's always going to be debate and argument about that.However one of the nice things about an ORM is it does take away some of that pain. That we normally would have when we're writing applications and writing that data access layer where we have to say, okay, I want this. Now I have to figure out the SQL statement, or call a stored procedure, or however I'm going to do it.Whether I'm getting data in or getting data out and then I have to transform my objects into the SQL right? Or I get data back. And then I have to read through that data and instantiate objects and stuff, all the values into those objects. That's the stuff that the ORM can...Repetitive. Like Bull (bleep). Right. Writing all that stuff. So that's what the ORMs do. I think the next big question that comes up then is yeah, but my SQL is always going to be better than some generic thing that an ORM is going to build for me. Right? The general advice for that is for many cases what the ORM doesthe performance will be just perfectly fine. And then there's going to be the cases where the performance isn't good. So then you can use your own stored procedures, use your views.But a lot of the ORMs, certainly the one I focus on, which is Microsoft's Entity Framework, has the ability to also interact with stored procedures and views and do mappings to views and things like that. So people don't realize that, right. But maybe 80% of the SQL that entity framework is going to write whether it's for queries or for pushing data into the database. Maybe 80% of that is perfectly fine. And then 20% of that, do performance testing, go yeah. I can do better. I can do better and fix those up.Jeremy: [00:09:27] I want to step back a little and walk through some of the features of ORMs that you had mentioned. Uh, one of them was not necessarily having to write SQL so you have some alternative query language that the ORM is providing. So what are some of the benefits of, that?Like why would somebody not just jump straight to SQL and would instead use this SQL-like query language?Julie: [00:09:57] well, Again, I'd have to kind of focus on .NET and I know you do a little bit of .NET programming so you probably are familiar with LINQ the query language that's built into .NET, into C-sharp and the .NET languages. It's a much, simpler expression than SQL, and it's much more code like and if you were writing a SQL string to put in to your application, we want to use stored procedures and views, but let's say you're just writing raw SQL, right? You're writing a string. And you're crossing your fingers that you got it right, right? That there's no typos, but also that you've written it correctly, but if you're using something like LINQ, which is part of the language, then you've got all the advantages of the language syntax and IntelliSense and here's my object what are my methods that are available all popping up. You don't have to worry about magic strings. I think that's really key and you don't have to know SQL concepts really. So with entity framework at the very beginning the team actually came out of Microsoft research. The team that had created that started with a SQL like language to do it. But then LINQ was also created at Microsoft by the C sharp team. And then, LINQ was originally created for objects. So you can use that for writing queries just across your objects. I wouldn't say LINQ was evolved, but a LINQ version that applied to entity framework was created.And there was also LINQ to SQL which was a much lighter weight ORM that was also created within Microsoft, so you're using almost the same LINQ. A lot of things are shared some are more specific to entity framework. There's another benefit. I use LINQ for all kinds of things in my code, not just for the database interaction with Entity Framework, I use LINQ just when I'm working with arrays. Working with data in objects in memory. So I've also got this common syntax. I already know how to use it. There's just some special features of it that are specific to using entity framework.So there's all kinds of benefits and the best benefit was it enabled me to essentially forget SQL. I forget how to write T-SQL. If I have to write raw T-SQL, I have to Google it after all these years of coding but I don't mind.Jeremy: [00:12:39] With LINQ this was a query language that was already built into C# and also VB.NET. All of the languages that run on the .NET runtime, and so the developer may already be familiar with this query language. And they can apply it to making queries to a database rather than having to learn SQL which would be an entirely new language to the developer.Julie: [00:13:19] Yes. And not just that, but I don't have to learn T-SQL (Microsoft SQL Server) and O-SQL (Oracle) and P-SQL (Postgres). There's all kinds of different database providers for entity framework. And the same for other ORMs. I feel a little bad just focusing on EF, but that's my area of expertise.But the providers are the ones that do that transformation. So entity framework iterates over the query a few times and creates a query tree. And then the query tree is passed on to the provider, which then does the rest of the transformation into its particular SQL. So, I actually go back and forth between SQL server, SQLite and PostgreSQL. SQL server's what I'm most familiar with but I can use SQLite and Postgres and there's a lot of generic, common stuff right? So as a matter of fact the way LINQ for entity framework is built. That's just one thing and then the providers figure it out. So if you need something that's really specific to SQL server that doesn't work in Postgres or something in Postgres that doesn't work in SQLite there might be an extension that lets you tap into that particular thing. Or call a stored procedure. Except you wouldn't do that in SQLite because they don't have stored procedures. Interestingly tapping back to talking about document databases. There is a provider for Azure CosmosDB. So we've got an object, relational mapper. That maps to a non-relational database. So the reason that exists is because there was a lot of people that said I really am familiar with entity framework. I've been using it for years. We're going to be using CosmosDB. Can I just use that? So that exists.And Azure has a SQL I'll just say SQL language for accessing but there's other APIs that are used also it's not all SQL, but because it has the SQL construct the providers kind of already know how to build SQL.So it's just, we'll build the SQL that Azure works. So it's not solving the problem that we have with relational databases, which is that we've got this rows and columns structure and we need to do this translation all the time. The problem it was solving that provider is solving by linking entity framework and the document database is I already really know how to use entity framework.And here I am just, I'm just doing some queries and persisting some data. Why do I have to learn a whole new API?Jeremy: [00:16:27] You're saying that even with a document store, you might be able to use the same query language that's built into your ORM. With a relational database you have joins and things like that allow you to connect all your tables together and do all these different types of queries.Whereas with a document database, I usually associate those with not having joins. Is that also the case with this document database that you're referring to with, with CosmosDB?Julie: [00:17:02] I've written only tiny little minimal SQL for accessing it used to be called document DB, cosmos DB directly. I can't off the top of my head remember how that SQL works that out. If you've got to pull multiple documents together. I don't know, but what's nice is that in your code you don't have to worry about that stuff. Oh God, inner joins and outer joins and all like, Oh my God, I can't, I can't write that stuff anyway. So LINQ makes sense to me. I know it really well. So I don't have to worry about that.Jeremy: [00:17:43] It sounds like at a high level, you get this query language that's built in to the language. Is more similar to the other code you would write. And you also get these compile time guarantees, auto-completion, what things you're allowed to call versus SQL, where, like you were saying, it's you have this giant string you hope that you got right. And don't really know until you query. So there's a lot of benefits with an ORM. You were also talking about how when you don't have an ORM you have to write your own mapping between after you've done your SQL query, you get some data that's in the form of just a set of records. And you have to convert those into your objects that are native to your application. And so basically the ORM is taking care of a lot of the you could say repetitive or busy work that you would normally have to do.Julie: [00:18:47] To be fair, there are mappers. There are programs that are mappers or APIs that are mappers like auto mapper for .NET. So the connection string, like I don't have to worry about creating connections and all that kind of stuff, either with the ORM, but something like auto mapper, you can have a data model that really matches your database schema, and you can have a domain model that matches the business problem you're trying to solve. And then you can use something like auto mapper to say, okay, like here's what customer looks like in my application. Here's what customer looks like in this other data model. That's going to go to my data, but that map matches my database really easily.And auto mapper let's you say this goes with this, this goes with that blah, blah, blah, blah, blah. So yeah. that stuff does exist. And, and there are also other lighter much... entity framework is a big thing. It does a lot. There are certainly lighter weight ORMs that they don't do as much, of that take away as much of that work.However, if you need to squeak every little millisecond out of your performance, there's an open source ORM called dapper and it was built by the team that writes stack overflow because they definitely needed to squeak every millisecond of performance out of their data access. There was no question about that. So with dapper, it takes care of some of the stuff, some of that creating the connections and materializing objects for you, but you still write your own SQL because they needed to tightly control the SQL to make sure. Because it's more direct it doesn't have lots of things happening in between that entity framework does. And there's lots of dials in entity framework where you can impact how much work it's doing and how much resources it's using.Jeremy: [00:21:09] So my understanding with, whether you would use entity framework versus one of these smaller ORMs, or more basic mapper is, it sounded like performance. And so it sounds like maybe you would default to using something like entity framework and then only dropping down to something like dapper, if you were having performance issues. Is, is that where your perspective comes from?Julie: [00:21:41] Not sure if I would wait until I'm having performance issues. I think people sometimes will make that decision in advance, but an interesting plan of attack so that you can benefit from the simplicity of using entity framework. So you don't have to write the SQL, et cetera, is for example, using a CQRS pattern.So you're separating the commands and the queries separating pushing data into the database and the queries. Cause it's usually the queries where you need to really, really focus on the performance and worry about the performance. Some people will use CQRS and let entity framework take care of all the writing and let dapper take care of all the readings.So they're designing their application. So things are not, not the business logic, but the layer of the application that's taking care of the persistence. So designing it so that those are completely separate. So, Oh, I'm doing writing. So we're spinning up dapper and we're going to, or actually the other way around, I'm writing. So I'm going to spin up an entity framework context and say, go ahead and save changes or I'm doing a query and I'm writing my own SQL because I trust it. So now I'm going to spin up dapper and execute that SQL, let it create the create my objects for me.Jeremy: [00:23:10] It sounds like it might be better to decide that up front figuring out, what is the amount of load I'm going to get in my application. What are the types of queries I'm going to be using for writing and reading data? And then from there, figuring out, should I use dapper? Should I use entity framework?Julie: [00:23:35] Yeah, and there's some amount of testing and exploration, you would need to do proof of concept stuff. I think you might want to do first because I think entity framework query performance surprises people a lot. Like it's a lot better than people presume it's going to be. So yeah, I would definitely do some proof of concept before going that route cause it's certainly a lot easier to just use one thing all the way through.However, if you've got things, applying separation of concerns and broken up, then it's not as traumatic to say for this bit... It is running a little slowly and, let's just use dapper. But if it's separated out, then it's easier to just say, yeah, let's just change over to that bit in our infrastructure.Jeremy: [00:24:30] Are there workloads in a general sense that you would say that entity framework performs worse than others? For example, An app with a heavy write workload. Would that be an issue in entity framework? Are there examples of specific types of applications?Julie: [00:24:52] I don't know if it's workload, compared more to structure you know schema. Um, or what it is you're trying to get at. you know, sometimes it's, how your model is designed, right? So there are just some places that any frameworks not going to, just not going to be able to do something as well as your own SQL. I had an example, this was years ago. So entity framework was not in its infancy, but it was like EF4 and I had a client who had gone down kind of a not great path of implementing entity framework in their software that's used across a big enterprise. They had created one big, huge model for everybody to use.So that one model was responsible for no matter what you were doing, you had to use that model. Just pulling that model into memory was a lot of work for entity framework, but it created a lot of problems for the teams who were stuck using that. Because of the way they had set that up, they had this common query that would only return one piece, one row. They were looking for one row, but because of the way they had designed this model entity framework had to go through all kinds of hoops to create the SQL, to get at this one piece of data to, to write this query. And it was taking, it was in Oracle taking two minutes to execute. And they're like, what do we do? And I looked at it and I said, Just write a stored procedure. It's too late to change this model because everything's dependent on it.So just write a stored procedure and the stored procedure took nine milliseconds. So like, okay we can pay you for the whole three days. You can go home now if you want, that was worth it.Jeremy: [00:26:46] And so it sounds like that's not so much the choice of not using entity framework as it is-- you're still using entity framework, but you're using it to make a request to a stored procedure in the database. Similarly to how you could probably send raw SQL from entity framework.Julie: [00:27:05] Yes. So you can still take advantage of that. So we're still using entity framework and saying, Hey, I don't want you to write the SQL, just execute this for me. And so that's still taking advantage of the fact that it knows it's SQL Server. It knows what the connection string is, et cetera.It's not writing the query now there's we can take advantage of views in the same way and can actually map our objects to views and have it so we're still executing the view. And it's gonna still materialize the data for us, which is really nice.Jeremy: [00:27:44] Yeah. So when you were talking about deciding whether or not to use entity framework, when you use entity framework, you could still be using a mix of the LINQ query language. Maybe raw SQL commands. Like you said, views, stored procedures. It's not this binary choice where you're saying, I'm going to write everything myself or I'm going to leave everything to Entity Framework.Julie: [00:28:11] Absolutely. I mean that's the way we approach programming overall. Right? So I've got all of these wonderful arrows in my quiver to choose from. And, I'm not stuck with one. I just leverage using my knowledge and my experience. I'm making a decision, Hey, right tool for the job right. I guess that's what we've been saying for decades, not just programming, using the right tool for the job, but just because you want to use Entity Framework, doesn't mean you have to use it all the way through, but I think that's like, again using, using C# for this and maybe I'll use NodeJS or maybe I've got some really complex mathematical thing to do, for this particular thing I'll use F# or some other functional programming language.Jeremy: [00:29:08] So we did a show back in 2007 about ORMs and I know that you've had a lot of experience both working with ORMs and probably working with applications without them just writing SQL directly. And so I would imagine there's been a lot of changes in the last decade, or maybe even couple decades.What are some of the big things that you think have changed with ORMs in the last, 10 or 20 years?Julie: [00:29:40] I think something we've been talking about, which is [that]NoSQL databases have reduced the need for ORMs right? Just slice that right off the top. There's so many scenarios where people don't need relational databases to solve that. So this is interesting, right? Because I'm not sure that the answer is. About what's happened with the ORMs... I think the answer is more what's evolved with the way we persist data and the way we perceive data. Machine learning, right? The idea of event sourcing, right? Talk about right tool for the job. Like here's my data, but for this scenario, it's going to be more effective and efficient for me if that data is stored in relational for this scenario, it'd be more effective if it's stored as documents, right? For this scenario, it'd be more effective if it's text, like just a text file. Oh, I forgot XML. Just kidding or JSON. Right? I think that's what's impacted ORMs [more] than the ORMs themselves. And entity framework is still going strong and evolving, and very, very wide use, in the .NET space. Some of the ones that were more popular are not really around anymore or haven't been updated in awhile. Dapper gets its love of course from the stack overflow team. So I really think it's more about the change in how we store data that's just reduced people's need for ORMs. But still whatever that resource was 70% of the applications are still using relational.And then there's the people, like, no matter what, like we will never use an ORM. We will always write our own SQL. They have beautiful stored procedures and views and finely tuned databases. And they'll write the SQL for you these people. Just don't ask me to do it cause I'll probably never write SQL that's as good as entity framework with its algorithms and command trees can figure out on my behalf but there are plenty of database professionals who can definitely do that.Jeremy: [00:32:37] For somebody who is working with a relational database and let's say they worked with entity framework or they worked with hibernate, or any number of ORMs a decade ago.. Is there anything that's changed where you would say, Hey, this is different. Maybe you should give this a second look?Julie: [00:32:59] I apologize for not being able to speak to hibernate, like how that has evolved, but when Entity Framework first came out and this was very different than NHibernate, the .NET version of hibernate, Entity framework did not have any appreciation or respect for our objects.It was really designed from the perspective of storing relational data and it wasn't testable. It just didn't have any patterns. Interestingly, I spent a lot of time researching how entity framework aligns with good model design based on how we design domain driven design aggregates, and some of the, things that we put in place to protect our entities from people doing bad things with them. So Entity Framework at the beginning did not give us any way to do that. It didn't just lock your data access in to entity framework, the whole framework.It locked your entire application in. All your entities and your classes had to be completely tied to entity framework in order for entity framework... Now that's all decoupled, right? So entity framework also we had this big visual modeler that in order to create the model, you had to start with an existing database.Now it's like, look, here are my classes I don't want to design a database. I'm designing software. Right. I'm solving business problems. I have my domain model. So you build your domain model and then say, Hey, entity framework. Here's my domain model. Here's my database. Oh. And by the way, I know that by default, you're going to want this to go there and this to go there.But I need to tell you there's a few things I want you to do a little differently. So we have all of that and we tell.. entity framework has nothing to do with our domain logic. So it's gotten better and better and more respectful of our code, our domain logic, and our domain business logic and code, and staying out of the way.And that's been a big deal for me as somebody focused on domain driven design, which is. As I'm building my software I do not want to think about my data persistence, right. Has nothing to do with my domain. I like to joke unless I'm building the next Dropbox competitor right? So data persistence has nothing to do with my domain.So I. Look to see how well as each iteration of entity framework comes out. I look and do some experiments and research to see how well it's defaults understand or map correctly, like get my data in and out of my database without me having to tweak Entity framework's mappings, but more importantly, without me having to do anything to my domain classes in order to satisfy entity framework and with each iteration it's gotten better and better and better, which is nice. Now, one bit of pushback I had in the early days when it was talking about entity framework and DDD was well, you're mapping your domain directly to the database using entity framework. You shouldn't do that. You should have a data model and map that, but what's interesting is entity framework becomes the data model, right? The mappings of the entity framework become the data model.So I'm not doing it directly. That is the data model. So it is solving that problem. It's giving me that separation and there are still some design, some aggregate patterns that, entity framework, for example, won't satisfy in terms of the mappings, or maybe just some problems, some business problems or storage problems that you have. There are some times when it just makes sense to have something in-between even if the entity framework is still in-between that, like I have a specific data model, cause it just handles it differently. So you don't have to change how your, your domain I'm very passionate about that. Don't change the domain to make entity framework happy. Never, never, never, never.Jeremy: [00:37:45] Could you give a specific example of before the objects you would have to create in order to satisfy entity framework? What were you putting into your objects that you didn't want to put in?Julie: [00:38:01] That I didn't want to, I don't want to, well, one of the things that Entity Frameworks mappings could not handle is if you want it to completely isolate or encapsulate a collection in your entity. In your class, if you want to completely encapsulate the collection entity framework couldn't see it. Entity framework only understood for example, collections, but maybe you want to make it so that you can only iterate through it and enumerate through it. But if it was an IEnumerable, which is so that's what you would do, like in your class, you would say, Oh, I'm going to make it an IEnumerable so nobody can just add things to it and go around my business logic, which is, if you want to add something, I have to go over to this business logic to make sure all your invariants are satisfied or whatever else needs to happen.So entity framework literally could not see that. And so, yeah. We struggled with, finding ways to work around that adding stuff, or just giving it up and saying, okay, I'm exposing this here, please. Don't ever use it that way please. But you know, you have to tell the whole team don't ever use it that way.Right. But with Entity Framework Core 3 they changed that. So you can write your class the way you want to, and you can use the patterns that we use to encapsulate collections and protect them from abuse, misuse, and still entity framework is able to discover them and understand how to do all the work that it needs to do.So that's just an example.Jeremy: [00:39:50] What you wanted to do in your domain object is make this collection be an IEnumerable, which makes it so that the user of that class is not allowed to add items to it?Julie: [00:40:03] Right. Developers using it. Add or removeJeremy: [00:40:08] right. Add or remove. And, uh, so you wanted to be able to control the API for the developers. Say that you want to receive this object and you can, you can see what's in it, but you can't add or remove from it.Julie: [00:40:24] Well, you can, but not directly through that property. Right. You'd have to use a method that has the controls. I refer to it as a ADD DD ADD driven design, right? Because I am a control freak. I want to control how anybody's using my API.Jeremy: [00:40:44] You wanted to control how people use the API, but if you did that within your object, then entity framework would look at that. IEnumerable and say, I don't know what this is, so I can't map this.Julie: [00:40:56] It just completely ignores its existence.So that was one evolution of Entity Framework that you can do that now. And all kinds of other things.Jeremy: [00:41:09] Are there any other specific examples you can think of, of things that used to be either difficult to do or might frustrate people with entity framework [that] are addressed now?Julie: [00:41:20] Along those same lines of encapsulating properties, right? So, and encapsulated collections was really, really hard. Like it wasn't hard. It was impossible. But if you want it to either encapsulate properties or not even expose properties, you didn't want to expose them all at all and you just wanted to have private fields.You couldn't do anything like that either, but now with the, more modern, EF Core, you could literally have an object, a type defined with fields, no properties exposing those fields outside of the API. But you can still persist data. Entity framework is still aware of them or depending on how locked down, like if you made them totally private you'll have to tweak the configuration in Entity Framework, you have to tell Entity Framework, Hey, this field here, I want you to pay attention to it. Right. So there's some conventions where it will automatically recognize it. There are some ways of designing in your type that Entity Framework won't see it, but you can override that and tell entity framework about that. So I think that's a really interesting thing, because a lot of scenarios where people want to design their objects that way. And you know, again, just so much of that is about protecting your protecting your API from misuse or abuse but not putting the onus on the users of your API, right.The API will let you do it. You're allowed to do and it won't let you do what you're not allowed to do.Jeremy: [00:43:04] You had mentioned earlier another big difference is that it used to be, that you would create your database by writing SQL, and then you would point entity framework to that to generate your models?Julie: [00:43:17] Oh, yes. The very first iteration, that was the only way it knew how to build a model. Oh, wait, there was a way to build the model, a data model in the designer. That's right. But then everything's dependent on that model or on the database instead of no, I want to write my domain logic and then, now Entity Framework. Not everybody has a brand new project, right. But say you get a brand brand new project and there is no existing database. So you can ask entity framework to generate the database based on the combination of its assumptions from reading your types. And extra configurations you add to entity framework and create the database.So I hear all the DBAs going. Yeah. Oh no, no, no, no, no. That's not how it works. But this is the simple path. Like, so first you can have it create the database and then as you make changes to your domain Entity Framework has this whole migrations API that can figure out what has changed about your domain and then, figure out what SQL needs to be applied to the database in order to make those changes. So that's, that's one path. A twist on that path of course, is you can create that stuff. Instead of having those migrations, create the database or update the database, you can just have it create SQL and hand it to the professionals and say, this is sorta what I need.If you can do this and then just make it the way you want it to be, just long as the mappings still work out. But the other way is if you do have a greenfield application and you do have a database already, you can reverse engineer, just like we did at the very beginning, but it's a little different.But you can reverse engineer into a data model, which is, your classes. And a context, the entity framework configuration. So the classes are I refer to those as a stake in the ground. Because then you can just take those classes and make them what you want them to be. And then just make sure that you're still configuring the mapping so everything lines up. So there are two ways to go still, but at the beginning, There was this, I forgot even that the designer had, the entity data model designer had a way to design a model visually and then create both the database and the classes from it. But that, that went away really quickly.So it was essentially point to the database, the whole huge database and create one whole huge data model from it. And that's how everything would work now.A) we don't do that. And also we still don't have one big, huge data model. Right. Separation of concerns again. So we have a data model for this area of work and a data model for that area of work. And they might even point back to their own own databases. Right. And start talking about microservices and et cetera.Jeremy: [00:46:35] Yeah. So if I understand correctly, it used to be where you would either use this GUI tool to build out your your entities or your database, or you would point entity framework at an existing database that already had tables andJulie: [00:46:53] And all of that UI stuff is gone. Some third party, providers make some UI stuff, but as far as the entity framework is concerned, you know, Microsoft is concerned. We've got our classes and we've got our database. There's no UI in between.Jeremy: [00:47:12] It sounds like the issue with that approach was that it would generate your domain models, generate your classes, but it would do it in a way that probably had a whole bunch of entity framework specific information in those classes? Julie: [00:47:30] In your classes. Yeah. So much dependencies in your classes. Now, none of that, they started moving away from that in the second version of the entity framework. Which was called EF4 not 2, butit's cause it was aligning with .NET 4, but now we've now we've got EF Core, which came along with .NET Core, which is cross platform and lightweight and open source.Jeremy: [00:48:00] So it sounds like we're moving in the direction from having these domain models. These classes have all this entity framework specific stuff to creating just basic or normal classes. Like the kind of code you would write if you weren't thinking about entity framework or weren't thinking about persistence specific issues.And now with current ORMs, like entity framework core, those are able to map to your database, without, I guess the term could be polluting all your classes.Julie: [00:48:38] And to be fair, entity framework was the evil doer at the beginning of entity framework. Right. hibernate didn't do that, nhibernate didn't do that. They totally respected your domain classes and it was completely separate. There was a big, well, I'll just say a big stink, made, by the people who had been using these kinds of practices already. And were using tools like hibernate and Entity Framework came around and said: Oh, here's the new ORM on the block from Microsoft. Everybody needs to use it. And they looked at, and they're like, But, but, but, you know, because entity framework was inserting itself into its logic into the domain classes.And, the Entity Framework team got a lot of big lessons from the community that was really aware of that. And it wasn't right. I was, it was all new to me. I had never used an ORM before, so it was like, la di da, this is cool. This is nice. And you know, all these other people were like going, this is terrible. Why is Microsoft doing this? I don't understand. So eventually it evolved and I learned so much. I learned so much because of them. I mean, these are really smart people. I have a lot of respect for, so I was like, okay, they obviously know what they're talking about. So I gotta, I gotta figure that out. That's how I ended up on the path of learning about DDD. Thanks to them.Jeremy: [00:50:09] So it sounds like there's been three versions of entity framework. There's the initial version that tried to mix in all these things into your domain models. Then there was the second version which you said was called entity framework 4 that, that fixed a lot of those issues. And then now we have Entity Framework Core.I wonder if you could explain a little bit about. Why a third version was created. And what are the key differencesJulie: [00:50:40] Yeah. So this is, this was when Microsoft also at the same time, it had taken .NET Framework and rewritten it to be cross platform and open source and use modern software practices. And at the same time, the same thing was done with entity framework. They kept many of the same concepts, but they rewrote it because the code base was like 10 years old and also just in old thinking. So they rewrote it from scratch, keeping a lot of the same concepts. So people who were familiar, you know, who've been using Entity Framework they wouldn't be totally thrown for a loop. and so rewrote it with modern software practices, much more, open API in terms of flexibility and open source.And because it was also part of .NET Core cross-platform. So that was huge. And then they've been building on top of that. So Entity Framework, the original Entity Framework. So I think it's interesting how you say there were three. So the first iteration of entity framework, and then this next one where they really, honored separation of concerns.So that was EF4 through EF6 and then they kind of went to a new box with EF Core and served with the EF Core 1 and then 2 and 3 and, no 4. And now it's 5. So skipping a number again. So EF6 is still around because there's millions of applications that are using it. and they're not totally ignoring it.As a matter of fact, they brought EF6 on top of .NET core. So EF6 is now cross-platform, but it's still in so many applications and they've been doing little tweaks to it, but EF Core is where all the work is going now.Jeremy: [00:52:39] And it sounds like Core is not so much a dramatically different API for developers, but is rather an opportunity for them to clean up a lot of technical debt and, provide that cross platform capability, at least initially.Julie: [00:52:57] And enable it to go forward and to do more, be able to achieve more, functionality that people have wanted, but was just literally wasn't possible with the earlier stack.Jeremy: [00:53:12] Do you have any examples of a big thing that people had wanted for a while? They were finally able to do?Julie: [00:53:19] And this is really, really specific now because this is about entity framework with the newest version, EF Core 5 that's coming out. One thing that people have been asking for since the beginning of EF time was when you're eager loading data. So there's a way of bringing related data back in one query.There's actually two ways, but there's one way called a method called include. So including is so that you can bring back graphs of data instead of bringing, you know, go get the customers. Now, go get their orders. And now we need to merge them together in memory in our objects. So the include method says, please get me to customers and their orders and include, transforms that into like a join query or something like that brings back all of that different data and then builds the objects and make sure they're connected to each other in memory.That's a really important thing to be able to do. The problem is because it was all or nothing, whatever relation, whatever related data you were. Okay. Bringing back in that include method, you would just get every single one. So, you know, if you said customer include their orders, there was no way to avoid getting every single one of their orders for the last 267 years.Jeremy: [00:54:42] You're able to do an include, so you're able to join a customer to their orders, but in addition to that also have some kind of filter, like a where.Julie: [00:54:52] On that child data, I'll say quote, you know, I'm doing air quotes, that child data, that related data. So we could never do that before. So we had to come up with other ways to do that.Jeremy: [00:55:04] Yeah. So it was, it was possible before, but the way you did it was maybe complicated or convoluted.Julie: [00:55:12] And it writes good SQL for it too, better than me. Maybe not better than, many of my really good DBA friends, but definitely better than I would.Jeremy: [00:55:25] Probably be better than the majority of developers who know a little bit of SQL, but, uh, are definitely Julie: [00:55:33] Not DBAs yeah. I mean, that's one of the beauties of a good ORM, right?Jeremy: [00:55:39] I guess it, it, um, doesn't quite get you to being a DBA, but it, um, levels the playingJulie: [00:55:46] Well, I can hear my DBA friends cringing at that statement. So I will not agree with that. It doesn't level the playing field, but what it does is it enables many developers to write pretty decent SQL how's that how's that my DBA friends?Jeremy: [00:56:10] I guess that's an interesting question in and of itself is that we're using these tools like ORMs that are generating all this code for us and we as a developer may not understand the intricacies of what it's doing and the SQL it's writing. So I wonder from your perspective, how much should developers know about the queries that are being generated and about SQL in general, when you're working with an ORM? Julie: [00:56:42] I think it's important. If you don't have access to a data professional it's important to at least understand how to profile and how to, recognize where performance could be better and then go seek a professional. But yeah, profiling is really important. So if you've got a team where you do have a DBA somebody who's really good with and understands they can be doing the profiling, right. Capturing it. And there's, you know, whether, if they're using SQL server, they might just want to use SQL server profiler there's all kinds of third party tools.There's some capability built into visual studio. If you're using that, or there, there's all kinds of ways to profile, whether you're profiling the performance across the application, or you want to hone in on just the database activity, right? Because sometimes if say there's a performance problem, is it happening in the database? Is it happening in memory? Is entity framework trying to figure out the SQL? Is it happening in memory after the data has, you know, the database did it really fast? The data's back in memory now, entity framework is chugging along for some reason, having a hard time materializing all these objects and relationships.So it's not just about not always about profiling the database. And there are tools that help with that also.Jeremy: [00:58:18] You have the developers writing their queries in something like entity framework. And you have somebody who is more of an expert, like a DBA who can take a look at how are these things performing. And then if there are issues they can dive down into those queries, tell the developer, Hey, I think you should use a stored procedure here or a view.Julie: [00:58:44] Yeah. Or even kind of expand on that. Unless you're solo or a tiny little shop, this is how you build teams. You've got QA people. You've got testers, you've got, data experts. You've got people who are good at solving certain problems. And everybody works together as a team. I'm such a Libra. Why can't we all get along?Jeremy: [00:59:10] Yeah. And it sounds like it takes care of a lot of the maybe repetitive or tedious parts of the job, but it does not remove the need for, like you said, all of these different parts of your team, whether that's, the DBA who can look at the queries or maybe someone who is more experienced with profiling so they could see if object allocation from the ORM is a problem or all sorts of different, more specific, issues that they could dive into.A lot of these things that an ORM does, you were saying it's about mapping between typically relational databases to objects and applications and something you hear people talk about sometimes is something called the object, relational impedance mismatch. And I wonder if you could explain a little bit about what people mean by that and whether you think that's a problemJulie: [01:00:11] Well, that is exactly what ORMs are trying to solve, aiming to solve. The mismatch is the mismatch between the rows and columns and your database schema and the shape of your objects. So that's what the mismatch is, and it's an impedance for getting your data in and out of your application.So the ORMs are helping solve that problem instead of you having to solve it yourself by reading through your objects and transforming that into SQL, et cetera, et cetera, all those things we talked about earlier. I haven't heard that term in such a long time. That's so funny to me.It's funny that, we talked about it a lot when, and like in the EF world, it was like, what's EF4 it's to take care of the impedance mismatch, duh. Okay. Yeah. We never said duh. Just kidding. But we talked about that a lot. At the beginning of entity framework's, history, and yeah, I haven't had anybody bring that up in a long time.So I'm actually grateful you did.Jeremy: [01:01:15] So maybe, we've gotten to. The point where the tools have gotten good enough where people aren't thinking so much about that problem because ideally entity framework has solved it?Julie: [01:01:28] Entity framework and hibernate and dapper.Jeremy: [01:01:32] I think that's a good place to wrap things up. If people want to learn more about what you're working on or about entity framework or check out your courses where should they head?Julie: [01:01:46] Well, I spend a lot of time on Twitter and on Twitter. I'm Julie Lerman and I have a website and a blog. Although I have to say, I tweet a lot more than I blog these days. but my website is thedatafarm.com and I have a blog there and of course look me up on Pluralsight.Jeremy: [01:02:08] Cool. Well, Julie, thank you so muchJulie: [01:02:11] Oh, it was great. Jeremy, and you asked so many interesting questions and also helped me take some nice trips into my entity framework past to remember some things that I had actually forgotten about. So that was interesting. Thanks.Jeremy: [01:02:26] Hopefully... Hopefully good memories.Julie: [01:02:28] All good. All good.Jeremy: [01:02:31] All right, thanks a lot Julie.Julie: [01:02:31] Yep. Thank you.
undefined
Nov 4, 2020 • 1h 3min

Fixing a Broken Development Process

John Doran is the CTO of Phorest, an application for managing salons and spas.We discuss:- Transitioning a desktop application to a SaaS- Struggling with outages and performance problems- Moving away from relying on a single person for deployment- Building a continuous integration pipeline- Health monitoring for services- The benefits of docker- Using AWS managed services like Aurora and ECSThis episode originally aired on Software Engineering Radio.TranscriptJeremy: [00:00:00] Today I have John Doran with me. John is the director of engineering at Phorest, a Dublin based SAAS company, that processes appointments for the hair and beauty industry. He previously worked as a technical lead at Travelport digital, where he supported major airlines and hotel chains with their mobile platforms.I'll be speaking with John about the early days of their business, the challenges they faced while scaling and how they were able to reshape their processes and team to overcome these challenges. John, welcome to software engineering radio. John: [00:00:29] Hey Jeremy, thanks so much for having me. Jeremy: [00:00:31] The first thing I'd like to discuss is the early days of forest to just give the listeners a little bit of background. What type of product is Phorest? John: [00:00:40] Sure. So forest is essentially, um, It's a salon software focused in the hair and beauty industry. And it didn't actually start off as that back in 2003, it was actually a messaging service actually built by a few students and Trinity college. One of which was his name was Ronan.Percevel. Ronan is actually our current CEO. So that in 2003, that messaging service was supporting, um, nightclubs, dentists, various small businesses around Dublin, and the guys were finding it really hard to get some traction in, in those phase different industries. So Ronan actually went and worked as a, as a hair receptionist in a salon.And what he learned from that was that through using messaging on the platform that they were able to actually increase revenue for salons and actually get more money in the tills, which was hugely powerful thing. So from there, they were able to refocus on the, on that particular industry, they built supplementary features and a product around that messaging service.So in 2004, it became a kind of a fully fledged appointment book. And from there then they, they integrated that appointment book with the messaging service. So by 2006, then I guess you could classify phorest as a full salon software. So it had things like stock take, financial reporting, and staff rostering. That's fully based salon software system was pretty popular in Ireland and actually by between 2006 and 2008, they became the number one in the industry in Ireland. So what that meant was, you know, the majority of salons in Ireland were running on, on the phorest platform and that was actually an on-premise system. So all the data would have been stored locally in the hair salon.And there was no backend, Jeremy: [00:02:30] just so I understand correctly. So you say it was running on premise. It was an appointment system. So is this where somebody would come into the salon and make an appointment and they would enter it into a local computer and it would be just stored there? John: [00:02:46] Exactly. So, so what Ronan figured out throughout his time, working in the salon that by actually sending customers text messages, to remind them about their appointments really helped cut down the no-show rates, meaning that customers did turn up for their appointments when they were due and meaning that the staff members didn't have to sit around, waiting for customers to walk in.So as Phorest I guess, as a company developed. We, we moved into building extra features around that core system, which is an appointment book, which manages the day-to-day, uh, rows of a hairstylist. So we built, uh, email and marketing retention tools around that. Okay. I guess a really important point about Phorest's history is when the recession hit in 2008 in Ireland, we, uh, moved into the UK.So as we were kind of the number one provider in Ireland, we felt that, uh, when the recession hit that we needed to move into the UK, but being on premise meant there was a lot of friction actually installing the system into the salons. So in 2011, they actually took a small seed round to build out, I guess, the cloud backend.Well, once the kind of cloud backend was built, it took about a year to get it off the ground and released. Um, and as the company kind of gained traction in the UK, they, they migrated all of their premise customers onto the cloud solution. Jeremy: [00:04:07] I guess you would say that when it was on premise, a lot of the engineering effort or the support effort was probably in keeping the software, working for your customers and just addressing technical issues or questions and things like that. And that was probably taking a lot of your time. Is that correct? John: [00:04:25] Precisely the, the team was quite small. So we had five engineers who were essentially building it at the cloud backend. And one engineer who was maintaining that Delphi on premise application.So what was happening was our CEO Ronan was actually the product owner at that time. And the guys were making pretty drastic and kind of quickfire decisions in terms of features being added to the product based on, you know, getting a certain customer in that really needed to pay the bills. And some of those decisions, uh, I guess, made the product a bit more complex, uh, as a group, but it certainly was, it was a big improvement from the on-premise solution.Jeremy: [00:05:03] Hmm. So the on-premise solution you said was written in Delphi, is that correct? John: [00:05:08] Yeah, Jeremy: [00:05:09] when it was first started, was it just a single developer? John: [00:05:13] Exactly. Yeah. So it was, it was literally, uh, put together by some, some outsourcers and a single developer managing it. There was no, there was no real in-house developers.It was, you know, a little bit of turnover there. But when that small seed round came in with the guys put that together, the foundations of the cloud-based backend, which was a, a Java kind of classic, uh, n-tiered application with web socket to update the appointment screen. If anything changed on the backend.And, um, you, you would kind of consider it a majestic monolith as such Jeremy: [00:05:44] When you started the cloud solution. Were you maintaining both systems simultaneously? John: [00:05:49] Yeah, so, um, for a full year, that goes where we're building out that backend. And at the same time, there was a one guy who was, who was literally maintaining, fixing bugs on that, that Delphi application.And just to kind of give you an example. Um, one of the guys who was actually working on support, he actually went and taught himself SQL and he used to, to tunnel into the salons at nighttime to fix any database issues. And, um, Jeremy: [00:06:17] Oh, wow.John: [00:06:17] Yeah. So it was, it was, you know, hardcore stuff. Um, another big thing about not being a cloud-based and, and one of the big reasons we needed to become cloud based was we, you know, as, as people move online and, you know, it's, it's quite common to book, you know, your cinema or some something else online, but, um, Ronan could see that trend coming for online bookings, uh, and we needed to be cloud-based to build, to build out that online booking system.And just to kind of give you an idea of the scale, like last year, we, we would have processed about over 2 billion euros worth of transactions to the system. So it's really, it's really growing. And, um, you know, it's huge, huge scale at the moment by that, I guess looking, looking back at the past, the guys would have built a great robust system getting us to that 10,000 salon mark, particularly in the UK, but that would have been the point that the guys would have started seeing some, you know, shakiness in terms of stability and the speed at which at which you could deliver new new features.Jeremy: [00:07:22] You were saying the initial cloud version took about a year to create?John: [00:07:26] Exactly. Yeah. Jeremy: [00:07:27] And you had five engineers working on it after the seed round? At that time, when you first started working on the cloud version of the application, did you have a limited rollout to kind of weed out defects? Or how did you start transferring customers over.John: [00:07:46] So there was definitely some reluctant customers to, to move across. We did it, I guess, uh, gradually there was a lot of reluctance for people. People were quite scared of their data, not being stored in their salon. And so it was quite hard to get those, some of those customers across and only two weeks ago, we, we actually officially stopped supporting that final two customers have finished up. So. You know it took us a good seven years to finish that transition. Jeremy: [00:08:12] Uh, so it was a very gradual transition where you actually, what did you ask customers whether they wanted to move or how did you... John: [00:08:21] Oh, yeah. It was a huge, huge sales and team effort to, to get people across the line. But I would say the majority of people either would have churned or, or would have moved across the, the more forward-thinking people.I, you know, they would have been getting new features and a better service. Jeremy: [00:08:36] Right. So it was kind of more of a marketing push from your side to say, Hey, if you move over to our cloud solution, you'll get all these additional capabilities. But it was ultimately up to them to decide whether they want it to.John: [00:08:47] Yeah. So, um, no, some companies, they. They kind of build that product with a different name and they try and sell it but uh Phorest we actually kept the UI very similar. So it wasn't very intrusive to the users. It was just kind of seen as an upgrade with, um, I guess, less friction. Jeremy: [00:09:06] Right. Right. I want to talk a little bit about the. Early days where you, you said you spent about a year to build the MVP at that point, let's say after that year had passed, were you still able to iterate quickly in terms of features or were you having any problems with performance or downtime at that point?John: [00:09:28] So in 2012, when the cloud-based product launched, particularly in the UK, once we hit about a thousand customers, we started to see creaking issues in the backend, lots of JVM, garbage collection problems, lots of, uh, database contention and lots of outages.So we got to a point where we were trying hardware at the problem to, to make things a little bit faster. So what our problems were. We kind of relied a lot on a single person to do a lot of the deployments. it wasn't really a team effort to, to ship things. It was more so developer finishes code and the machine push it off.Maybe at the end of the month we would ship. I guess the big problem was the stability. So essentially what, what happened was. In terms of the architecture, we were introducing caches at various levels to try and, um, cope with performance. So, uh, a layer of caching on the client's side was introduced, uh, memcached uh, was introduced, uh, level, level 2 hibernate caching, always just, you know, really focusing on fixing the immediate problem, without looking at kind of the bigger picture, once I said, I mentioned that 2000 salons as a marker, I guess once we hit like 1200.The guys had to introduce, uh, the idea of silos, which was like essentially 1000 customers are going to be linked to a specific URL and that URL will host the, the API returning back the data that they need. And then the other silo, which would service the other, you know, 200 growing to say thousand businesses.So essentially if you think about it, you've got, I guess, a big point of failure. If, if that, if that server goes down, There's no load balancing between between servers and those two servers are their biggest size possible. So I guess a big red herring was the, the cost, uh, I guess, implications of that, you know, it was the largest, uh, instance type on Amazon on RDS and EC2 level.Jeremy: [00:11:31] The entire system was on a single instance for each silo?John: [00:11:35] Yeah. So if you imagine, um, when you, when you log in, you'll get returned a URL for a particular silo. So what would happen then would be X businesses go to that silo and X, Y businesses go to the other silo and what that did was basically it load balanced, the businesses at kind of a database level.Jeremy: [00:11:55] You were mentioning how you had like different caching layers, for example, memcached and things like that, but those were all being run on the same instance. Is that correct? John: [00:12:05] Um, they would have been hosted by Amazon. Jeremy: [00:12:07] Oh okay. So those would have been, uh, Amazon's hosted services. John: [00:12:10] So, yeah. Yeah. It's kind of like when you build that MVP or you build that initial stage, your product is kind of, you're focusing on building features.You're focusing on getting bums on seats and you. It was that point, that 12, 1200 to a thousand. salons that where we felt that pain, that scaling pain. Jeremy: [00:12:30] So in a way, like you said, you were doing multitenancy, but it was kind of split per thousand customers. John: [00:12:38] Yeah, exactly. So if you imagine, if, if a failure happened on one of those servers, there is no fault tolerance.If the deployment goes wrong in terms of like, uh, putting an instance in service, those, those thousand customers can't make purchases, their customers can't make online bookings. There's no appointments being served. You can't run transactions through the till. So, uh, we've caused huge, huge friction. Jeremy: [00:13:04] Right. Uh, what were the managed services you were using in combination with the EC2 instance? John: [00:13:12] So, um, a really good decision at the start of the guys moving to cloud was making a big bet on Amazon in terms of utilizing them for RDS, EC2, caching. There was no deployment stack, or there was no deployment, uh, infrastructure as code.It was all I guess, manually done through Amazon console, which is something that we later address, which we'll chat about it, but it was all, all heavily reliant on Amazon. Jeremy: [00:13:38] And you had mentioned that you were relying on one person to do deployment. Was, was that still the case at this time? John: [00:13:46] Yeah, so up until, I guess 2014. Um, It was all reliant on one guy who, who literally had to bring his laptop on holidays with them and tether from cafes. If something went down to deploy new code, he was the only guy who knew how to do it. So it was, it was a huge pain point and bus factor. Jeremy: [00:14:08] So it sounds like in terms of how the team was split up, there was basically, you have people working on development and you have a single person in the sort of ops role.John: [00:14:21] Yeah. And, uh, essentially when, when this kind of thing happens, you, the people who write the code don't ship it, you get all sorts of problems in terms of dependencies and tangles. And, uh, and you know, just just knowledge, knowledge silos, and also, you know, because the guys were working kind of in their own verticals, uh, different areas of the product.There was no consistency. Consistency in terms of the engineering values, how they were working, practices, procedures, you know, deployments, that sort of stuff. It was all, it was all very isolated. So, um, people did their own, their own thing. So, uh, you could imagine for say trying to hire someone new would be quite hard because, um, you know, for someone to come in very, very different, depending on which engineer you talk to.That makes sense? Jeremy: [00:15:10] Yeah, was, was this a locally located team or was this a remote team? John: [00:15:16] Most of the guys were actually in Dublin. Um, one or two people traveled a little bit worked remotely and a couple of people did actually move abroad. So it was predominantly based in Dublin, but, um, some people traveled a bit in terms of processes Jeremy: [00:15:28] For someone knowing how to deploy or how to work on a feature. It was mostly tribal knowledge. It's more just trying to piece together a story from talking to different people. Is that correct? John: [00:15:42] Precisely. So, um, you, you had no consistency in languages or frameworks. Um, except I would say that that model, it, uh, that, that initial part of the platform was extremely consistent, uh, in terms of the patterns used, uh, the. I guess the way it communicated with database and you know, how the API was built, um, was extremely strong and, uh, is, is the heart of steel is the heart of the organization. So say for example, there was a lot of really good, uh, say integration and unit tests there, but they got abandoned for a little while and we had to bring them back, back to life, to, to, to enable us to start moving faster again, and to give us a lot more confidence.Jeremy: [00:16:32] Hmm. So it sounds like maybe the initial version and the first year or so had a pretty solid foundation, but then as. I'm not sure if it was the team that grew or just the, the rate of features. Uh, would you say that? John: [00:16:47] I would say it was a combination of the, the growth of the company in terms of the number of customers on it, and the focus on delivering features.So focusing on feature development rather than tinking about scalability and. Being extremely aware of how, how fast were you gaining customers at that time? Was this a steady increase or large spikes? You're talking to 30% annually. So 30% annually and really, really low churn rate as well. Jeremy: [00:17:17] So what would you feel was the turning point where it felt like your software had, or your business had to fundamentally change due to the number of customers you had?John: [00:17:28] So it was essentially those issues around stability and cost where we're on sustainable for the business customers complaining, uh, our staff not being able to. To do their job. So, you know, part of Phorest's core values and mission is to help the salon owner grow their business and use, use the tools that we provide to, to do that.And if people are firefighting, uh, and not being able to, to support our customers, to be able to send, help them send really great marketing campaigns to boost our revenue, if we're not doing that, um, we're, we're firefighting the company would have been pointless. So we weren't fulfilling our mission by coping with outages and panicking all the time. The costs again, we're, we're unsustainable and you know, the team, you know, it was just, I guess, uncomfortable with this, the state we were in. So the turning point would have been, I would say in like 2014, when we, we essentially hired in some people who have more, more experience in.I would say the high scalability systems and people who, who cared a little bit more about quality and best practices. So when you hire a three or four people like that, you kind of, you bring in a, a different way of thinking you kind of, you hire, hire, hire these dif different values. You know, when you, when you try to. To talk to a team and try and get these things out. They're normally quite organic. If you bring people in from maybe a similar comp at all from a different industry, but similar experience, you, you kind of get that for free. And that's what Phorest did. So, um, basically in 2014, and since now, we've, we've invested heavily in hiring and hiring the right people in terms of how they operate and then in terms of how they think but also bringing that back to our our values and, um, and what we try to do, Jeremy: [00:19:28] Do you think that bringing in, you know, new new people, new talent is really one of the largest factors that allowed you to make such large changes to change your culture and change your way of thinking? John: [00:19:41] The other thing would be, I would say the trust, um, that Ronan CEO and the leadership team Phorest has, um, and their openness to change.Um, I think that, uh, a lot of other organizations will be quite scared of this type of a change in terms of heavily investing in the product to make it better, just like from experience and talking to the people, you know, would have been very easy to, to not invest, uh, you know, and just leave the software ticking along with bugs and handling the downtime, but it was, it was about the organization and their value, their value is around really helping, helping the salon owners and not spending that time firefighting. Jeremy: [00:20:28] So it sounds like within two years or so of, of launch was when you, uh, decided to, to make this change. John: [00:20:38] Yeah. So, um, you know, it's not an, not an easy one to make because you know, it's really hard to find talent.Um, And we, we, we were lucky to, to really get some, some great people in, and it wasn't about making radical change at the start. You know, it started from foundations. So it was teams like, you know, let's get a continuous integration server going here, guys, and, you know, let's bring all of that. Let's bring back all the broken tests and make sure they're running so that we can have a bit more confidence in what we share.W we, you know, introduce code review code reviews and pull requests back in into things and a bit more collaboration and getting rid of those pockets of knowledge. Um, you know, reliance on individuals. Jeremy: [00:21:21] I do want to go more into those a little bit later, but before that, when you were having performance issues or having outages before all these changes, how were you finding out?Was it being reported by users or did you have any kind of process, um, you know, to notify you? John: [00:21:41] So the quite commenting was basically the phones, which would light up. Um, there was very, very little transparency of what was going on in the system. It got to a stage where we actually installed a physical red button on support floor, which, uh, texted everyone in the engineering team.Jeremy: [00:22:00] Oh, wow. Okay. John: [00:22:02] Yeah. Jeremy: [00:22:02] One of the things that we often hear is when a system has issues like this, it's difficult to free up people to fix the underlying problems, um, due to the time investment required. And as you mentioned, all the firefighting going on, how did you overcome this issue? John: [00:22:24] So I guess. You know, the beforehand it was, it was a matter of, you know, restart the server.Let's keep going with our features, but it was really bad stopping to think about, um, you know, what really happened here. And, you know, maybe let's write down an incident report and gather some data, but, well, what actually happened under the hood and a few things, you know, a few questions, key questions could be raised from that.You know, what are we going to do to stop this from happening again? Why didn't we know about it before the customers and, you know, What were the steps we made to, to reproduce some, actually fix this issue and, and what are the actions that are going to happen and how are we going to track that those actions do happen after, after the issue?Jeremy: [00:23:08] Let me see you. If I understand this correctly, you, you actually did build sort of a process when you would have incidents to figure out okay. John: [00:23:17] That was the first step I would say. Yeah. So let's figure out what happened and how, and it was just about gathering data and getting information about what was, what was really going on.So let us identify as, you know, common things that happens that may be usually we would just, you know, restart server and forget but or fail over database and forgotten, you know, everything's not normal and a couple of errors, but as we started gathering that data, we started to see common problems. So maybe, you know, Our deployment processes isn't good enough and it's error prone, or this specific message broker isn't fault tolerant, or the IOPS in the database are too high at this time due to these queries happening.But after we got that data, you know, uh, and we started really digging deep into the system. We realized that this isn't something that you could just take two days in your sprint to start to fix, uh, go, just coming back to your question on, uh, finding that time to, to fix things where we kind of had to make a tough call.When we looked at everything to, to say, you know, let's stop feature work and let's stop product work. And. Let's fix this property. Jeremy: [00:24:26] Okay. Yeah. So, so basically you got more information on, on why you were having these problems, why you were having downtime or performance issues and started to build kind of a picture of, and realize that, Oh, this is, this is actually a very large problem.And then as a company, you made the decision that, okay, we're going to stop feature development. To make sure we have enough resources to really tackle this problem. John: [00:24:55] Precisely. And, um, from the product side of things, you know, this was a big, big driving factor in it. You know, we wanted to build all these amazing features to help salons to grow, but we just couldn't actually deliver on them.And we couldn't have any predictability on the way we deliver them, because because of that firefighting and, you know, cause we were sidetracked so much. There was no confidence in, in release cycle and stability or, or what, what we could actually deliver. So, um, yeah, it was, it was a pretty hard decision for us to make in terms of, uh, the business.Cause we haven't had a lot of deliverables and commitments to customers and to, you know, to our sales team. So we, we have to have to make that call. Jeremy: [00:25:36] You were mentioning earlier about how you started to bring in a continuous integration process before you had done that. You also mentioned that there were tests in the system initially, but then the tests kind of went away.Could you kind of elaborate on what you meant by that? John: [00:25:53] Yeah, so. As I said, like the kind of the core system was built with a lot of integrity and a lot of good patterns. For example, uh, a lot of integration tests, uh, running against, uh, the APIs, uh, were written and maybe were written against a specific feature, but they were never run as a full suite.So, um, what would happen was there'd be maybe one or two flaky ones. And, um, you know, because there was no continuous integration server, it was, it was easy enough for a developer to run specific tests for that, uh, functionality that they were were were building. But because there was the CI wasn't there, there was no full suite ran.So when, when it came time to actually do that, we realized, you know, So, you know, 70% of them are broken, Jeremy: [00:26:40] so they, they were building tests as they developed, but then they were maybe not running those, uh, John: [00:26:47] before commit or mergeJeremy: [00:26:49] right. And so adding the continuous integration process, having, uh, some kind of build process, really forced people to pay attention to whether those tests were working or not John: [00:27:01] Exactly. Um, and. Just a kind of a step on from that was, you know, um, a huge delay in getting stuff to test because, because we relied on that onw guy to build stuff. Um, and actually that was, you know, done from a, you know, a little Linux box in the engineering floor, um, which was quite temperamental.Uh, you'd be quite delayed and actually in even just getting stuff into people's hands and kind of what the core of software development is all about right, you know, Getting getting what you build into people's hands and we just couldn't do it Jeremy: [00:27:34] Just because the, the process of actually doing a build and a deployment was so difficult when you added the continuous integration process.Uh, were there other benefits that you maybe didn't expect when you started bringing this in?John: [00:27:50] So, um, I, I guess I mentioned the deployments is a big one. I think that. People started to see real, um, benefit in terms of their workflow. I guess, along with the continuous integration, there was, uh, more, more discipline in terms of, uh, how we worked.So the CI server introduced a better workflow, uh, for us on a, it helped us see real clarity, uh, in terms of the quality of the system, where, where we had coverage, where it didn't and, um, It also helped us break up the system a little bit. So I mentioned majestic monoliths. So it was actually when, when we went to look at it, there was five application servers sitting in one repo and the CI server and some crafty engineering helped us split that up quite well.To break at the repo into multiple application servers. Jeremy: [00:28:45] Hmm. So, so actually bringing in the continuous integration actually encouraged you to rearchitect your application and in certain ways, and break it down into smaller pieces. John: [00:28:56] Exactly. Yeah. And really it was all about confidence. Um, and being able to test and then know that we weren't progressing.Jeremy: [00:29:03] What do you, you think people saw in terms of the pain or the challenges from that sort of monolith set up that you think sort of inspired them to break it up? John: [00:29:13] The big one was a bug fix in one small, area of the system meant the whole stack had to be redeployed, which took hours and hours. The other thing would have been the speed of development in terms of navigating around a pretty large code base and the slowness of the test suite to run, which was around 25 minutes.When we, when we started and got them all green, Jeremy: [00:29:36] the pain of running the tests and having it possible to break so many things with just one change, maybe encourage people to, to shrink things down. So they wouldn't have to think so much about the whole picture all the time. John: [00:29:50] Exactly. We started to see, you know, a small fix or a small feature breaking something completely non-related typical example would have been due to a HTTP connection configuration on a client, um, breaking completely on, unrelated areas of the system. Jeremy: [00:30:06] Okay. One thing I'd like to talk about next is the monitoring. Uh, you mentioned earlier that it was really just phone calls would come into support and you even had the, the big red button, you could press uh, what did you do to add monitoring to your application?John: [00:30:25] That's pretty, uh, important to mention that, you know, we talked about making a decision to stop to down tills (?) and start fixing stuff. So that's, that's when, uh, we, we started, you know, looking at the monitoring and everything else, like continuous integration, bringing back tests, but at kind of a key point of, uh, of this evolutionary project was, was the monitoring.Um, so we did a few things. So we, we upgraded our systems to be using new relic. To help us find errors and it was there, but, um, it wasn't being utilized in a good enough way. So we used the APM (Application Performance Management) there. We looked at CloudWatch I mean we introduced watch metrics to help us watch traffic, to help us see slow, uh, transactions.Um, log entries helped us a lot. Uh, in terms of spotting anomalies in the logs, Pingdom was actually a, a really surprising, um, good addition to the monitoring. Um, it's simply just, just calls any health check, endpoint you want. And. That has some, some nice Slack and messaging integration. It was, that was great for us.It's helped us a lot. So we did a couple of other things like, um, some small end to end tests that would, um, Give us a kind of a heartbeat to how the system was running. Um, and, and they were also gave us the kind of confidence that we would know about an issue before a customer, being able to allow us to get rid of that red button.Jeremy: [00:31:53] All of these are managed services that, that you either send logs to or check health end points on your system. Did you configure them somehow, too text your team or send messages to your team when certain conditions were met or John: [00:32:11] so we, we, we started with just like a simple Slack channel that would, uh, would send us any kind of dev ops related issues into, into there.And that that's kind of what helped us change the culture a little bit in terms of being more aware of the infrastructure and the operations. And Pingdom was great for set, setting up a team with people who, who would get notifications for various parts of the system. And, uh, our CloudWatch, um, alarms, we set up a little Lambda function that would just forward on uh any critical messages to text us. Jeremy: [00:32:44] And before this, you said it was basically just user calls and now you are actually shifting to kind of proactively identifying the problems. Yeah. John: [00:32:54] Yeah, exactly. There was some small, really small alerts there, but nothing as comprehensive as this. We actually, um, we changed some of the applications. We introduced health end points to all of them. So they would report on their ability to connect to message broker, their ability to connect to a database, any dependencies that they actually needed. We would actually check as part of pinging that endpoint. So, if you hit any of our servers, we, any new or older ones, they would all have like a forward slash health endpoint and that would give you back a JSON structure, uh, and give us a good insight into how healthy that component was.Jeremy: [00:33:33] Yeah. And if there was a problem and you were trying to debug issues, were you mostly able to log into these third-party services, like log entries or new Relic to actually track down the problem? John: [00:33:46] Yeah. So again, th those services gave us that information, but it would always come back to, you know, being, if you needed to get into a server and a big thing, which we'll talk about is Docker.Um, we, we don't have SSH access into those servers. So we rely on those third parties to give us that information. But in the past, maybe we would have had to get in and, you know, look at the processes and take dumps, but. With log entries and new Relic, we were able to do that stuff without needing to. Jeremy: [00:34:17] So previously you might have someone actually SSH into the individual boxes and look at log files and things like that.John: [00:34:25] Exactly. So it's quite easy when you've got one server, but with that, as we'll discuss where you've got many small containers and it's extremely complicated Jeremy: [00:34:34] Next I'd like to talk about, uh, since you mentioned Docker, uh, how did you make the decision to move to Docker? John: [00:34:42] So it was something our CTO was really aware of and he really wanted us to explore this.The big benefits for us was that shift in mindset of one guy not being responsible for deployments, but us actually developing and using Docker in our day to day workflow and the cost implications as well. The fact that we could, instead of having that say eight X large, we could have. Running one application server.We could have 12 containers running on much smaller containers running on an EC2 instances. So it was that idea of being able to, to maximize, uh, CPU and memory. What was a huge, huge benefit for us that we, we, we saw. Jeremy: [00:35:24] So the, the primary driver was, was almost your. AWS bill or your John: [00:35:30] Big time. Yeah. Portable applications that, um, you know, w had much less maintenance. We didn't have to go in and worry about it because we had a, I guess, a. We mentioned this earlier, like these kinds of silo tech stacks, we didn't need to worry about a Ruby environment or PHP environment or a Java JVM install. It was just the container.And that was a hugely big, an important thing for us to do and really kind of well thought out by our CTO at the time. Jeremy: [00:35:59] So, so you mentioned like Ruby containers and JVMs and things like that. Does your application. I actually have a bunch of different frameworks and a bunch of different languages?John: [00:36:09] Yeah. So, um, as we split out the, that monolith, uh, we also, I guess, started building smaller domain specific, not micro I'd say kind of uh, services responsible for areas of the system, uh, our online booking stack. So if you go to any of our customers, um, you know, you can book a and their point of sale system in the salon, but you can also book on your phone and we have a custom domain for every one of those salon. So it's like phorest.com/book/foundationhair.Um, if you click on that, you're going to be brought to the online booking stack, which is a, a rails app actually in a, an Ember, Ember JS frontend. So, um, the system, as we started splitting it apart became more and more distributed and Docker was great for us in terms of consistency. And that portability particularly around different text stacks. Jeremy: [00:37:01] Migrating to Docker, made it easier for you to both develop and to deploy using a bunch of different tech stacks.John: [00:37:08] Exactly. Jeremy: [00:37:09] When running through your CI pipeline, would you configure it to create an image that you would put into a private registry such as Amazon's elastic container registry? John: [00:37:18] Yeah. So we made the mistake of building and hosting our own registry at the start. Uh, we quickly realized the pain and that around three, four months in where I'm actually at the same time as Amazon released the ECR.So I guess the main reason we did that ourselves was because we were early adopters and we pay, paid a little tax on that, but we did, uh, we moved to ECR. So. Our typical application kind of pipeline is uh build unit tests, maybe integration, acceptance tests, build a container. And then some of those applications, they run acceptance tests against the container running on the CI server, uh, push, push to the registry.And after it's pushed to the registry, then we would configure deployment and trigger it. Jeremy: [00:38:02] Do you have a manual process where somebody watches the pipeline go through and you make the call to push or not? Or is it a automated process? John: [00:38:13] Automated. So, um, we built a small kind of deployment framework again, because we were early adopters of Amazon's ECS, uh, their container service.Uh, so we built a small, um, deployment stack, which which allowed us to, to essentially configure a new services in ECS and deploy new versions of our application through CI to our, uh, ECS clusters. So it was all automated using an infrastructure as code solution, such as cloud formation. So when we were in looking back at the problems in the old good old days, uh, you know, we seen that one was, you know, uh, things were just configured on the AWS console.And we, we knew we needed infrastructure as code and we needed a repeatability and the ability to recreate stuff. So we, we use cloudformation and essentially what face something very similar to terraform. Um, and we do use Terraform for, for some of our managing our restaurant clusters on some other things.Jeremy: [00:39:16] Okay. So you maybe initially moved from having someone manually going in and creating virtual machines to more of a infrastructure is code approach. John: [00:39:27] Exactly. Yeah. Jeremy: [00:39:28] You, you had mentioned that one of the primary drivers of, of using Docker was performance, did you start creating performance metrics so that you knew how much progress you were making on that front?John: [00:39:41] Yeah, so essentially that the effort to kind of make our infrastructure more reliable it's it was a set as kind of a set of steps to get there. And we started with API level testing to make sure that anything we change under the hood, it didn't break the functionality. And we also wrote a bunch of performance tests, particularly around pulling down appointments, creating appointments and sending large, large volumes of messages.We, we knew we couldn't have any regressions there. So. We use gattling to do those performance tests. And we, we would run that from continuous integration server and we do various types of soak testing to make sure we weren't weren't taking any steps backwards. Jeremy: [00:40:23] So each time you would do a deployment, you would run performance tests to ensure that you weren't getting slower, or you weren't having any problems from the new deployment.John: [00:40:34] Yeah, I would say though, that like, uh, this kind of effort and we called it project Darwin internally, this effort to kind of. It had a few goals, but it was all about, you know, becoming fault-tolerant being more scalable, reducing Amazon costs. And during project Darwin, when we, we didn't just move our 1200-1500 salons we didn't just drop them and move them to docker.There was so many changes under the hood that these performance tests were key to giving us, uh, a pulse on how we were doing. But, um, I guess when we were done with project Darwin and every, everything was onDocker and, and everyone was much, much happier. Um, we, we just, we run those performance tests, ad hoc and as, as part of some release pipeline.Jeremy: [00:41:21] Hmm. Okay. So initially you were undergoing a lot of big changes and that was where it was really important to, uh, check the performance metrics with, with every build. John: [00:41:32] Exactly. Yeah. Jeremy: [00:41:33] Uh, what were some of the, the big challenges? Cause you mentioned you were changing a lot of things. What were some of the big challenges moving your existing architecture to Docker and to ECS?John: [00:41:47] There was a couple. The biggest there's two huge ones. So one was state. Getting state out of those big servers was extremely hard. We needed to re remove the level two cache because we need, because we needed to turn that one server into smaller, load balanced containers. We needed to remove the state because we didn't want somebody in one term computer terminal, fetching our appointments, and then on their first go mobile app looking at different data.So we had to get rid of state. And the challenge there was that MySQL performance just wasn't good enough for us. So, um, we actually had to look really hard at migrating to Amazon Aurora, which is what we did again, coming back to cost. Uh, Aurora is much more cost-effective in terms of the system beforehand was provisioned for peak load.So we, we would have provisioned IOPS for Friday afternoon. The busiest time that the salon was using the system. And we were paying for the same amount on a Sunday night. Compared to Aurora where you're, you're paying for IOPS and the additional benefits of performance around how Amazon rebuilt the storage engine there.So that's the caching side of things. The other big, big challenge was the VPC. So when you needed to get all of our applications in, into a VPC to to be able to use the latest instance types on Amazon, uh, and also for our application to be able to talk security to Aurora database. So those two are definitely the biggest challenges with the MySQL setup.Jeremy: [00:43:19] It sounded like you had to pay for your peak usage, whereas with Aurora it automatically scales up and down. Is that correct? John: [00:43:29] Um, no. You're actually charged per read and write. So that would be the big difference.Jeremy: [00:43:34] Oh I see. Per read and write. Okay. So it's just per operation. So you don't actually think about what your needs are going to be.It kind of just charges you based on how it's used. John: [00:43:45] The other new really nice thing was, uh, looking back at our incident reports, a really common issue would have been, Hey, the database has run out of storage and Aurora does actually autoscale its storage engine. Jeremy: [00:43:56] You mentioned removing state from your servers and you, you mentioned removing the level two cache, can you kind of explain sort of at a high level, what that means to those who don't understand that? John: [00:44:10] Sure. So in the Java world, when you have an ORM framework like hibernate, essentially when you create a database, that cache will store that data, um, in its level two cache. And what that means is that it doesn't need to hit the database for every query.And that's the, that was the solution for, for Phorest as we were in that MVP slash early days. But it wasn't the solution for us to become fault-tolerant. Jeremy: [00:44:39] So it would be someone makes a query to an ORM in this case. It's and it's hibernate and, uh, on the server's memory, it would retrieve the results from there instead of from the database.John: [00:44:55] Yeah, exactly. Okay. And that's what I was coming back to around, um, creating an API for a list of appointments. If you had two servers deployed with, with them using an L2 cache, you would get different state because cache, Jeremy: [00:45:12] you put a different cache in place, or did you remove the cache entirely? John: [00:45:17] So we removed that cache entirely, but we did have a rest cash, which is, was memcached and that's distributed. And we use cache keys based on, uh, entity versions. So that was distributed and, and worked well with multiple containers behind a load balancer. Jeremy: [00:45:34] So you removed the, the cache from the individual servers, but you do have a managed Memcached instance that you use. John: [00:45:42] Yeah, exactly. And getting rid of that level two cache.Our performance tests told us that MySQL just wasn't performance enough. Whereas Aurora was much better at handling those types of queries, some large joins. It was a big, big relational database. Jeremy: [00:45:58] So we we've talked about adding in continuous integration, monitoring performance metrics, uh, Aurora Docker. Did any of these processes require large changes to your code base? John: [00:46:13] To be honest, not really. It was more of a plumbing things together and a lot of orchestration from a human point of view. So, um, people being aware of how all this stuff works and. Uh, essentially making sure that we all knew we're on the right page.I don't like the biggest, uh, piece of engineering and coding work was the deployment and infrastructure script. So provisioning the VPCs writing the integrations with ECS, uh, that, that sort of thing. But, um, in terms of actual coding, it wasn't too invasive. Jeremy: [00:46:47] I think that's actually. A positive for a lot of people, because I believe there are people who think they need to do a big, uh, rewrite if they have, you know, performance problems or problems, keeping track of the status of their system.But I think this is a good case study that shows that you don't necessarily need to do a rewrite. You just need to put additional processes and checks in place and, um, maybe change your. Deployment process to kind of get the results that you weren't John: [00:47:21] It's about the foundations as well. If you have some really strong people at the start who, you know, pave some good uh, roads there in terms of good practices, like just for example, a really good, uh, Database change management set up some good packaging of the code.Really good packaging of the code. So it was quite easy for us to slip out five services from that big monolith. It's about the foundations at the start, because it would be quite easy to, to build an MVP with some people who raised, you know, 1000 line PHP scripts and the product works and that's a different case study because, you know, you CA you can't fix that essentially.Jeremy: [00:48:04] Right? So it's because the original foundation was strong that you were able to undergo this sort of transformation John: [00:48:12] truly yeahJeremy: [00:48:13] adopting all of these processes, did they resolve all of the key problems your business faced? John: [00:48:21] When we, when we look back and we see that, you know, all of our systems are running on docker, we see a huge cost benefit.So uh that problem was certainly solved. We, we were able to see issues before our customers, so we have better transparency, uh, in the system. No longer was, uh, were we dependent on one big server uh, a 1000 customers were no longer dependent on one big server. Um, so it meant that we had really good fault and we do have really good fault tolerance on, on those containers.If one of them dies, ECS will literally kill it. Uh, bring up a new one. Uh, it will also do some auto scaling for us. Say on a Monday morning, it will, you maybe have eight containers running, but on a Friday, maybe it'll auto scale to 14. So that's been ground bre breaking for us in terms of how we work. We went from shipping monthly to quarterly from between monthly and quarterly to, to daily.And something I use as a, uh, a team health metric right now is, is our frequency of deployment. And I'd say we're hitting about 25 deployments a week now, compared to the olden days is, is, is great. We always want to get better at it. I would say that those have been really amazing things for us, but also in terms of the team, it's, it's a lot easier for us now to hire a new engineer, um, bringing them in because of this consistency.And also, um, I guess, uh, we're not relying on these pockets of knowledge anymore. So we, again, around hiring it's, it's a lot easier for someone to come into the system and, and know how things work. And I think in terms of hiring as well, when you talk about the kind of setup it's, uh, it's, you know, you know, there's some, some good stuff happening there.Jeremy: [00:50:10] It sounds like you have a better picture in terms of monitoring the system, you brought your costs down significantly. The deployment process is much easier. The existence of the containers and ECS is kind of serving the purpose of where people used to have to monitor the individual servers and bring them up themselves.But now you've sort of outsource that to Amazon, to take care of. Uh, does that sound, does that all sound correct? John: [00:50:42] Yeah. Spot on. Jeremy: [00:50:43] And I find it interesting too, that you had mentioned that improving all of your process to use actually made it easier to bring new people in. And that's because you were saying things are now more clearly defined in terms of what to do, rather than all of this information kind of being tribal in a sense.John: [00:51:06] Yeah. Like a typical example will be like, Hey, uh, let's redeploy this, uh, bug fix. And so previously, you know, it might be a capistrano deploy or, uh, you know, oh you need to get SSH keys to this thing and, you need to log in here and you need to build this code. On this local machine and try and ship it up.And that just all goes away. Um, particularly with Docker on that, that continuous integration pipeline is just, it sets a really good set of standards and things that people should find quite easy and repeatable. Jeremy: [00:51:40] And, uh, so now in terms of deployment, you can use something like cloud formation and you have the continuous integration process that can deploy your software without somebody having to know specifically about how that part works.John: [00:51:58] Exactly. So I would say if we wanted to create like a new service responsible for some new, new functionality in Phorest, uh, say a spring boot application, a Java application. They can simply provide a Docker file and get that deployed to dev staging or production with, I would say 10 lines of YAML configuration.So you could go from initial set up of a project to, to production in a day. If you wanted to just. Zero friction there I would say. Jeremy: [00:52:29] It really makes the onboarding a lot easier then do you think your team waited too long to change their processes? Or do you think these changes came at just the right time? John: [00:52:42] I would say if we waited any longer, it could have been detrimental to, to, I guess, the health of the business.I think that the guys did a great job in terms of getting us to a certain point. Well, we would have risked technical decay, I would say. And, uh, what kind of, uh, really, uh, harming the organization. If I had gone any further, I would say it was, it was a lot of work to do this and it could have been easier if.If we had paid more attention to technical debt and making the right decisions earlier on. So maybe saying no to that customer who wants a bespoke piece of functionality, well, you have to do what you have to do. Jeremy: [00:53:24] So, so you would say maybe identifying earlier on just how much the current processes were causing you problems.If you had identified that earlier, um, you think you might have made the decision to try and make these changes, uh, at an earlier time. John: [00:53:44] Yeah. So the guys earlier were, were making really good decisions, but maybe they didn't have the experience for, you know, higher scale at the scalability solutions and systems.So it's, it's about hiring the right people at different stages of where the product is evolving. I would say. Jeremy: [00:54:00] Given what you've gone through with the migration of Phorest, what advice would you give to someone building a new process? What, what can they do to keep ahead of either technical debt or any of the other issues you face?John: [00:54:18] I think it's about how it's, it's actually, uh, a people, um, and cultural thing along with tech decisions. So. Everybody needs to be really aligned in terms of these decisions that they're making, rather than letting people go on an individual basis. I think there needs to be good leadership in terms of getting a group of people thinking the same way.I reckon the technical currency is, is extremely important. And as your system grows, you need to be able to, to look, look back. And identify areas of pain and by pain, I mean, you know, speed of deployment, uh, speed of development, ability to adapt and change your software. So if you notice that a feature that used to maybe take a week has now taken two weeks.You know, you probably need to take a really hard look at that area of the system and figure it out. Could it be simplified? Um, and why, why is it taking too long? Jeremy: [00:55:21] Basically identifying exactly where your pain points are, um, so that you can really focus your efforts and, and have an idea of what you're really going for.John: [00:55:31] Yeah. You need to build, um, an environment of trust. And I will also say that you need to be able to.To be able to be calm, confident, and okay with failure in terms of take taking risks sometimes and saying no to features and customers to be able to, to push back on, on leadership and make sure that you're, you're really evolving the system the right way.Uh, not just, uh, becoming a feature factory. Jeremy: [00:56:01] Yeah. It's always going to be a kind of balance on, you know, how much can you pull back, but still stay competitive in whatever space you're in. John: [00:56:12] Yeah. So what, what we're doing right now based on those lessons is we tried to do like a six to eight week burst of work.And we would always try and take a week or two wiggle room between that and starting something new to look back at what we just built and make sure we're happy with it. But also look at our, our, our technical backlog and. See if there's anything there that's really pain, you know? And just, even for example, this week, we, we noticed an issue with a lot of builds failing on our CI because of, uh, how, how it was set up to push Docker images.So, okay. Usually they would fail and that was actually a real pain point for us. Just over the last couple of months, because maybe a deployment, which should take 20 minutes was taking 40. Cause you'd have to re trigger it. So that's just like, that's an example of us looking at what, what was high value and making sure we just fix it before we start something new.Jeremy: [00:57:08] So making sure that you don't kind of end up in the same situation where you started, where. These technical issues sort of build without people noticing them instead kind of in shorter iterations, doing sort of a sanity check and making sure like everything is working and we're all going in the right direction.John: [00:57:27] Yeah. It's about the team. And I mentioned before, it's about, you know, the leadership and a group of people together. Talking through common issues and, you know, maybe meet, meet every two, three weeks. Talk about some key metrics in the system. Why is it this too high? Why is this too low? You know, you can throughkind of through your peers you can really see the pain points and, and they'll, they'll. More than likely tell you them. Jeremy: [00:57:51] When you look back at all the different technologies and processes you adopted, did you feel that any of them had too much overhead for someone starting out? What was your experience in general?John: [00:58:04] So some people just didn't like doing code reviews. Some people just really just felt that. They could just push, push what they needed and that it was almost a, a, a judgment on them in terms of the code review process, which it totally wasn't. I would say, uh, some people found Jenkins and continuous integration a bit, you know, what's the point.And so we, we had had some, you know, some pain points there. Um, but as we got to Docker, as people seeing the benefits of, of these things, you know, less bugs going into production, uh, less things, breaking people, being able to go home nice and early in the evening and not be woke up in the middle of night with, uh, you know, an outage call.Those were all the, the, the benefits, and that's reaping the rewards of, of thinking like this. Jeremy: [00:58:56] Your team was bringing on a bunch of new things at once. What was your process for adopting all these new things without overwhelming your team? John: [00:59:06] So it was starting at the foundation. So the continuous integration, the code reviews where we're incrementally brought in and we had regular team meetings to discuss pros and cons.And it was really important for people to, to input on those things rather than to, to, to just implement them. They would have failed if we hadn't done it like that. It took time. I know it's still, I would say we're still not in a perfect world, but it's about group, group consensus and making sure that everyone everyone's bought in to what we're trying to achieve.Jeremy: [00:59:39] So basically getting everyone in the same room and making sure they understand what exactly the goal is. And everyone's on the same page. John: [00:59:47] Yeah. So we tried to make a big efforts, uh, particularly for people who are working remotely to get them all in the same room. Once a quarter, we talk about our challenges, talk about our goals, talk about our values and make sure we're all on the same page.And sometimes we tweak them and you know, that's how we feel. It's best to do it. Jeremy: [01:00:09] Finally, I want to talk about what's next for, for Phorest. What are the remaining pain points you have now. And what do you plan on tackling next? John: [01:00:20] So right now we're on 4000 salons on our platform. We're really happy with the state of the infrastructure to get us to maybe 8000-1000 salons, but we need to be really conscious of the company's growth and our goals.So we need to make sure that we can scale at a much bigger level. And we also need to make sure that our customers aren't affected by, uh, our growth. We were looking at serverless for any kind of newer pieces of the product to see if they can help us reduce costs even more and, and help us stay, stay agile in terms of our infrastructure and how we roll out a couple of years ago, when we launched into the USA, we noticed we, um, It doubled our overhead in terms of infrastructure, operations, and deployment.And as we grow in the U S we, we need to be really conscious of not making eh any, um, I guess, uh, mistakes from the past. Jeremy: [01:01:15] So you're mostly looking forward to additional scaling challenges and possibly addressing those with serverless or some other type of technology. John: [01:01:27] Yeah. So, um, one area in particular will be our SMS sending.So that's kind of. A plan for the next six to eight months would be to make sure that we can continue to scale at the growth rate of SMS and email sending, which is, is huge in the platform. Jeremy: [01:01:44] Um, you said so far, you've been experiencing 30% growth year over year. And you said when you moved to the U S you actually doubled your customer base?John: [01:01:56] I'd say we doubled our, uh, overhead in terms of infrastructure. managing (?) deployments. We we're still very early stage in the US and that's our big focus for the moment. But as we grow there, we, we need to be, I guess, more operationally aware of, of how it's, how it's going over there. There's a much bigger market. Jeremy: [01:02:17] To kind of cap it off.How, uh, how can people follow you on the internet? John: [01:02:21] Sure. So you can grab me on Twitter at Johnwildoran , J O H N W I L Doran. And if you ever wanted to reach out to me and talk to you about any of this type of stuff, I'd love to meet up with you. So feel free to reach out.
undefined
Oct 7, 2020 • 1h 10min

Building Maps using Leaflet

We cover:Choosing Leaflet vs other mapping librariesSources for mapping layersUsing GeoJSON to store dataRaster vs vector dataWorking with live positions such as car dataPicking a database with geospatial queriesUsing frontend frameworks with LeafletLeaflet pluginsDesktop vs MobileHis work on the Leaflet-Geoman pluginRelated LinksLeafletLeaflet-GeomanGeoJSONGeomanMapboxThis episode was originally on Software Engineering Radio.TranscriptYou can help edit this transcript on GitHub.Jeremy: Today, I'm speaking with Sumit Kumar. Sumit is the head of engineering at Share Now, which is previously car2go. He's also the creator of an open source plugin called leaflet-geoman, which provides drawing tools for the leaflet mapping library.I'll be speaking with Sumit about his experience working with leaflet and how developers can build mapping applications with it. Sumit, welcome to software engineering radio.Sumit: Hello. Thanks for having me.Jeremy: So the first thing I'd like to start with is for people who aren't familiar with leaflet, what is it and what types of projects would people build with it?Sumit: So leaflet is basically a mapping library, a JavaScript library for, mobile friendly interactive maps. That's how they say it. And if you want a map and you don't want to use Google maps, for example, for whatever reason, leaflet is basically an open source alternative to Google maps and other mapping providers. Of course, you still need a map itself. So basically the images of, of a map where you can have open street map or any other provider, but with leaflet you can. , if I would have to describe it, you basically create the layers on top of the map. So that means polygons, markers, points of interest to show any data that you might have and to zoom around the map and stuff like this. So it's the library around the map itself and gives developers really good tools to do their own mapping solutions.Jeremy: I'm sure a lot of people are familiar with Google maps. What are some of the main reasons you might not choose it?Sumit: So in big corporations a lot of companies don't want to be tied in. There are licensing reasons especially in Germany or in Europe, there are data protection reasons. There is simply a stigma attached to Google in that sense. Then there are Google uses a custom format for the geo data and leaflet you can use an open standard called GeoJSON.You can use it with Google too, as far as I know, but it basically will transform everything to the Google format and it's much more expensive. That is also [a] reason, especially for startups or, individual developers with side projects. If they have a big volume, if they expect bigger traffic, then Google maps is quite expensive.Jeremy: So what are some examples of sources people would use?Sumit: Yeah, that's a good question. Normally people use openstreetmap which is free to use. But I think the Google street maps are not very pretty. So all the projects that I do, even if it's open source, I use Mapbox. Mapbox is a company that I would say it's built on top of leaflet.Last time I checked even the maintainers or the core maintainers of leaflet work at Mapbox. So the company grew out of leaflet, from my understanding, and it's highly compatible with leaflet and they provide beautiful maps. This is not free. So I pay for great maps to use even for my open source projects.But you can use basically any provider you want. So there is also HERE maps. It's a European provider. There is Google maps. Of course you can use that. I'm not entirely sure about Apple maps if you can use that with leaflet, but any provider where you can fetch the tiles you can implement it into leaflet and leaflet is a mapping tool. That means it's not about only about like satellite maps or street maps. You can also use indoor maps. You can use a map of the Moon or Mars. You can use maps of video games or Game of Thrones it doesn't matter at the end.Any image that you can take from an area can be a map. And this could be even a (blueprint) a map of your house. You could even use that as a digital map and create whatever you need on top of it.Jeremy: And in that case would you have a JPEG or a static image and then you reference that so someone's able to click around that?Sumit: I've never done a Tile layer myself. But I've seen an application built with the library that I maintain. And this is a SaaS application for construction sites. So they fly a drone on top of the construction site, take, some HD photo of it, or even 4K, I'm not entirely sure.And then they create a map out of it. That they then draw on top of on the construction site which is really interesting. And as far as I know, it's, it's quite easy for them to make a tile layer out of it. The biggest problem with a tile layer is always zooming in and out of the map. So you need different distances basically.And of course the file size. So if you do an HD Photo and you opened a map and it has to load multiple HD photos. This creates quite some load and is slow for the user. So this optimization is the hardest part. But other than that, if you research how to do that, I'm pretty sure there are tools for that to make it quite easy.Jeremy: When you talk about tiles for a map is it where you're starting with a really high resolution image and then you're cutting it into a bunch of pieces so as the person zooms around the map they're not having to load that entire image?Sumit: It's optimized for exactly this. So it loads as many tiles that are visible onscreen. So if you zoom in. Then it only loads the tiles that are visible. These are usually I would say six images or so. And they should be quite, quite optimized in size.If you have a huge image that covers a big area. If you zoom out then instead of loading 1000 tiles, you load only six tiles that show more area but in less detail. The load for the user is basically always the same, doesn't matter which zoom level, and you start with a high resolution image and then cut it down.But again, again, I've never done it myself so not 100% sure on that.Jeremy: When people talk about map sources, sometimes they talk about raster tiles and sometimes they talk about vector data. What's the main difference between the two and when would you use each? Sumit: Oh, the vector data itself, you can think if you are a front end developer, you can think of it like an SVG. So it's an image versus an SVG. An image has a fixed dimension width and height and if you zoom in, the quality gets lower. If you zoom in with an SVG, it's rendered by the browser so it has infinite scalability in a sense. And with, vector maps, it's the same. I'm not sure if that is a bit of a stretch to mapping experts, but, the good thing for me, if I use vector maps, you can zoom in and out very fluently. There aren't particular steps to the next image it's a smooth transition.And, especially if you use something like canvas mode in leaflet, it's a much smoother experience for the user. But. These are not individual images anymore. So I'm not talking necessarily about the tiles right now, but about the layers you put on top. So let's say you have markers of Tesla superchargers for example, and you put them on the map.These can be individual DOM elements. And if you have 6,000 of them, it creates a lot of load for the browser. But if you have something like canvas mode it's drawn inside the one element and everything is smooth and performant again. You have the downside of you cannot interact with the DOM elements individually, of course.But I might getting a bit out of your question right now because, vector maps itself, can be... If you refer to tiles, there might be even some additional advantages that I'm not sure yet. What I do know is that Uber, for example, created their own library also on top of open source products. I think it was on top of Mapbox. And they use only vector tiles like this because they have huge data needs and they need very performant maps. And if you use leaflet only just out of the box and you have big, big, big data then you might get into performance problems. Problems that I have myself but not yet solved in particular apps that I built.Jeremy: And when you're talking about these big sets of data, is this the mapping data in terms of things like streets and locations, or is this more overlayed on top of that?Sumit: That's definitely overlays on top of that. So I can do some examples. II'm working in the mobility sector of a car sharing company that's also how I got into leaflet 6-7 years ago because they needed a geospatial data management tool basically that I've built. And, the data that they use are, points of interest, electric charging stations. We have parking zones, drops zones for the cars. We separate the city into polygons to track demand in specific areas of course. And then you have other companies like (masque?), which is a local logistics company. They have zones where their ships drive and stop off the harbors. Tesla for example has supercharger stations, which is small data compared to that and ridiculously detailed data if we are talking autonomous vehicles. Then you need data like you separate the road into different lanes. For example, one goes in one direction, the other one in the other direction. You have parking spots and this for a whole city or a whole country, this is data that is so big and so much your browser would just crash if you try to display it in leaflet alone.Jeremy: Hmm. And so the very simplest case. If someone is trying to put on a list of parking locations or superchargers what does the API look like that for leaflet? Are people calling functions that are adding these one by one or is it syncing to a collection? What does that look like?Sumit: so there are multiple options in leaflet, which is quite cool. So normally what I do is I try to use the data always in GeoJSON. So I store it maybe in a different format, but in general, in the APIs I build around leaflet use GeoJSON normally and leaflet you can add GeoJSON simply.So there is a GeoJSON method where you just put in data and it displays it on the map. You can also create your own shapes. That means a circle marker, a circle, a polygon, a line, a marker. Then you basically give it two coordinates and it creates the shape on the map.With GeoJSON it's a wrapper for everything. So. If you provided a GeoJSON, it can be markers, can be polygons, can be lines, and leaflet will just add everything. It depends on your needs or your source of data that you have. And particularly with my library where I create the shapes so the user can draw the shapes, him or herself. I need all of those functions. Basically.Jeremy: And, you were saying that GeoJSON is a good option for storing the locations of different things, storing shapes. How about data that changes often, like if you have a car sharing application you might have live locations of where cars are. Would GeoJSON be suitable for something like that?Sumit: So I'm sure there are people that disagree with me. I had discussions actually with developers from Uber and also some, let's say, mapping experts from other companies, and the opinions vary. I can only speak for myself. even though JSON or GeoJSON, might not be the most efficient storing format out there, it's basically a standard in modern APIs to use JSON or GeoJSON as as the format to exchange data.And I like to have not a lot of transformations with my data, so I store it as much as I can in GeoJSON. It works fine for me. I don't have any restrictions or downsides for that? I don't build applications on a scale of Uber, but Uber also told me they store in GeoJSON and they don't have a problem with it either.They like to use open standards and I agree with that. And if we are talking live data it doesn't matter how frequent the location, for example, of the marker is updated. Let's say every half a second, you update the location of a moving car.You simply store the two coordinates into it and you can build it as a, you store the entire GeoJSON again or you store only the changed coordinate. Really the subset of the actual data doesn't matter how you build it. It's so small in comparison that it will only make a difference if you have, I don't know if you update a thousand or a million entries, all at the same time.Then of course, your network is gonna slow down a little bit, but this has nothing really to do with GeoJSON or JSON, this is any data. If you have so much data updates you should look at something like a message queue like RabbitMQ or Kafka or something like this. But if you do it just over REST API, I would personally batch the requests.So every second or every two seconds, I would update all the entries that have been changed. If we're talking about a front end or something and batch everything together.Jeremy: So to make sure I understand correctly, if you were getting the locations of cars, and you query the backend API, you would get a GeoJSON file, which is basically a regular JSON file that has a bunch of elements that might have the latitude and longitude of each car. And as you got updates, maybe you get an update every second or every few seconds. You would receive new GeoJSON files from the server and it would just have the elements that had changed or new elements?Sumit: So, there are two different, points in let's say an application stack, where you have these updates. The first one is the car sends an update to your backend. And if we are talking about our company, for example, share now, or, companies that use my open source product, these might be, for example, 10,000 cars always connected to some sort of backend, and they send their location, maybe every second. For example, let's say you have a network hiccup and they all reconnect, at the same time, you have a huge spike because 10,000 cars send their location at once, and this is an area where we at least had the experience... Don't use just HTTP use a message queue. So we can handle all of the data because the cars do not only send a location, they also send is the window down, is it up motor, start motor, stop events or everything that happens in a connected car is sent to a server.And this can be multiple events per second. If we're talking 10,000 to a 100,000 vehicles. Or scooters or whatever it is. This load should be handled by a message queue like Kafka or Rabbit MQ. Once it's in the backend and you just want just in quotation marks, want to display it on a front end then we are talking a different story here because, front-end doesn't have a message queue like this we can use if you want to do it really in a real time sense, you can use WebSockets, which is... Yeah, you send incremental updates as they come in to the front end and change your data and can be basically an ID and a new latitude and longitude and you connect the data and the front end.It can be new JSON. It's completely dependent on how you build it. I normally send some metadata with the JSON that is, for example, a name of the license plate, name of the car, maybe the model, is this a Mercedes or BMW or whatever. And maybe even some other data depending on the project, depending on the data that you get.There is maybe a lot of metadata involved. Then I don't send the entire JSON. Then I just send the updated information. But if you are not using a web socket, if you're using HTTP, then there is no push from the server. The browser has to fetch updated information. So in this case, and what I would do is, let's say I display Berlin on a map and we have taking as an example, share now and car2go here. Of course, can be applied to any company out there. If we are talking 2000 cars, for example, and I want to show them in real time, I will simply fetch the list of all the cars, every second or every two seconds. So there is no push or anything. I would just do a fetch a simple, interval request.Jeremy: And so you're doing this fetch on this interval, and is it taking into account what it chooses to display? It's just taking that entire document and showing everything that you're sending, or is it performing a diff, comparing the elements you already had and the new elements?Sumit: Yeah, that's a good question. So I do a diff, but it depends on the project. So I am building a little SaaS product around geo management basically. And there I use diffs. I would have, yeah, I did a mistake of not, not diffing. and the reason is if you redraw the layers on a map, it's a lot of performance overhead.And if you do this every second and currently you are maybe creating another layer on the map, you are interacting with the map in some other way. This, this is a performance problem. Also, the map constantly rerenders basically your data. And the map simply gets laggy. And if you want to do anything else on the map it gets laggy and it's not a good performance.So I use diffing that if out of 2000 cars only two move, then they will get updated and the rest is just thrown away.Jeremy: And internally. So when you're talking about using this diff, does that mean on the leaflet side or on the JavaScript side, do you have this GeoJSON document and then you're updating that GeoJSON document and leaflet just figures out that only those two elements changed and it doesn't touch the rest of the page?Sumit: Yeah, that would be great. But sadly, as far as I know it's not like that. So you give leaflet something to draw and they will draw it. So I do the diff, basically on the business application side, on the business logic side, I do this myself. I'm diffing, when I get the payload basically to the payload that is already there with lodash do a comparison, remove everything else.And, then I have to remove the layers first that are currently in leaflet, and then I feed them the new ones. And I also have my own IDs associated with everything. So I can compare also individual shapes on the map or layers on the map.Leaflet has IDs that they create when you add something to the map. But they are not consistent. So if you add a marker to the map it has an ID. If you want to add the same marker to the map again with a different location it gets a different ID. You can get the current one on the map and update that one. That is also possible, but then you would have to buy in completely into the leaflet logic, including the IDs and how to handle everything. And I like to separate... I like to have leaflet just as a dumb drawing tool and not as the source of truth for my IDs and business logic and all of that.So I do this outside. I do the diffing outside. I do the storing to the database outside, and leaflet is just a component. I feed it data, it draws it, and that's, that's about it.Jeremy: Mm. So in situations where you have live data and you have a lot of data changing, it sounds like you actually aren't using GeoJSON in that case, you have your own collection outside of leaflet. And you're using that along with your own diffing to determine, which elements you should find inside of leaflet based off of a key.And then, moving just the ones that you've found that have changed?Sumit: No, it's GeoJSON. It's all GeoJSON that I store. For example, the SaaS tool that I built has an API where you can fetch the layers that are displayed on the map, from an API And this is all GeoJSON. And then internally, I only use GeoJSON. And that meansif you put it into leaflet internally in leaflet, it's also transformed to something else, of course.But the map, you can just say, okay, give me these layers and make them GeoJSON basically. And then this is how my whole application uses it. but GeoJSON has a properties block, and in there you can add all the metadata you want. So in there, I have the IDs that are basically the reference to my database in there.I have, a gravitational center of a specific polygon, for example. I have descriptions, names, anything they use, I would like to see I store in the properties of GeoJSON and and use that everywhere. So it's still all GeoJSON.Jeremy: You have let's say a single GeoJSON document for all of your car locations. And if you go inside that GeoJSON find the key of a car that you want to move, you can update the location through the GeoJSON?Sumit: That's actually a pretty good question and just depends a bit on the application that you're building. So GeoJSON allows you to group layers together. That means if you have one polygon, cool, you can have that. If you have two or three or a hundred, you can group them in what's called a FeatureCollection.You can store it this way, that's completely fine. Then you have big payloads, but you have to get into the GeoJSON and get the specifics out of it. In the SAAS application, I have this use case to count and limit the layers that a user can draw on a map and this means I want this to be separate database entries.I can also choose to share one layer versus the other with the public or with the colleague or whatever and to have this granular permission system I need this to be separate database entries and that means... Each layer on the map is for me personally a different GeoJSON. But again this is highly customized to my use case.I'm sure there are users out there if they fetch my data from GeoJSON through the API. They would not like to have a thousand different GeoJSONs in an array or a collection for all the markers. They would just like to have one big GeoJSON file that owns all of the data.This is easy to do. You can just combine them. That's fine. And I will do this on the API level. But for my use case, I store them separately.Jeremy: Okay. So when we're talking about GeoJSON in a lot of cases you actually choose to have a separate GeoJSON document for each marker or each thing that you're going to put onto the map.Sumit: Exactly.Jeremy: We were talking about how you're bringing in data on the backend and then you're bringing it in as GeoJSON on the frontend.Is the GeoJSON, is that something that you're converting at the API level or are you storing your data as GeoJSON on the back end as well?Sumit: This is the embarrassing part of the application. When you start out, and I'm building this application for a long time and there is tech debt of course and that is one thing.So what got me up fast? I use Firebase as a data storage, or firestore it's called. And this just helps me ramp up an application quite fast. And I still use it even after many months, I think a year now that I'm building this application. And, in there, you have a limitation of the documents, how you store it.The document format is limited. So I cannot have nested arrays, but, GeoJSON has quite a lot of nested areas, especially if you have. Multipolygons. That means multiple polygons belonging to one layer, acting as one layer, basically with holes in it. For example, this does some multiple levels of nested arrays.You can't just store this in firestore. You have to store it as a JSON string. So basically this means is I store my GeoJSON as strings currently, that means on a database level I can't do any queries. Anything I do with the data I first fetch it via cloud functions or a node script or whatever.I fetch the data first. Then I do the calculations on top of it. For example, is a point inside this GeoJSON or are these polygons near the user or whatever? I do all the calculations and then I serve the data as GeoJSON of course. So I store it as GeoJSON in a sense, but it's a string.And if I would have to do it again, or if I look into the roadmap for the future, I would probably go to something like MongoDB where you can, I'm not sure if it's like native GeoJSON support, but the important part is that if you build an application that basically does a lot of of geo queries and stuff like this.If it's made to handle spatial data use a database that can do geoqueries on a database level. That means I give a database a coordinate like my location and do a query like give me all polygons. That where this point is inside a polygon this is something. If you can do this on a database level, this adds a lot of performance, because otherwise, if we are talking thousands or millions of layers, you have to fetch everything and calculate this on your backend, on the backend side.And this is expensive. So I would go to a database that allows geo queries, Firebase SaaS. They have this in a very, very limited format. I asked this on stackoverflow. I saw the question yesterday again. I asked this about two years ago, 2016 and back then they answered that they haven't exposed their geo queries yet and they still haven't.So I'm not sure when they come out with it, but if I needed I would switch databases I will not wait for that. And if I would start over, I will choose a database beforehand that allows me to do that.Jeremy: And for your work at share now, how are you storing the geo data there?Sumit: So, I have not touched that particular product in quite a while. But, back then when, when we built it. It was GeoJSON, but we stored, we used Mongo DB, and I'm not entirely sure if you can simply without any transformation store GeoJSON in Mongo DB, but Mongo DB has geo queries. So, I know that much, but I can't tell you right now off the top of my head if you have to transform it or if MongoDB accepts GeoJSON but transforms it internally that I don't know.Jeremy: We're talking about how you can have an application with a lot of data but depending on where the user is looking on the map they don't necessarily need to bring in all that data at once. What are the strategies you use to deal with that?Sumit: That is a good one. So what geo queries often allow you to do is if I look on a map you can define the borders by basically the top left, top right, bottom left, bottom right corner of your screen. So if I have a map on my screen leaflet and any mapping library can basically give you the boundaries that you are currently looking at.And these are these four coordinates. And if your database supports geo queries, you can basically say give me all layers that are inside these boundaries and then you can just fetch that data. This is how I would do it from the top of my head. Honestly, I've not built it like this because I've not had to deal with those big datasets.And there might be downsides to this. If the user for example moves the map how many queries do you do? Then you have a moving target, basically. But I'm sure there are ways to solve this. So this would be my first clue on how to achieve this.Jeremy: When you're working on a leaflet application, do you have any experience integrating a leaflet map with other frameworks like React or Vue?Sumit: That is a very good question. I get this quite often on the open source library as well. And with mapping libraries like leaflet you are basically back to the roots of HTML and JavaScript where you interact with the DOM elements directly.And with the introduction of frameworks like Angular and React and now Vue we moved past this. These are all abstractions to the DOM element. We say create this div and create this div. We don't do this anymore. But with leaflet you still at least a little bit have to do this in the sense that, for example, you tell it the div that it should render in and you call a specific element directly, sometimes through the leaflet API of course, but it doesn't have this reactive nature like our frameworks today do. So what happens is that there are a lot of abstractions to mapping libraries. So there is react-leaflet, vue-leaflet, all these abstractions, to use the mapping library in the same reactive way, in the same thinking mode of the framework, which, which is good for anyone that is starting out.I personally like to use it bare bones. I like to interact with leaflet directly with the API it provides. I have full control over it. Maybe I'm an advanced user in a sense. But the abstractions that are out there, they limit me because I always have to go through that. If there is a new feature or an edge case I want to tackle or something I always need the buy in of the maintainer of that abstraction and this is something I don't do.Sometimes I build my own abstraction but normally I basically build a component in Vue and it doesn't matter if it's React or Angular or Vue. I build a component that owns the map itself. And as a property I put the GeoJSON in there, for example. And the component does all the rest, maybe the diffing even, the rendering, the updating, the watching of the properties.But you have to build your watchers yourself and stuff like this. So it's just out of context. Not every developer is as comfortable with this, especially if you start programming and you started in the world of React so you're not used to this barebone coding.But if you use other libraries that are interacting with the DOM elements directly then it's a good exercise to connect these two worlds. Another use case might be where you could use your skillsets, in the sense are, charting libraries. So if you display charts for example on a dashboard, it's the same, same topic there.Jeremy: So basically you recommend keeping your framework code separate from your interactions with leaflet. You'll use leaflet's built in APIs, not use a wrapper just because you want to make sure you have full control and not be limited by that wrapper.Sumit: Exactly. And if there is a new version of leaflet coming out, I want to use it immediately. It probably has some updates that I want to put in. And again, I can control the user experience much, much better if I can interact with leaflet directly through the APIs.I wouldn't necessarily say I would recommend it. It's the way how I do it. And I think if you build an advanced, a big application, I think you are better off using the leaflet API directly. If you just want to display a map and maybe a marker on top of it to show the location of your business on your business website, then it's totally fine to use a wrapper, right?That's easy. You will have it done in less than an hour and that's it. But if you build a big application to handle geospatial data. Then it's a different story. I would say remove the abstraction and go directly with the library.Jeremy: For these components that are from your framework do those end up rendering surrounding the map? Like you would have a sidebar or a top bar or something like that?Sumit: Yeah. Also also a good question. So on Geoman, I use it. The SaaS product is also the demo website basically for the open source library, and it's called geoman. And there, I have a map component as I said, and it's basically fullscreen. From the Dom site. Of course, I have stuff around it.I have a sidebar in a sense and a footer and whatnot. But with CSS I just display it differently. So the map is for me, always front and center. It's the biggest element. And it should be basically like Google maps. It should be the one big background thing and the header and the sidebar, just like on top of the map.And you can build this in different ways. Like you could even add this to leaflet you could do everything in leaflet, you could even add your sidebar as a map element on the map but I don't do this. I have the map container and I just overlay my other Dom elements on top of the map, so I want to keep this really separate to the map itself. The map component is definitely one separate component for me that I interact with. Yeah. Because of lock in. Again, let's say I want to change mapping providers at one point instead of changing one component, I would have to change I don't know the whole application. So I try to keep this as separate as possible.Jeremy: So if you were to look at the HTML the DOM nodes that have all your controls are your sidebars, things like that, they would be completely independent of the leaflet DOM node, but you use CSS maybe with a fixed position or something to appear on top of the map, is that correct?Sumit: Yeah, that's true. I would do one exception though. I build an open source library, as you mentioned, for drawing, basically layers on the map, markers, lines, whatever. These buttons I add them to leaflet. So because it's a leaflet plugin, that means any user of leaflet that wants to use my plugin just imports it.And it adds these buttons to the leaflet map. I basically used the leaflet framework to display this, but anything else, like for example, an address search, the list of the layers that is currently on there, an export button for downloading the GeoJSON, changing the tiles from street view to satellite view, this stuff you can all add to the map.And a lot of people do that, but I personally keep this out of the map. I have this separately again, because In my SAAS application, I would like to have more control over the user experience. And maybe I have something like a pay wall in front of it or an upsell button or I need to limit this based on user permissions.And I simply feel more comfortable doing this in a framework like Vue instead of leaflet itself.Jeremy: You were talking about how you built a plugin.Can you talk a little bit about what makes sense to be built as plugins versus built outside of them?Sumit: Yeah. Also very good question. I thought of that as well. Sometimes my thoughts went if you have the stuff I just mentioned, an address search and the list of layers and stuff like this. I thought about, Hmm, what if everything of that is a plugin for leaflet, I could open source everything. I just add everything modular to leaflet. It's a big lock in into the leaflet ecosystem. So this is all possible and there are plugins, I think for all of that, there are plugins for your address search, there are plugins to switch tiles, plugins to draw, to export data for all of that.There are plugins already and they add this to the leaflet map. Again, I think if you have a a use case of just having a small map showing some small stuff, or you want to create an MVP, meaning a small app that can do something where, yeah, it's not a product that you might sell, to someone.You can do all of that and it will be completely sufficient and it's probably less work. To just use a plugin to add this to the map. But, if you have a product and you want to control the design aspects of it a lot and maybe animations and stuff like this. I think for every developer it's more convenient to not fiddle in leaflet code for that. Leafet has its own CSS also. So you have to overwrite this or compete with it over the style sometimes here and there. And keeping it out of it just means the data that is displayed and sent between these components has to flow through somewhere. And if you build a big application, you probably have an application state, a state, somewhere like in react, what is it called? Redux, for example. Redux store where you have your data store on the front end. If you use this as your source of truth, you can use it for all the components that you display on top of the map that are not inside leaflet. And it's much easier using that ecosystem then recreating everything in leaflet because if you do that you basically don't need a framework anymore.You build everything in the leaflet library and you interact with the DOM again directly. So the abstraction that we mentioned before you would have to build this for basically everything, right? And there is no reactive thing anymore. You don't get the benefit from React anymore if you add everything directly in leaflet.Jeremy: If you look at the leaflet website, there's a long list of plugins. Is there a set of plugins that you typically use when you know you're building a leaflet application?Sumit: No. Somehow I'm ignorant in that sense. No, not even the address geocoding thing. I also built this on my own. There are many, many, many good plugins there honestly and what I can tell you is there are plugins that we use at SHARE NOW or Car2Go and there are plugins that I constantly see people use together with the library that I am providing and these are something like, what's it called? A marker grouping tool. It's...Jeremy: Clustering.Sumit: Clustering. Yes. Thank you. Marker cluster is a plugin that is used a lot and we use this as well. As the name implies it clusters the markers when you zoom out, so you don't have 10 million markers on a map, but it clusters them together.It's also good for performance, but also for usability. so this is something that is used a lot. And then of course everything that creates heat maps and data visualization are also used a lot, at least from the circle of open source maintainers or open source users that I'm interacting with a lot.But honesty, if you are using leaflet, look at the plugin library, see what is there before you build it yourself. Try the plugin and see if it fits your needs because you don't have to reinvent the wheel. For me, this might be a bit of a sickness and it's fun to recreate the wheel for me.But of course it's a waste of time if you have it already there.Jeremy: Yeah. When we talk about displaying things on the map. We were talking about cars earlier. And you're saying you display them in the form of markers. A lot of times applications want to show information that's associated with those markers, whether that's a label on the map or something else like that.How do you typically approach that? Keep all that information grouped together and keep it updated.Sumit: So I saw different implementations. You can, as always, the metadata that you store with your layers. You can choose to have them its own entity outside of the GeoJSON, or you add it to your GeoJSON. And so the question is, is the marker itself, your source data, your main entity that means... do you, do you associate the style for a marker with the marker or do you store the style separate and store the idea of the marker in the style that is basically an architecture question that you have to solve. An example that might resonate maybe more with everyone is we have cars, right? So we have cars and the cars have metadata like license plate, which tires they have, which SIM cards they use, they have all this, a lot of data, much more than the geo data. And so is the location of the car just the property of the car? Or is everything is from the car metadata of my location data?And this is a distinction you have to decide on your own, how your complete architecture, landscape and ecosystem looks like. For me, I decided many times that the location data is the source of truth, let's say. And I put a lot of metadata in. That is because if everything the user does and everything my API does resolves around the location data I can just as well as use that as my main source of truth.So if we go back to styling. Styling in particular I would store as metadata. So you have a marker and it should resemble a car. So I want a different icon. I maybe want a direction. If we are talking apps like Uber for example, you see the direction the car is heading. So, I might have a degree so I can, I know how to turn the icon on a map. I need a color, and maybe I wanted to display the lane that the car was driving, the route. So I want to display that and all of this data I store as metadata. And if the user changes that, I change it in the metadata of that marker. That means if any team is fetching the data from the API they have also the information how to display it. And this was for me and for us also very important to do. So styling, I would definitely recommend storing this with the geodata as metadata with everything else. Like the fuel level of the car. That decision you have to do on your own architecture wise.Jeremy: If you wanted to make a customizable user interface in terms of somebody chose, they wanted to see what's the speed of the car or something like that and they want to be able to turn it off and on, you would store what the user chose in the same place.Sumit: What the user chose in the sense of, okay, I want to see the fuel level. I want to see the license plate.Jeremy: They want to be able to turn it on and off.Sumit: Yeah. So that is a big use case. I think that's the use case for everyone that displays markers. Let's use Tesla superchargers for example. You have the location of the supercharger and if you click on one, you have the address, the exact location, how many are there? How many are currently free? Blah, blah, blah. All of all of this data like you can you, can you eat there? are there restrooms, et cetera. So Tesla, they don't give you the data as GeoJSON. They give you a collection of superchargers basically an array with objects. And just the location of the supercharger is a GeoJSON.That means you have a big collection of JSON data and a subset property of each entry is a GeoJSON. That is also how we do it at Car2Go and SHARE NOW. GeoJSON is just a part of a bigger, dataset. But in Geoman, for example, I reversed that. I use GeoJSON as the main source and have.Basically most of the data as meta data inside the GeoJSON, it doesn't make a big difference to anyone. It's just a preference of how you manage your data. And again, what is your main entity? Is your entity a car that has a location property, or is your main entity a marker and being a car is just a property of that marker, could also be a plane or user.Because I wrote an application where everything revolves around location data, I chose the location is my main entity and that is the GeoJSON and the fact that it's a car, it's just metadata, a property of it with a specific icon basically, like my application doesn't care if it's a plane or a car. At the end of the day.Jeremy: I guess another thing I'd like to ask about is when people use maps, a lot of times they're on their phones as well as just being on the desktop. what's your approach or what are things to watch out for when building a site that needs to work on both desktop and mobile?Sumit: Yeah. Very tricky. Very tricky. So, two things, if you just want to display it. So again, the use case of you have a location of a business or something that you would like to display. That is no problem at all. Leaflet is very mobile friendly, and if you just display a marker the user can zoom in and out, can easily scroll through.So if your thumb goes on the map and you scroll, you might have seen this with Google maps right in Google maps to zoom or to move the map, you need to use two fingers. They do that so you can easily scroll through the website without, You know that you are scrolling is interrupted by your moving the map instead.So these kinds of functionality to make it mobile friendly is there in all of these libraries. So this is an easy thing. You don't have to do anything. It just works out of the box. But if you create more advanced interactions with an app, like for example, a user should be able to draw a polygon.Then it gets more trickier and geoman and also my library leaflet-geoman they both work on mobile. I think that is also one reason why it's getting quite popular because from my research, it's the only drawing tool that works on mobile. But if you use your thumb, it gets in your small screen, it gets less precise your drawings. So I have features like pinning. That means if you click near a different marker, it just snaps them together. So it assumes you want to place them on top of each other. That's a really cool use case, especially on desktop where you can easily move to markers. But on mobile, it's not that easy.So. I personally think it's possible, but it's not a good way to do that except for let's say you have an iPad Pro with a pen. Then it's really cool. I was really impressed when I used my own library for the first time on an iPad Pro with a pen, because then you can draw really precisely. It's a lot of fun.It's easy to do. that's really cool. So if you have advanced interactions with mobile. Then you will get into situations with click events and stuff like this where it gets maybe a little bit more tricky and use cases where you hover the mouse over the map and you display things or you give the user a hint of what will happen when they click.You can't do this on mobile. It's basically a different experience and in a sense a limited functionality if we are talking about drawing, So this you should keep in mind. If it's strictly about displaying data, you will not have a problem.Jeremy: And in terms of the UI, it may need to be significantly different, right? Between desktop and mobile, do you effectively, hide a lot of UI elements or build two components depending on what size the viewport is?Sumit: Yeah. Yeah, of course. So in Geoman, I use the sidebar for example, right? If you display it on mobile, you don't see the map anymore and it's just a lot of data. So the more data you show and the more metadata you want to display the trickier it gets. How to display this on mobile. And the map should be front and center, especially on a tool that revolves around that.So that means if you open geoman.io on your phone, you only see the map and you have a small icon on top that moves in the sidebar, for example. it's still not perfect. I think it's a limited use case also, my particular app to use this on mobile. But of course, you have the same problem with any website that displays a lot of data.You have to hide some, you might have to, basically also reduce some data like some data that you just don't need on mobile. you will reduce maybe the table, columns, you know, stuff like this to make it as easy as possible for the user on mobile. And they can still open a desktop view if they want to on mobile.But it's such a limited use case. I think that, as long as it looks good and it has the basic information, I think everyone is fine. If you go into an advanced mode than a desktop might be, or a tablet might be the way to go for the user.Jeremy: Yeah. We've been talking about geoman. And you've mentioned that you have a leaflet plugin called geoman-leaflet. Can you go into a little bit about what geoman is and why you decided to create geoman-leaflet?Sumit: Yep. Yeah, sure. So just so you know, it's not a superhero or something like this, it derived from geo management, and I just noticed later that it sounds like some sort of superhero. Anyway, I have a quite, let's say, experience now in solving this for companies and seeing the problems inside companies.And not only in the mobility space, this ranges to communication providers to, again, construction sites and logistics companies and everyone that has, geospatial data. There are two categories I would say that I'm trying to help. One is they have their own application, they have their own team that works on this or multiple teams, and they use open source products or pay products and they need more functionality.So they probably use my open source library if they use leaflet for creating data. And, the open source library basically helps you create and edit this data. So you need to create a polygon. You can create this of course by just providing the coordinates. But if you have a user and he needs to, for example, create a polygon around a building or around a block or a city or whatever, my library is there to help them just easily create these polygons on the map.They can create markers, circles. rectangles, whatever it is. They can also edit each specific Vertex, add Vertexes in-between, they can move the entire shape. They can cut holes in polygons. They can of course, remove the polygon. So all of these drawing and editing features are there. And one product I'm currently thinking about, I'm just collecting feedback. I've not built it yet, is a pro version of that that has even more advanced drawing and editing features. Some companies need to have polygons that are adjacent to each other and cover a complete area like a city, and to have maybe a hundred or 200 polygons that make up the city.So just think of it like sales territories, or something like this. And that there should not be an overlap or a hole between these polygons, so they need to stick together. Now they can do this already with the open source library that I'm providing, but if they want to change the shape of one polygon, they would have to change all of the adjacent polygons as well, which is quite a lot of work.And I could build a tool, for example, where you just draw a new border and it automatically calculates. The new size of each of those polygons and adapts that because you basically tell them, Hey, I don't want any overlap and I don't want any hole in it. And these are such advanced niche drawing tools that require weeks and months of work to build, that I want to wait if people or if companies need this, and I'm thinking about providing basically a pro version of the open source library. So that means I have the open source library that I could constantly maintain for basic drawing and editing needs. I wouldn't say it's basic, it's quite advanced already. Then we have really advanced tools where I think a pro version of the library would be helpful for some companies.And then we have the second bucket of companies that don't build their own application. They just want a way to easily have a service like MailChimp, for example. Instead of managing email, they want to manage geospatial data. So they want a service where they can create data where they can put data in through an API, and where all the teams and clients and apps can fetch the data again and this is where the SaaS application comes in and it's called geoman. There you have a studio where you have your map. Or you can create multiple map like projects, and then you create, for example, one map that displays your supercharging network.You have one map that displays your sales territories or your parking spots. You can create one map that displays airlines, routes, whatever it is, whatever your data is. And for me, the powerful thing that would solve a lot of problems inside companies is any user can draw this.It can be a working student that creates parking spots, for example, in a city. Then you have your developers that not only consume the data, but they can write the data through the API and it's updated everywhere on every consumer. And you can attach the metadata to it to put everything in.Jeremy: To summarize, there's geoman, the application, which is where somebody can draw shapes or add data, probably import data from something like say a CSV or an API or something like that.And then, if you had an application that wanted to consume the data, geoman itself has an API for people to get data back out.Sumit: Exactly. Jeremy: and then geoman-leaflet is the leaflet plugin that you built and use within the geoman application.Sumit: Exactly. And it's open source for anyone that wants to build basically their own application to manage your data. And, yeah, so I got asked a lot, for example, why do I open source this? Because it seems like it's one of the best plugins out there. I don't want to, say anything false here.And I'm not sure about that, but users tell me they switched to it, for example, from other libraries. And I also haven't seen other libraries that provide the same functionality. So it's one of the more advanced ones. And people ask me, why do it open source? Why is it not part of the geoman suite?That makes it a reason for people to use. And the reason is simply that I know that Geoman will not cover all use cases that are out there. People need very specific needs and geo data. And I like open source. I love contributing to it. I love the interaction with the community and I think it's a great open source product.I personally benefit a lot from open source as well specifically with leaflet and I thought Geoman as a platform can be a way for me to earn money and invest more time into the open source product that I can then give back again. You know what I mean? I've maintained this library for four years now. I have not earned a single Euro with it and it costs a lot of time and I feel bad if I don't maintain it for a month or something. But if, if the platform is successful, then I have more time to develop it. It serves multiple purposes, not only for me personally, but also I can maintain the library much better, add much more features for anyone that wants to build their own application with the open source toolsJeremy: Very cool. So hopefully at some point you'll get to spend more time, in your day job working on geoman and working on geoman-leaflet.Sumit: Would love to. Yeah.Jeremy: To start wrapping up, for people who are learning leaflet are there common mistakes you see people make or suggestions you have for people who are learning leaflet now?Sumit: Yeah. If you go through the docs and the standard stack overflow tool that we all use every day you will get quite far. What I see a lot, especially if you use frameworks, and beginners with leaflet, they struggle sometimes to make this mental distinction between these two universes, these two ecosystems.So if you use a reactive framework, like react, vue, Angular, just know that leaflet itself is not part of that. You interact with the DOM directly as mentioned before. So don't expect like this reactivity. This you just provide something and everything changes on the map.This is something where I see a lot of questions happening and then on the other hand, the business side, so think of leaflet and mapping. Like they provide you the tools to display a map and data on top of it, but the business logic behind it, like, how do you store it in what data format or I want the tile to be red if it's overlapping something else, this stuff, you have to build yourself or use a plugin for it. It won't create business logic for you. It's basically a dumb way of rendering data. Not that dumb, but I hope you get what I mean right? You have business logic that is specific to you. And you have to build this yourself. So the library, will not do this out of the box for you. So I see these questions a lot also on the open source repository, where people just expect expect it to do things. but they can easily do this in three lines of code they just haven't wrapped their head around yet. What exactly is leaflet doing for you and what not. So if you have any business logic this is something you have to build yourself and it's honestly quite easy. But if you have any questions I'm happy to help, not only for my open source library, I'm on Twitter and everywhere, so I'm happy to help out.Jeremy: Cool. Are there any specific projects that people should look at. If they're trying to learn how an application should be laid out in leaflet.Sumit: So there are multiple demo projects for basically all the plugins. There are demo projects and also for leaflet itself. You can look at geoman.io, which is quite a, let's say, more advanced use case. Everything from Uber of course is very advanced. And especially in data visualization, they have amazing tools and it's all open source also.So if you look at demo pages from Uber and from Mapbox you will get a sense of where it can go. And then I would personally just look into the DOM and see what they do and how they do it. Maybe they have an open source repository where I can take a look.And also, for example, if you want to write your own plugin with leaflet, look at the open source code. That's how I started. Like I had no idea how to create a plugin for leaflet or how to manage geospatial data, like I just didn't know. All I had was I needed functionality that no plugin had so leaflet-draw was back then the only one. I basically scan their code, looked at how they do it, and try to recreate it and then build my own architecture with it. But it was very similar in the beginning the code. So just look, that's what open source is there for, look into the code, learn from that, clone it, adapt it, and grow from there.Jeremy: Yeah. How about in terms of like a full application, you're mentioning to look at, say geoman.io but geoman itself is not open source, is it?Sumit: No. Yeah, so I'm not sure where are to look there. what I do sometimes. Okay. You have stuff like geojson.io. it's a small utility to, to edit GeoJSON. I've built the same on geoman.io. Oh. But yeah, it's a different one from Mapbox and this is open source. It's a smaller, application that, yeah, that you can take a look at.What I can also recommend is if you for example you want to create a leaflet map and you don't know how. The demos are not enough for you. You want to see some real use cases with React for example, what I do is I used the advance search functionality on GitHub. So I look at who uses leaflet and who use react, and maybe I searched for the inception code from leaflet, and filter by it.And then you find a lot of applications that basically use it like this, and then I scan around and try to find someone that uses the same stack as me and how they do it. And this gives me a lot of inspiration.Jeremy: Yeah, that's a great tip. Not just for leaflet, but for anytime you're trying to learn a new library. Yeah, it's really helpful.Sumit: Yeah. The GitHub search functionality is underrated, I think in that sense.Jeremy: Cool. well before we finish up, is there anything that you think we should have mentioned or we should have talked about?Sumit: I think we had a quite a awesome overview. There is not much to mention if you are into geospatial data or if you have the problem to solve the problem, for your company, or a client or whatever. I think it's quite the interesting field. It's quite niche also, but yeah, the user experience is... It's amazing, and you have such a big impact. If you build something nice on top of maps. And of course it's a field that is very, very future-proof. It doesn't matter if we're talking drones, autonomous vehicles, sales, like everything in the mobility sector specifically, but everything around us needs more and more location data because everything is connected. And, I think it's a field where you as a developer, if you have these skills are very good equipped for the future. So, don't hesitate to get into it and code a bit around it. I don't think you will regret it.Jeremy: Cool. And for people who want to follow you or see what's going on with geoman, where should they go?Sumit: So, I'm on too many platforms, I say, but, I'm active, very active on Twitter. my handle is tweetsofsumit. There I'm the most active. So if there is anything you would like to ask, or if you want to follow me and you know, see where geoman is going, the open source library or also even how to create a business, like I'm not an expert in it. I just share my journey. I will share everything on Twitter. And if I have a, for example, a YouTube video or even a guest appearance on a podcast like this one, and I will share everything there. So I think that is the best way. And of course, you can go to my website, which is raum.sh raum is a German word, raum for a room. raum.sh Is my personal website where I also post occasional updates here and there.Jeremy: Very cool. Well, thank you so much for talking to me today SumitSumit: Thanks for having me, Jeremy it was nice talking to you.
undefined
Sep 23, 2020 • 1h 3min

WebAssembly on the Server with Krustlet

Taylor Thomas is an Engineer at Azure, the core maintainer of the Kubernetes Package Manager Helm, and a member of the Krustlet team.Timestamps[00:55] - Kubernetes[07:37] - WebAssembly[12:06] - WebAssembly Runtimes and WASI Specification[15:42] - WebAssembly vs Containers vs Native Binaries[25:11] - Krustlet and the case for writing it in Rust[30:52] - Missing APIs in WASI[33:38] - Wascc vs Wasmtime runtimes[38:15] - Rust ecosystem for Kubernetes and WebAssembly[40:23] - Comparing other languages to Rust[45:09] - Rust learning curve, experiences as a beginner[53:16] - Next steps for Krustlet and WebAssemblyRelated Links@_oftaylorKrustletKubernetesOpen Container InitiativeWebAssemblyWASIWasmtimewaSCCWebAssembly meets Kubernetes with KrustletIntroducing Krustlet, the WebAssembly KubeletKubernetes: A Rusty FriendshipThe Safety Boat: Kubernetes and RustA Heaping Helping of StacksThis episode is also posted on the Rustacean Station feed. Check it out for episodes all about Rust!TranscriptYou can help edit this transcript on GitHub.Jeremy: [00:00:00] Hey, this is Jeremy Jung. This episode, I'm talking to Taylor Thomas about running WebAssembly on the server with Rust. He's an engineer on the Azure team. The core maintainer of Helm, which is a package manager for Kubernetes. And he's currently working on Krustlet, which runs WebAssembly applications within Kubernetes. Also, during our conversation, you're going to hear us talk about WASM. That's shorthand for WebAssembly. All right. I hope you enjoy my talk with Taylor. Taylor thanks for joining me today.Taylor: [00:00:30] Thank you for having me, Jeremy. This is my first Rust related podcast so it's exciting for me. I'm a fairly new rustacean all things considered, so I'm happy to be here.Jeremy: [00:00:41] For people who aren't familiar with the world of kubernetes and containerization all these different things. Could you start by explaining at a high level what Kubernetes is?Taylor: [00:00:55] Yeah, so Kubernetes, is a container orchestrator. And you've probably at least heard of Docker if you're in the technology world even if you haven't used it. But Docker is a technology that made something called containers popular and useful. Those technologies have been around for a while inside of the Linux kernel that's why they call it a container everything is inside of this thing and contained in this process. It uses C groups and kernel namespaces. There's a couple of different things under the hood that are going on. But basically it allows you to create an artifact that can be bundled up into something called an image.And that image can be passed around and then be used multiple times. So often people will compare those to VMs. VMs were a big revelation, right? Because you didn't have to go literally put in a new blade if you needed a new server or reboot something you had to instead just say, okay, I want a new VM. But you still had to install a whole operating system. It was like spinning up a new computer. And so containers made that even more simple because instead of having to do that, it's using the same shared underlying, kernel calls and things underneath the hood, but everything's isolated. And so if you want to spin up three different instances of an nginx server all you'd have to do is create three containers that are all running at the same time and those will all have the same specification that you basically baked into there. An immutable artifact. If you want to create a new one it has a new hash and a new version. So this was really good for people, but the problem is how do you orchestrate it across everything?And that's where Kubernetes came in. So Kubernetes takes that and says, okay, well we have a huge fleet of nodes. That's out there. How do I schedule each of these containers properly so that they're either not on the same place or that they meet certain requirements or that I have a certain number of them.It takes care of all that including some of the underlying networking connections so that you can connect all your containers to each other in a distributed and actually self-healing way it goes through loops. And if something goes wrong, it will try to heal that container and make it come back to a normal running state.And so it is a very powerful technology it's caught on... maybe it's caught on a little too much in some people's opinions, but it's a useful tool for containers that came about and a lot of people are using it for underlying infrastructure projects.Jeremy: [00:03:16] It reminds me a little bit of when you're using something like AWS and it has auto-scaling for VMs. And let's say that you have an application and it runs on virtual machines. And you would be able to tell AWS as more people are accessing my application I want you to create new VMs [and] run new instances of my application.Something like Kubernetes is able to do something similar, but maybe at a more generic level of being able to figure out. Okay. How many machines do I have access to? And anytime I'm asked to run something, I'll go and find the right machine to run it on. And it's always going to be in the context of these containers. Did I kind of get that right?Taylor: [00:04:15] Yes, it's very much a generic tool across all these different things. And nowadays you can run it with, with windows or Linux or mostly windows and Linux. But there are some limitations there, but yes, it's a very generic tool. In terms of being able to connect what you want and use what you want to, to make this distributed platform easy.And like you said it will scale things. If you've done something in AWS where you have the elastic things where they'll scale it's a similar thing. There's something in Kubernetes that you can set up that it will automatically scale to make sure that you have the right around right amount of capacity for what you're doing.Jeremy: [00:04:50] And I think this takes away the need for the developer to need to understand where their applications are going to run. It's. You give some kind of configuration to Kubernetes and say, here's my app. And then it just figures out, where to run it and how many instances to run and that sort of thing.Taylor: [00:05:12] Yeah, it's really meant as a dev ops or SRE kind of tool that instead of it being a. they have to like custom tailor machines. They already have these Docker images in place and can just run them, like you said. you just give it the configuration and it is still involved. It's not like a magic bullet there, but you don't have to care as much about where it goes if you've configured everything properly, it just kinda does it.Jeremy: [00:05:36] I wonder if you could paint a picture of, where containers fit in between just running a full machine versus running a single process on a computer.Taylor: [00:05:47] Essentially a container is just a running process, a single process. And because it's a single running process, you can run multiple of them on a machine. The tools that you need installed, all the binaries, all those different things are encapsulated inside of this container.And so that way you can run 10 of them on a machine instead of spinning up 10 servers. So it allows a little bit more density, and obviously you still have to be careful about what noisy neighbor problems and other things that normally happen in infrastructure, but it allows for much more condensed areas. And the spin up is much quicker. If you have a small image that you have to get getting a new one and spinning it up is easily under 30 seconds every time. bigger images can take longer, but even then it's still faster than provisioning a full VM for what you're trying to do. And so this fits for a lot of different kinds of services that, don't need like very, like large, large requirements on the system.And even if you do have large requirements, you can use Kubernetes in a way that can be helpful.Jeremy: [00:06:45] And so you're talking about how these containers are a process running on the system. So for example, if I were to in windows, look at the task manager or in Linux, just run ps, I would see individual processes for each of these containers that are being run.Taylor: [00:07:05] Yes. It really depends on the implementation, but essentially that's, what's going on. There's with Docker, there's some other underlying details, so we don't need to get into, but essentially you would see processes that, that are spun up that are, are doing the work, but they are just processes underneath the hood instead of being a full operating system.Jeremy: [00:07:23] Next, I want to talk a little bit about, WebAssembly, because I know that's an important part of the krustlet project that you work on. Could you explain a little bit about what WebAssembly is at a high level?Taylor: [00:07:37] Yeah, so WebAssembly, as you can probably guess from the name is a tool that was originally designed for the web. Now, that's I think the original creators didn't necessarily intend it to be that way, but that's the name that it has. And that's what it's used for. There's multiple, companies and, big websites use WebAssembly. And the idea behind WebAssembly is that you can create a binary that can then be consumed. So it's compiled code that can then be run in the browser. So giving you the speed of compiled code. While also, still being sandboxed inside of the browser sandbox environment.So this allows for some very performant things to be done. You have things like rendering tools. I know one that I've seen an example I always give is one. I like Autodesk, which they're maker of CAD tools and rendering tools. They have a lot of online things that run WebAssembly. Because it's, it gives it the performance.It needs to be able to do those more complex tasks. And so that's where WebAssembly started, but the idea is WebAssembly could be used anywhere. And when you pull out a WebAssembly from just the browser, it allows you to have kind of a universal interface and what this is called, they've defined it. And there's a working group and it's still very much a work in progress, but it's called WASI, which stands for WebAssembly system interface.And that interface is a definition of basically how it can interact with the system. And the nice thing about it is it still that sandboxed model, you have to grant explicit permission to do anything. So if you want it to be able to access a file somewhere, you have to grant access to that file ahead of time before you start it, It's not there yet, but I assume when we get to some of the networking socket support in there, it will also have those. You have to grant specific permission for it to do it. Whereas opposed to containers, you have a little bit a different security model because you're still running like into the normal Linux security things that you have to deal with.there's ways to break out of a container. There's not really a way to break out of a WebAssembly module in the same way. because. You, you're not, you're not running in these, like you're running a compiled code thing somewhere instead of basically like a shim over specific binaries or things that are being run.so that's, that's the main difference, and this allows things to be run on any system. one of the things that we, we saw with Docker and that we were really hoping when, when we, when we had that Docker was that it could run anywhere. But if we're being honest with ourselves, it can't really run anywhere Docker and containers in general are a Linux tool. Now there's people, well, who at Microsoft and elsewhere have made Windows containers, a thing, they work, they work well there's some really cool work they've done. And there's nothing. I have nothing to say against that. They're they they've done great work, but, but really if you have a nginx container, You can't run that on a windows machine.You can run it on a windows machine, but it's technically running a Linux VM behind the scenes, same thing on a Mac. And so it isn't truly this, ideal of write once run anywhere. Now, obviously there there's still technical challenges there. You can't do it completely, but you can come. So if I compile a WebAssembly module, a WASI compatible.WebAssembly module on my Mac. I can pass it to you and you can run it on a windows machine, on a Linux machine, on a raspberry PI. It doesn't matter because, and it'll be the exact same code. so that is a very powerful thing that we saw in WebAssembly that fits also fairly well inside of this container space and what we could do with it.Jeremy: [00:11:12] Help me to understand a little bit more about what WebAssembly actually is, because does WebAssembly itself have byte code or some kind of language that you're, you're passing to? a runtime, like for example, an example I can think of is. the Java virtual machine where people can write many different types of languages, such as, I believe Clojure and Scala, they can write a language that generates Java byte code and is run by the Java virtual machine.And so you can run those applications anywhere that you can run the Java virtual machine is WebAssembly similar? Is there a language for WebAssembly that is the target for say rust or C or different languages that you want to run in WebAssembly?Taylor: [00:12:06] That's a really good question. It is similar in the sense that it does write code that then can be interpreted anywhere and there are various WASM runtimes. The reference implementation that we're using within the Krustlet project is called Wasmtime. And that's the one that's following the Waze spec, and is essentially the reference implementation for the WASI WASI specification.And, that one can run on, on any of these systems that you've mentioned, and it has a byte code that it interprets in that way. And there's very, there's, there's a lot of technical details we could dive into there that I'm not a huge expert on, I know the basics, but there's some that are just like JIT compilers, right?So just in time, there's some that are like, compiled at, like a pre compilation step that can happen before. all of these things can happen but the thing is, is this is much more lightweight and small and constrained than the Java the JVM would be in this case. Right. So it's a, it's a much smaller and compressed use case, but it does have the similarity that it is byte code being interpreted by some sort of runtime.Jeremy: [00:13:09] And, and this, this byte code, cause you've been talking about how there are different, I guess, implementations of the WebAssembly runtime. You, you gave a Wasmtime as an example of that. Does that mean that as long as you target, the WebAssembly bytecode that any of these different runtimes could run that code?Is, is that how it works?Taylor: [00:13:35] Yeah. And that's why the WASI specification is kind of the thing that's making it work on the server side rather than just in the web is because we need those specifications for how to interface with different things on the machine. and so there's also this other thing called interface types, which also defines like how you can interchange different data types in between different parts of applications.To be clear. This is an area that's still under heavy development, for example, Wasmtime and WASI in general, hasn't even finalized their network specifications. So you have to kind of do work arounds and things to get net working in place. Now that's coming this, but like, this is very bleeding edge in that sense.Jeremy: [00:14:17] And so this bike code can run in the browser in whatever WebAssembly runtime is built into the browser. And then it can also run outside the browser in a runtime, such as Wasmtime. And it sounds like the, the distinction there is that. When you're running outside of the browser, you need some kind of consistent API to be able to access the file system, access network, things like that.which would normally be handled by the browser, but once you take it outside of the browser, you need some common interface that knows how to make system calls to open a file on windows versus open a file on, on Linux. And that's what a runtime like Wasmtime would do by implementing what WASI describes is that sort of, did I get that right there?Taylor: [00:15:10] Yeah, that's a, that's a great summary of how this works.Jeremy: [00:15:14] Cool. so one of the questions that, I think people often have is when you work with languages like rust or, or go, they can be compiled ahead of time, you can get a binary that you can run directly on your operating system, for languages like those what are the benefits of using WebAssembly to run an application versus just using that binary?Taylor: [00:15:42] It just depends on, I guess, the situation you're trying to do with it. Like there's certain tools, like the idea is what we have with, with WASM right now, isn't meant to replace Docker entirely, I think it opens up new cases for people who aren't, who aren't like already into Docker or need to, or have some specific things.But also this makes there, there's a certain amount of portability and security that comes from using WASM. obviously if you build a binary and then you build it for each system, you have the, the native things all built in all ready to go, which has a distinct advantage over some, like having to work through an interface, however, that security model is what gives us, a lot of hope for the future of this, because you have to explicitly grant permissions.and it's a compile once. I don't have to compile my windows version. I don't have to. So you don't have to worry about cross compiling tool chains or the different VMs that you have to spin up to build one for ARM and for Mac and for Linux and for windows and you know, all those, all the different targets.You don't have to worry about that with wasm you just build the, build the one binary and it's ready to go. it's also very, very small. If you do some, even without optimizations, you're talking like a meg or two megs for for a simple application. As opposed to a full binary, when, when you build it and rest binaries are fairly small, go, binaries are larger cause they've compiled in the runtime, but like rust, even though their binaries are small, this is even smaller. And if you strip it out, you can get it under a meg, depending on what it's doing. Now that size really matters with something like Kubernetes.So if you start with a container and you pull down an image, some of those images, even if you're using like the very slim things are still. 20 30, 40 megs. Now that's not a big deal, but there's some of these bigger ones, especially when people are doing, some of the bigger applications are close to a gig.Now that's not recommended. I know people say, but that's not a recommended practice. Yes. I know it's not. But in PRA and what actually happens in reality is people will do that. And so when you pull down, if you have a new version, even though they've cashed certain layers and things are the same, it's still takes a while to pull the new version and start it up.Whereas WASM is just these tiny modules. And even if we haven't done something super big, but I'm guessing that even if it's only, even if it's huge, it's only a few megs. And so then you're, you're pulling this down and this is very quick and very fast. And the added security benefits on top of that, where you're not having to deal with the same security layers as what is available inside of a container, that, that, yeah, very explicit grants on your security surface.Is a very powerful thing for us inside of Kubernetes on the server side.Jeremy: [00:18:29] And that security piece is a little interesting, is that when you're building the WebAssembly application that you're explicitly building in this application will have permissions to the file system at these paths, or kind of what does that look like?Taylor: [00:18:48] So it's not built in at compile time. Now there is one. when we get talking a little bit more about Krustlet, we have another implementation and there's one that, was started by capital one called wascc. which is WebAssembly, secure capabilities, connector. There's lots of W's here in this space.And so this, this wascc runtime actually does things called capabilities where you can, explicitly you're actually supposed to explicitly say, I need this capability and I'm signed to have this capability, but WASM by default and the WASI spec, you grant it at runtime. You say, I'm going to give you access to this file, or I'm going to give you the permission to do this thing. So those things are done at the very beginning of your runtime, not when you're compiling it.Jeremy: [00:19:33] Hmm. Interesting. And then, so that would be some kind of configuration file, maybe that would say, okay, when you run, this WASM application. Only give access to these permissions. And like you were saying before, that's much simpler than something like a Docker container where your permissions model is actually based off of a Linux operating system.Taylor: [00:19:57] Yes. It's much simpler in that sense. I mean, you're still, there's, there's more overhead involved if you were to spinning this up completely manually, you have to say, okay. I need to give access to this directory and I need to give access to this thing, but Kubernetes kind of takes care of that for you.And we do that inside of Krustlet with, with what we're doing to kind of abstract some of that away. So you don't have to do it. So, I mean, it's, it's like most security tradeoffs, right? Like the most secure thing is a server. That's not connected to the internet and is inside of a locked room, right? Like that's, and you can only access it in, in that room with a badge.Like that's, that's like the most secure you can get, but that computer is kind of useless. so that's the idea behind, the, the security trade off is now you're just being explicit about what you're granting instead of just having all these implicit things of, Oh, I can access the network and I can access this thing and I can access this thing, which comes by default as running a process on a, on an operating system.Jeremy: [00:20:50] When you were talking about being able to run the WASM application anywhere, that sounds like. benefit because normally when you have a build server, for, for example, an open source project, you'll see that they have to target all these different operating systems. They may have to build their application for, six or eight different OSS.And in the case of using WebAssembly, they could just build it once and it would technically run on any of those. It sounds like. We've been talking about WebAssembly on the server. before we get into how Krustlet runs WASM applications, if I had a WASM application and I just wanted to run it on my machine, what does that look like?Is there some kind of package manager or, how am I running these applications?Taylor: [00:21:43] So there's a couple of different ways that these can be shared around. There's no, specific. implementation right now. the thing that we started with, with inside of Krustlet itself, as we use the OCI specification, which is the exact same thing that the containers use, it's just storing as a different artifact type.Uh, there's some work by the people, working on WASC, they have something called gantry. and there's a couple other people who are looking into how you're supposed to store modules. So the idea behind all this, we have to figure out what exactly want to do. So it's kind of a little bit in the air. we have a tool that was built by somebody on the team called WASM to OCI, and it does the work of pushing that to a container registry that supports Arbitrary artifacts. And so what you can do is take that and pull that down and it just pulls down the compiled module file.It's always something.wasm, and then you can use whatever runtime you want to use to run it. So you can download wasmtime and install that you can down. You could do a there's Wasmer, there's Wasm3, which is for kind of optimized for embedded devices. Those are the kinds of things that, that you can do.Just, you have to choose the runtime that you want to use. And then you have to grab the module from somewhere, wherever that might be right now, which is still, like I said, a little bit of a loosey goosey kind of thing.Jeremy: [00:23:10] and, if I was just in the context of running an application on my computer without involving Kubernetes or anything like that, would it be where I get a single file? That's the WASM application. And then I pass that into say wasmtime or something like that, just at the command line. And that's how I would run it.Taylor: [00:23:32] Yeah, that's exactly how it would work locally. Now, obviously we're, we're working on building us to make it even more fully featured. The ideal would be that then you can have an application that can be easily swapped across machines. I mean, imagine if you had something like notepad. Or some sort of editor and then you could do it there and then just take that same application and then have it somewhere else.without worrying about what kind of system it is, you could have your raspberry PI 4 running a random desktop somewhere, and you could pass it over to a server and, do a virtual desktop session. Like you could do anything crazy that you wanted to by passing that around.Jeremy: [00:24:09] cool. Yeah. I mean, it's, it's this idea of having a truly universal binary, I guess, having the ability to copy this application anywhere and run it without having to worry about, I guess basically anything you just need to have a wise and runtime.Taylor: [00:24:26] Yeah. And that gives it some quite a bit of power. Right? Cause you could jump from an edge device. I'm using edge in the loosest possible term. Cause it could mean anything, but like any type of edge device, all the way to. I server I'm running in a data center. It can be anything along that whole spectrum of different servers that can run this, any can pass it around.And it can, you can even start to think about how you could maybe hot swap it, right? Like you could point at one running implementation of it. And then when you don't have access to that, because you lose internet, you could point it at a local instance of how to run this.Jeremy: [00:24:57] very cool. now maybe we should go a little bit more into Krustlet. I know you've talked a little bit about what it is, but maybe you could go a little more into detail about what Krustlet is.Taylor: [00:25:11] Yeah. So Krustlet stands for Kubernetes Rust kubelet. Lots of Ks. The main idea behind Krustlet was we wanted to create, we want it to reimplement the kubelet, which a kubelet is basically the, the binary that runs on a Kubernetes node that connects it to the cluster and runs it and joins. So that's, that's a kubelet and. we wanted to write it, write it so we could do WebAssembly. And now the reason we wrote it in rust was for a couple reasons. Number one, since you're probably listening to this because you like rust and you do rust, rust has probably the best WebAssembly support for, for server side things.most languages at this point actually have WebAssembly support, but it's mostly geared towards the browser, but, WASI is something you can easily add in. It's actually, you just use rustup and do rustup component or rustup target add wasm32 dash WASI. And that's how simple it is to start compiling for WASI compatible WASM binaries in rust.and so that is a, very powerful and useful tool to have around. If you can do it that easily. If you look at some of the other examples like C or C plus, plus you have to, customize clang properly or download the CA the already preconfigured tool chain and use that clang and then also you have a couple of other languages that support it, but rust was fully featured language that we could use it with, but also rust has caught our attention for a while, just because of its application inside of distributed systems and compute, like normally it's looked at as a systems engineering language, but can we use it in these cloud applications, this idea of cloud native as the buzz word right now? can it be used in a cloud native way? And so it was a secondary goal here was to prove that it could be, could be done like that.Plus you add on all of the safety and security and, correctness features inside of rust. And it really helps us out. So we wrote several blog posts about that, that I can probably send around or, or link around if you send out show notes. but the main idea is that pre, it has prevented us from shooting ourselves in the foot.Go has really easy concurrency things. There's sometimes that's one of the things that I most miss about using go for, for some of this is if I want to do concurrency. work. It's quite simple to set up, it's built into the language. It's very simple to do. But the thing is, is that there are bugs that we got caught where we'd sit there and be like, where everyone gets mad and like, why are you getting mad at me borrow checker, like, well, I've done everything. And then you're like, Oh, like it's caught that down the line. I'm going to, if I were to do this, I'd have two people trying to read the same data and that has been very useful and exciting for us because it's stopped us from doing those things. So the guarantee that when your rust code compiles, that it will be correct, even if it's not necessarily the right code, it's at least correct.And you're not going to have weird database access issues. Inside of, of your code. And so that was the secondary reason that we found it was, it's just very powerful for doing that. And it also it's, it's expressiveness with traits and how generics work gave us a good deal of flexibility inside of Kubernetes.having dealt with both extensively with both the. Kubernetes rust client and the Kubernetes go client, the Kubernetes rust client, it's much more ergonomic due to how traits work and generics. And so that was just something that we really enjoyed coming over. But it also, like I said, the main thing was that it had such good WASM support already built in and really most of the WASM runtimes are being written in rust right now.So it makes sense for us to be in this space. So that's why this project was created. was to do is to have the ability to easily run, rust things inside or easily run WASM inside of Kubernetes. So that's, why we used rust and why we came up with Krustlet.Jeremy: [00:29:07] Yeah, that makes a lot of sense because it had reminds me of, I had a conversation with a Armin Ronacher in an earlier episode and he works with century on different debugging tools and things like that. And the reason why they chose to use rust in parts of Sentry is because there are a lot of existing tools or a lot of existing crates in Rust related to compilers or related to, reading debug files and things like that. And so in your case, the, the community had already done a lot of work in WebAssembly, and that's why it made sense to choose rust. So I think it's interesting how there's these certain niches that people have built up and, continue to cultivate and continue to bring additional projects in because of that, that base it's, it's very cool. You had mentioned a little bit earlier about how rust has really good support for WebAssembly. And so it's easy to get something up and running in WebAssembly using rust. but you had also mentioned with WASI that there are only certain features of the system that are implemented.Like you had mentioned that there aren't network calls implemented yet. And I wonder when you're writing a rust application, do you have to do anything special when you're trying to target WASI, for example, if I were to make a network call in my rust application, and I try to compile it to be run in WebAssembly is the, the command line tool going to tell me you're trying to use an API that, that doesn't exist.Like how does that part work?Taylor: [00:30:52] that right now is a very rough edge. It just depends. I haven't tried to direct compile in with a network call yet, but. Most of the time, if you're doing something, that can't be compiled, the co the compiler will go in and say, like, I can't find a linked thing. Like when it's trying to link and do things, like, I can't find anything that links this together or that I'm able to compile this in and it'll spit out there.And sometimes it's very obtuse. It just depends on what it is. This is, like I said, it's a very rough edge right now. because it is so new, when, when you are compiling, but for the most part, you actually write things the same way. There's some things that you might need to pull in, to make sure that you don't, do something incorrectly or that you've had the correct things attached to the data structure or whatever it might be for the specific case.There's actually some good examples inside of like wasmtime and a couple other places. But for the most part, you write code pretty much how you just normally would.Jeremy: [00:31:45] so it sounds like currently, like you said, you pretty much write code as normal. You run it through the command line utility, and then you may get a helpful error message or you may get one that's really hard to decode and then you basically start digging around the internet, trying to find out, what might be wrong.Taylor: [00:32:05] Yeah, that's kind of how it goes. Now. Luckily, people are very responsive to this, in, in the community. but we're working with them to, to work around this and there are possibilities of using networking. So, the wascc examples. That's one of the reasons we chose to use wascc is because it has networking support built in.Now, the way it works is kind of just working around the problem. You have a capability that's built for the native system. And so if you're doing it on a, on a Mac there's, you can either load it from like a dylib file or like a, an object file of some kind, or you can have it compiled in, which is what we do in Krustlet.And so when we build it for windows, It gets that compiled part for windows when we build it for, for Mac, it's going to have that compiled thing in there. And so like it's just working around it by creating a component of the system that is then linked by wascc to be able to talk forward calls around.And so when something calls the networking thing, that call gets forwarded to the WebAssembly module, which handles it and passes it back out. There's not actual networking. Inside of the module itself. And so it works around it. It's a bit different of a model. It's an actual better model as opposed to, the wasmtime implementation and Crescent, which is more of a working like a standard container.I use that term very, very loosely. It's more like I have a process I'm running that process as, for as long as it wanted to. And then it's done as opposed to responding to specific like action or actor calls that come in from a, from a host.Jeremy: [00:33:38] so wasmtime and, and WASC are two, different WebAssembly run times. when you're talking about these processes running as actors, I guess, would that be. This would be a case where if you want to run a number of processes and you want them to communicate with one another, that's when you would choose wascc.I'm trying to understand when you would use one runtime versus the otherTaylor: [00:34:04] Yeah. So right now, like if you were to go download Krustlet right now and try it, basically you have, if you were going to be using Wasmtime, the only way to communicate is by like data on files. Cause it can access files. So you could mount a shared volume and then like pass it off so it could do data processing, but it can't do communication very well until we get the, until WASI finalizes, how it's going to do networking.if you want to do a full thing with WASC. So the thing with WASC is, you can use it entirely out of Kubernetes as its own thing. It's almost, it's, it's very similar to like a functions as a service kind of kind of tool. You write your things in whatever language, compile them to a WASM module. And then it handles connecting all those things together.And the capabilities model around it gives it a security, an additional security layer that you have to, things are signed. So all your modules have to be signed. And then you have to, say that it has access to specific capability. So whether it can access another capability that another WebAssembly module is exposing or whatever, you can glue all these together so that you could pass calls around.and it also has actually a, an ad hoc networking tool called lattice to connect multiple nodes together. So it can run entirely outside of Kubernetes. but it also has a bunch of tooling and things around it. So when you're going to do it, you're buying into that system and you have to know that that's what you're doing.Wasmtime sometime is meant to be, like I said, following that same WASI specifications. So as soon as WASI gets networking and then we'll implement the networking stuff, just because we want to make sure that we have the more, like here's the vanilla option, how you glue it together. If you want more of this like quick functions kind of thing, then wascc is an amazing tool.And so we implemented that because we've been, we've been collaborating with them for a long time. and so we've, we've had that implementation there so that it can be show another way wascc can be used while also helping people who want to try this right now and hack around to do things with networking.Jeremy: [00:36:00] Does that mean there is some specific API call, I guess, that you're using within your WASC actors that allows it to communicate with the other actors. but it's not like general network calls. It's, it's a very constrained API. Is, is that right?Taylor: [00:36:16] Yeah, there's an underlying thing called they call wapc yeah. WAPC so it's a WebAssembly protocol. I can't remember. Basically. It's a, it's the protobuf of the, this world. it's a message protocol. Here's how I'm sending a message back and forth.And so each actor is what it's called, is a WebAssembly module that can respond to specific events. And so it registers specific event handlers for those events. And when the, the underlying host that's running all these, gets it it dispatches those events to the actors and to, to run.Jeremy: [00:36:49] and so if you were using that with Krustlet then, the, the way that the actors communicate with one another or communicate with the host, that would be configured automatically by Kruslet I guess, or, or what is the, Taylor: [00:37:04] To an extent.Jeremy: [00:37:05] Okay.Taylor: [00:37:06] That's why I was saying that it can be used more fully featured outside of the Kubernetes world. but it has a distinct place inside of Krustlet as well. And so Krustlet will do some of the configuration to a point like it makes sure all the capabilities and stuff are configured, but we're, for example, we're still trying to define, well, if somebody wants to add other capabilities, How do they define that?Which normally you do, there's a freestyle block inside of a Kubernetes configuration called an annotation. So we're thinking, well, do we do it with an annotation or do we do it with another tool we don't know yet, but right now it does that base configuration link of it. It'll set up, make sure you have like an HTTP server access and that you have access to do logging and that you have access to do all those things.It sets up for you. But we have to still define a way of what's the, what's a safe way for a user to say, I need this capability.Jeremy: [00:37:56] When you've been developing Krustlet you've, you've mentioned how there's a lot of existing WebAssembly, capability in rust or projects, built in rust. are there other parts of the rust ecosystem or specific crates that, that helped you speed up your development a lot?Taylor: [00:38:15] Yeah, there are, in terms of development tools, I have really loved, Cargo expand when they're doing some of the macro stuff. Async things do a lot of like additional, macro things that like build stuff out or, or wrap things in. So it's kind of nice to see that. I also learned about a bunch of different tools around how to debug stack overflow's because we were accidentally pinning some stuff too early and it was causing a stack overflow that we didn't discover until windows. because there's a smaller stack and so that, like, there were some really interesting things on the nightly compiler, like how to print, type sizes, that was really helpful in identifying things that could have gone wrong. and looking at like our other tools we've really enjoyed, the Kubernetes crate for it. It's called uh Kube. And that one has some really awesome, awesome tools in it, that are, very helpful. I think a lot of them are things that people have heard of ser serde, or sir, I can never, I feel like that's Jeremy: [00:39:17] Right.Taylor: [00:39:17] that's a constant debate in the rust community.Um,Jeremy: [00:39:19] I, yeah, I thought it was Serde, but I don't know.Taylor: [00:39:23] And I think it's Serde because it's serialize, deserialize, but anyway, yeah, so that's there, that we've used a lot of, and has been very helpful. obviously the tokio runtime has been helpful for us as well. But really like, it just depends. Like if there's not like any other crazy tools we've used outside of it, I do have to say a huge, a huge thanks to those who are, those who were working on, like the Kube crate and, and some of these other things we've, we've been able to contribute back as we found bugs, and other stuff.I love the rust rust TLS crate as well, which has been very helpful for windows because then we don't have to have an open SSL dependency. and so that those are, those are the different things that have been, I guess, really helpful to us as I've like looked through and seen all the different things that we've done in our, in our code. Jeremy: [00:40:11] At the start of the episode, you mentioned you're relatively new to Rust, the languages you had used previously. would that be like go or what are some of the other languages you have a lot of experience with.Taylor: [00:40:23] Yeah, I am a, re stationed by way of go. I've kind of like moved all over the place. so I've done a lot. I mean, I've done a lot of Python, like for glue code when I've done some more SRE work. I did node, back when it was starting to become a thing. Right. And then moved on to go, and then we have.and then I've been doing some rust, since then, so yeah, it's just, I come from go, that's my main background before this was a lot of go because I was in the Kubernetes space, really heavily, well, I still am. And so that, that's where I got my go experience from. So that's my, that's my background where they're coming from go to rust is really the current thing that that's happened.Jeremy: [00:40:59] And since you, you have experienced with go, I'm wondering, are there things that you miss from go either in the language itself, the runtime, or even just the ecosystem that you wish that rust had?Taylor: [00:41:16] Yeah, there's a few things we've run into here. I overall, I am very pleased with, with, with the rust, and would choose it for a lot of different cloud projects, depending on what you're into. I think. Go, this is totally personal opinion here, but I think go is very well suited for smaller, like true microservices or anything.Similar, very, very small constraints tools, because it's quick to get started. It has such a constraint vocabulary. And one of the things I appreciate particularly, and some people hate this and some people like it, but I like that generally there is a way to do something in in go. There's sometimes two, sometimes three, but most times there is a way to do it.And when coming to Rust, there was 40 different ways to do the exact same thing. And so that was, that was something that I missed. The other thing I've said is that I really, I kind of mentioned, I said before that the concurrency story is much better in go and now it doesn't provide the same security or not security, correctness that.Rust provides, but it's so much easier to get started and to spit things back and forth. And it's just been, as a relatively new rustacean, I kind of get frustrated by the fact that there are two, three async implementations.Jeremy: [00:42:40] Tokyo, async standard.Taylor: [00:42:43] Yeah. Those are the two that I know. I think there was one more right?Jeremy: [00:42:46] I think there's uh smol, I think it's how you say it.Taylor: [00:42:49] Yeah, SmolJeremy: [00:42:50] I don't know.Taylor: [00:42:50] It's supposed to be like a tiny yeah. But yeah, it's so, and I know that's a common complaint and people have done a lot of work on those, but it's very frustrating because you like get bought into that specific implementation and it's like, let's choose one and make it part of the standard library, or make it the blessed crate or whatever it is just so we can all standardize and not have to like have weird hook ins to each run time and. all those those kinds of things, so that that's been fairly difficult to, to like work with sometimes, most of it's just otherwise like ergonomics.I know that, we're going to be writing a blog post on this soon, but inside Krustlet we were, doing a state machine graph. It was more, it's kind of like a cross between like a traditional state machine and uh walking, a graph, that we were doing to be able to. Encapsulate the logic inside of, inside of Krustlet a little bit better.Cause they were turning into monster functions that were just ridiculous. And because of that, we started to run into some of these things in go, you have the, you have their interfaces, right? And these interfaces, you can just keep calling through and chaining through, but because of how the, the type security that comes through, Through the trait system and things here in rust, it made a really difficult to find a way to just iterate.And we finally found a way around it. It was a little bit, a little bit clunky and well like I said, we're going to write a blog post on it. Kind of explain everything, but there's just some of those things where the boundaries rust puts on you make some things very difficult to figure out, to make it work in a way that's both rusty, but also readable, to someone trying to write it.that was one of those things. Like, I just see some of those rough edges sometimes that just, and I'm not sure if there's a way to solve that. That's that could just be me airing my grievances. But like, that's something that I miss. There's a little bit more flexibility that comes from the go model, with some of the things that we've, that we've run into here.But like I said, I, I feel that that's worth it, given the security and things that we've gotten in return.Jeremy: [00:44:49] so you mentioned one of the things with rust is that there's, there's so many different ways to do the same thing. And as you were learning the language, I wonder what was your approach to figuring out what is the ergonomic way to do things? And I guess just how to pick up rust in general.Taylor: [00:45:09] all right. It was a combination of Clippy. And I mean, everybody loves Clippy. Some people get really mad at Clippy, but like Clippy at least tells me like, Oh, Hey, like, you're right, but this, you could do this more efficiently or you could avoid an extra allocation or you could do all those, those kinds of things.learning to read the compiler messages. You get trained from like reading other compile. Like when other things fail, you look where the compile error failed, what line? And then you go there. Because normally it doesn't give you very much information, but with rust, you have to go through and read like, Oh, okay.It's telling me this went out of scope here or was passed, passed here, and you need to go do this, or you need to go do this, or you need to go do this. all those things can be, can be very useful when you're reading a rust comp, like rest compiler error message. So those are the other things.The other thing I had, and this is. I think one of the bottlenecks is that I had to go to an experienced rustacean who was working like a couple of experienced ones to go get the help. And there, that's what I love about the rust community. People are very willing to help. There's like the rust mentors page.I know was mentioned at rust conf recently that I hadn't heard about. but the big problem is that that's still kind of a bottleneck. Whereas with other languages I've been able to find like at least semi clear examples of. This is a good way to do it, or this is how, this is how these things are handled.I mean, an example of this is to string (to_string) versus to owned (to_owned) which technically they almost do the same thing when you're doing it with like, from, is there a proper name for ampersand string? Like just the, like the string slice, right. those things, the, to own versus to string.Really what it turned into is there used to be a difference, but then there was now it's just clear. Like I just need this as a string, then I use two string. But if I'm using it in a case where I, I have one, but I need a cop an owned copy then I used to own (to_owned) , even though they're I think they, at this point they call the same underlying logic.And so learning those kinds of things I had to learn from people I didn't there wasn't like something could say, Hey, like here's the history of this? Or. even in the documentation, which rust's documentation of the functions is really good. It doesn't say like, you should use this here or here there there's no specific suggestions. And so, if there's one thing that I hope we can improve is maybe how we can document like the intermediate level cause getting started there's stuff, but then otherwise you're kind of bottlenecks at specific, like asking other rustaceans, Hey, how do I do this thing?And that makes it a little bit harder for people.Jeremy: [00:47:40] I'm not sure what the, what the solution for that is. Whether that's, like you said some other intermediate guide or, but then again, it's kind of like, How do you, determine what to put in there and how do people get directed to that versus the relationship you're talking about, where you're, you're talking to experienced people and they just have all that context in their head and they can tell you, it's a hard problem to solve for sure.Taylor: [00:48:05] Well, yeah, and like I said, the compiler's fantastic. A lot of times you like learn from the compiler messages and that's how I finally learned that. Um when you give a static constraint on a trait that you're not saying it has to be static, it just has to fit into a static, which like, I was like mind blown moment right there.And I was like, Oh, that's cool. Okay. I get it now. But before I'm like, I don't want to make this static. That's going to like bloat the size of this. And I like, I can't have this allocated on a stack. Like, this is huge, but it's like, no, It's just saying this needs to fit in a static. And I'm like, okay. And I learned that I think from a compiler, I could be wrong, but I think that the compiler said this doesn't fit in something or whatever.And I'm like, wait a second. And I, and I dug into it and found it. So I think that a lot of the work that people discussed this at rustconf about making the compiler, just kind of guide you through things and anticipate it is quite impressive. And I'm very, very happy with that. So maybe that is that intermediate way that eventually get there.But I'm just saying that I think that the there's some tooling there, but when people get in that, that learning curve is just a little bit, I like to describe it as logarithmic, right? Like it's just, you have that initial punch over the top.And then once you understand that you can be quite capable of producing things quickly. Jeremy: [00:49:17] You were talking about some of the. Issues you ran into or roadblocks where you needed to get more intermediate help or talk to people more experienced. I wonder when you first started, how did you first start learning Rust?Did you go through the, the breast programming book? Did you start making little small projects? I'm curious how you approach that.Taylor: [00:49:42] Yeah, I did a combination of both. I tried to do some of the like rest by example, the rust book, just some of the basics. And then I tried implementing, I tried reimplementing some parts of, of, different projects that had in the past or things I just wanted to try.And then I went on with trying to like actually implement in a real project, which was Krustlet. And you can see if you actually look at the Krustlet code, you can see where like, Oh, they must have been new there and we've been going through and cleaning that up as we go along. But having that real project and something I could like go towards, that's how I learn best.It's just having like an actual goal of either reimplementing something or, Or, or finding like an actual project to go, like dig into. And that's how, that's how I work, but it's a little bit different for every personJeremy: [00:50:33] Was there a moment, I guess, where you felt like you were really struggling and then once you pass this point that, rust clicked for you or was it, did it feel pretty straightforward as you were going through the process?Taylor: [00:50:47] it was a little bit gradual. more than like a specific moment where I was like, Oh, everything clicked. I do have those moments. Like I mentioned before, like when I understood what, like a static constraint on a trait means, like that was like, Oh, like mind blown. I finally get it. but it was more of a once I started like actually doing like more complex traits or trait bounds.I think what I finally felt I was getting it was when I could do something like that. Where T equals this plus or T this plus this and, N is this plus this, like with the like long constraints at the end of a function and like understood what it was doing and why I did it that way. And I think also a combination of, of implementing some of the traits, like as ref, as mut ref those kinds of things that I could pull out or like convert, or have a wrapper type.And I'm like, okay, I'm finally getting how all this glues together. that's when I that's, when I noticed, I think, okay, like I can, I can do this now. Like I can put together some, some cool things. but yeah, it comes, each thing comes with its own accomplishments. Like I had mentioned before, we had the state machine that we've just finished and we're cleaning up right now.And that, that state machine thing was like, okay, like we finally got something that worked. I think it's recently where I felt like, okay, I feel like I can actually be a good contributor to the community and maybe even start mentoring others properly because I have the knowledge to do it because now, now we create something new that as far as we can tell, like, people haven't done something like this in rust before, outside of just toy things.And so like, we're, that's why I'm excited. I wish we had had like that today. So I could say, Oh, here's this blog post, but I'm really excited for that, because we're going to talk about like all the work we built on from people in the community who had posted about it and all these things in it. Okay. We managed to get something that works.It's still like has rough edges. It still has things, but we've got something that works well for us. And so I, I that's, that's been fairly recent. And so I think there's just moments where like, I keep understanding more, but for me it was more of a gradual turning of like, Oh, and then all of a sudden I realized like, Oh no, I think I've gotten to the point where I know it, it wasn't like a click. Like I know what it is. Oh, I just realized I actually know what I'm doing now.Jeremy: [00:52:56] Yeah. Yeah, no, that, that makes sense. You've been mentioning how Krustlet, has some rough edges.And I know on the project page, it mentions how it's highly experimental. What do you think are the, the big parts that are, that are currently missing for somebody who would want to go in and actually host their application using Krustlet?Taylor: [00:53:16] well, one of them is completely outside of working ends, which is networking, in Waze. we're, we're trying to work with the community to do that. And we're going to see if there's a way we can only solidify and jump in and work on that. but we're getting close. So, this is no one can hold us to this, but we're hoping to get towards a 1.0 Release towards the end of the year, beginning of next year.and so like around the holiday season, things will slow down, whatever. So that's why we don't know it could be January, February, and that's what we're hoping for. And really the big things that we have are, we have basic volume support. but we don't have cloud volumes support. So we're going to be looking in a way of how we can make sure every provider can, can use this.We're trying to solidify the API at the same time with that, because we have people who are writing other providers and a provider is just something that is an implementation of a runtime. And so we've written ours in for WASM, but we have another person who's a core maintainers, his name's Kevin.and he's been, recently made a core maintainer of the project as well. And he's working on one for containers. so he's just moving the container implementation stuff over to rust because of all the security benefits and things. And so we need volume support for all the providers, and then we're going to try, and then we're going to figure out a way we can abstract the networking, probably using the same interfaces that Kubernetes has already defined, so that we can have networking implementations more, more readily available and connected into these things from, the rest of the Kubernetes world.And then after that, we need to have like a real demos like we have demos, but we need like some, like, I want, like, here's a real application as far in so far as you can make it real and take that and put together, some bootstrapping things. So it's easier for people to set it up. We want to make it as one click as possible to set it up.And so. Those are kind of the things we're looking at and that people can, can look for rough edges. But if you want it for it, for example, if you want it to trigger like data pipeline processing, you can use just wascc or you can glue together wascc and a WASI provider, which is the wasmtime one. You can glue both of those together or have two of them running and you can trigger a data chain using an HTTP call and then process the data using one.You can do that right now. we had, an intern on our team over the summer and she did some work with, on raspberry pies using a raspberry PI Kubernetes cluster, and then using Krustlet to read soil sensors. You can do some fun things with it. It's just not all the way there. And that's partially because of where things like this is, this is bleeding edge, and that's why we put like the big warning sign on the reading.Like, like, please don't run this in production. Like you're you're this is so new. Not just the project itself, but also the technology around it. And so that's what those are kind of like the steps we have. So, at this point I would say, I maybe would remove the highly from Krustlet's description Hm I, if I was really wanting to, because now it's just, this is an experimental project.That's kind of the goal for the future here, but we're not that far out from having something more it's a 1.0 where people can actually people can start using it in a real way and maybe not perfectly, but in a real way.Jeremy: [00:56:27] Last year, Solomon, the CTO of Docker, he had tweeted that if WASI and WASM had existed in 2008, then they wouldn't have needed to create Docker. And I'm wondering from your perspective, thinking about the future of, WASM and Krustlet, do you think that.Running applications in WASM could become the default for, for server side applications in the future.Taylor: [00:56:59] It's a possibility. I wouldn't peg it entirely for sure. I, I think it will be a mix. People are just like getting on board with Kubernetes stuff and containers, which sometimes like, like I said, I think some people go way too. Far into it and don't think they just like, Oh, they hear Kubernetes and buzzword and want to do it.But people are still just barely getting to that thing. So it'll be a long time if it does become the default. But I think it has the ability to, reach a very specific audience right now and have that grow. a lot of these constrained environments, like edge computing, you can't run Docker on there it's too much, too much overhead.They just can't handle it. But you could run, WASM modules. it also allows you to pack things in more tightly because if everything's a small process, that's just this tiny little thing and tiny little binaries. You can run that with. we can run a lot more than you could with containers right now.So I wouldn't say that it's going to, it's not a container killer, nor is that our current goal. Like we didn't, we didn't think that, like, we don't want to disparage that other technology. That's, that's something we still use a lot and effectively. but I, I do think that there is going to be some takeover of that space, at least in a small measure with all this stuff from wasm and WASI.Because of its just portability and the ideal of we'd be closer to a write, write once compile, once and run anywhere kind of situation, people say it's a pipe dream. I think we'll never get completely there. Even with WASM, there's still constraints. There's still things that will be in place, but this makes it easier.And having, if, if every language gets to the point where you can do it with rust, where you just say, here's my Wasi target and build it. That to me sounds like a very powerful way of doing it and not just for server side. I think that can reinvent a lot of things with normal applications as well. Just because of how portable they are and how then, applications could be tied to you instead of just like being tied to a computer or whatever it might be.So there's some really interesting ideas here. It's just. There's those, those ones are a little bit further out, but I do think even in the short term, we'll start seeing some good applications where WASI will be WASI, compatible WASM, binaries will be a better choice than using a container.Jeremy: [00:59:17] and it sounds like maybe in the short term you were talking about edge computing and that might be something like where you have a CDN, running application code. At their, their edge nodes, something like Cloudflare's workers or Fastly has an equivalent. are you thinking that might be where, where these things start?Taylor: [00:59:39] I think that's where it's already started in one sense. And that's one of the reasons we chose Kubernetes is we think there's more beyond Kubernetes and server side on my team. That's that's our belief that we believe there's more to that, but everybody is getting into trying to do Kubernetes and have this Kubernetes has become an API layer that people understand that a lot of people use and enabling WASM through that API gives people a reason.We want people to start using this and say, Hey, like, why doesn't my insert language of choice? Have the support for WASM. I like WASI binaries. Can we please get that? And then as people do that, we start getting more motion around it.Jeremy: [01:00:19] I know when. I talk to people about WebAssembly a, sometimes what I'll hear is they'll say, well, I'm happy writing JavaScript in the browser. Why do I care about WebAssembly right. And so, like you say, if there are more use cases for running other than, just in the browser, then that might inspire other languages like Python or Ruby, or who knows what other languages too, to focus on getting them to work on WebAssembly. So I think that's, that's pretty exciting. so I think that's a good, good place to start wrapping up. if people want to learn more about Krustlet or, about what you're working on, where should they head?Taylor: [01:01:00] I would definitely start with, the actual project site. So that's deislabs/krustlet on, github. there is also some, some posts that we have in various places. I wish there was like one amalgamation of all of this, but, you can look at the, it's deislabs.io/posts. I believe.Let me just double check that.Jeremy: [01:01:23] and we can probably get the krustlets specific, posts and then put those in the show notes as well.Taylor: [01:01:28] Yeah, and I can send those, but yeah, it's deislabs.io/posts. that's posts for all of our projects, but you'll see some, at least three blog posts there about Krustlet. one was around our, our stack and heap allocation problems that we had and some lessons learned there. so those are, those are some other places you can go.We have some other posts that hopefully we can send in the show notes that kind of give an overview of the different things that the reasoning behind this, if you're more interested at this from a high level, like a business perspective, we have a post for that we have some other things about the security things we got from it that we've posted around.So, there's, there's lots of sources of information there, but if you really want to get started and look at the project and install it and try it out, go ahead and check out. deislabs/krustlet on get hub. And that one will have the docs and everything you need to get started.Jeremy: [01:02:19] Very cool. Taylor, thank you so much for talking to me today. It's been interesting learning about Krustlet and WASM. Kubernetes and all of that. And I think it's going to be very interesting to see where it goes in the future.Taylor: [01:02:33] Well, thank you very much for having me. And hopefully everyone finds at least some of this interesting and useful.
undefined
Sep 9, 2020 • 1h 9min

Building the Lucky Web Framework in Crystal with Paul Smith

Paul Smith is a Software Engineer at GitHub and the creator of the Lucky web framework. He previously worked at heroku and thoughtbot and has experience building applications using Rails and Phoenix. He's also the creator of the Bamboo e-mail package and the co-creator of the ExMachina test data package for Elixir.We discuss:The tradeoffs of object oriented and functional programmingHow a lack of compile time guarantees slow down ruby and elixir developmentCreating conversational error messagesWays fast languages can change how you write applicationsWriting templates with Crystal instead of HTMLChoosing what to include in a web frameworkThe Crystal community and ecosystemRelated Links:@paulcsmithThe Crystal programming languageThe Lucky web frameworkHTML2LuckyAlpineJSThis episode originally aired on Software Engineering Radio.Transcript:You can help edit this transcript on GitHub.Jeremy: Today I'm talking with Paul Smith.Paul is the creator of the lucky web framework and he currently works at GitHub. Today, we're going to talk about the crystal programming language and the lucky web framework. Paul, welcome to software engineering radio. Paul: Thank you so much. Happy to be here.Jeremy: There are a lot of languages for software developers to choose from. What excited you about crystal? Paul: Yeah, that's really interesting because when I first saw Crystal, I actually was not interested at all. it basically looked like Ruby to me. And so I just think, okay, so it's a faster Ruby. And typically if I want to learn a new language and want something that feels really different, that pushes the boundaries on things.I started getting more interested in compile time guarantees. I worked at thoughtbot previous to github and previous to Heroku and people were starting to get really into typed languages. Um, some people were starting to get into Haskell, which is like, you know, the, the big one that, I guess is probably one of the more type safe, but also hard to use languages.Um, but also Elm, which has a good focus on developer happiness and productivity and explaining what's going on. And as they were talking about, how they were writing fewer tests and it was easier to refactor, uh, it started becoming clear to me that that's something I want. Um, one of the things somebody said was, if the computer can check the code for you let the computer do that rather than you, or rather than a test. so I started to get really interested in that. I was also interested in elixir, um, which is another fantastic language. I did a lot of work with elixir. I built a library called bamboo, which is an email library. And another called ex machina, which is what a lot of people use for creating test data. Um, so I was really into it for awhile.And at first I'm like, wow, I love functional. And then I realized like. I can do a lot of, like a lot of the stuff I like about this I can do with objects. I just need to rethink things so that it uses objects rather than whatever random DSLJeremy: Cause I mean, when you think about functions, right? Like you've got this big bucket of functions and you got to pass in all the parameters right? Whereas, you know, in a lot of cases, I feel like if you have those instance variables available in the object, then the actual functions can be a lot simpler in some ways.Yeah.Paul: Totally. That's like a huge focus and making the object small so that it. It doesn't have too much, but that's how I began to feel with elixir is that I'm like, I just have 50 args and most of them I don't care about. Like I want to look at what's important to this method, to this method.It's, you know, this argument, but with functions you're like, which things important. Is the first thing? Probably not. That's probably just the thing I'm passing everywhere. And so I liked that ability to kind of focus in and know like, this object has these two instance variables everywhere.Jeremy: Yeah. It's kind of interesting to get your perspective because, it seemed like you were pretty deep into elixir if you had created, bamboo and ex machina and stuff like that, so it's kind ofPaul: Yeah. I was like way gung ho and, and then I started missing objects. And luckily with crystal and ruby, you still get a lot of the functional stuff. Like you can pass blocks around. Um, that's functions. You can use functions. But it's not the other way in Elixir, you can't use objects. It just doesn't exist.And then the type safety. I'm just like, I still run into so many errors and it was so frustrating. I don't want to do that.The main benefit I got out of elixir compared to rails, um, which is what I had been using and still use a lot of, was speed. That was really big. Um, in terms of bugs caught about the same, mostly because it's still for the most part dynamically typed language with very few compile time guarantees. Um, so I'd still get the nil errors. I'd still mess up calls to different functions and things like that. And so that's where I ran into crystal. It has the nice syntax. I like from elixir and Ruby. It's also very, very fast. Faster than go in some benchmarks.So it's quick. Plenty fast for what I need. And it has those compile time guarantees, like checking for nils. That's a huge one. and it also makes the type system very friendly. So it does a lot of type inference. And very powerful macros so you can reduce some of the boiler plate.And so that's when I kind of started getting into crystal was seeing Elixir I still got a lot of these bugs that I was running into with rails, but I liked the speed but I don't want to use Haskell and Elm doesn't exist on the backend. so I started looking at crystal.Jeremy: And so it sort of sounds like there's this spectrum, right? You have Ruby and you have, elixir, where you don't necessarily specify your types so the compiler can't help you as much. And then you've got Haskell, which is very strict, right? You have a compiler that helps you a lot. Um, and then there's kind of languages inbetween Like. For example, Java and C and things like that. They've been around for quite some time. how does crystal sort of compare to languages like those ? Paul: Yeah, that's a great question cause I did look at some of those other ones. TypeScript for examples is huge. Kotlin was another one that I had looked at because it's Java but better basically. That's the way it's pitched. And so far everyone that's used it has basically said that. And also looking at rust, what it came down to was how powerful was the type system. So crystal has union types, which can be extremely helpful, um, and it catches nil. Java does not have a good way to do that. Um, Kotlin does. But also boiler plate and the macro system crystal's is extremely powerful. Elixir also has a very powerful macro system.But crystal's is type safe, which is even more fantastic. So basically what that let me do with lucky, it was build even more powerful type safe programs. And we can kind of get into that once we, we talk about lucky and how that was designed. Um, but basically with these other languages, a lot of what we do in lucky just simply wouldn't be possible or wouldn't be possible without a significant amount of work and duplication.Jeremy: You covered a few things there. One of the things was, macros, what are are macros? Paul: Yeah. This is like a confusing thing. It took me a while to, to get, um, what it is. But, uh, in Ruby, for example, they have ways of, of metaprogramming. That are not done at compile time for most compile time languages, compiled languages, I should say. You need macros to de-duplicate thing, and basically what a macro does is it generates code for you.The way I think about it is basically you've got a method or a macro, but it looks like a method. It has code inside of it. And it's like you're copy pasting, whatever's inside of that macro into wherever you called it from. So in other words, rails has a, has many, like has many users, has many tasks that's generating a ton of code for you.So that's how Ruby does it. Um, and crystal has many would be a macro and it would literally generate a ton of code. And copy paste that into wherever you called it. Um, so it's just a way to reduce boilerplate.Jeremy: So in the case of dynamic languages, like Ruby, when you talk about Metaprogramming, that's having I guess, a function that is generating code at runtime, right? And the macro is sort of doing something similar except it's generating that code at compile time. Is that kind of the distinction? Paul: That's the way I look at it. there are people much smarter than me that probably have a more specific answer about what the differences are, but in my mind and in practical usage, that's what it comes down to in my mind.Jeremy: Let's say there's a problem in that code, what do you get shown in the debugger? Paul: Debugging macros is definitely harder than debugging your regular code for that exact reason. it is generating a code. So what crystal does, uh, there's different ways of doing this, but I like Crystal's approach. It'll show you the final result of the code and it'll point to the line in the generated code that caused the issue and tell you which macro generated it. Now, it's still not ideal because that code isn't code you wrote, it's code that the macro generated, but it does allow you to see what the macro generated and why it might be an issue.Part of that can be solved by writing error messages and error handling as part of the macro. So, in other words, making sure, if you're expecting a string literal, you can have a check at the top that checks for it to be a string literal. I wouldn't use them by default, but it's great for, I think a framework where you have a lot of boiler platey things that you're literally typing in every single model or every single controller, and that people kind of get used to. It's well tested. It has nice error messages. In my own personal code though, I pretty much never used macros. They're only in the libraries that I write.Jeremy: Another thing you mentioned is how crystal helps you detect Nils or nulls. Um, how does, how does the language do that?Paul: It actually uses union types for that, some languages that have this, they'll have an optional type, which is basically a wrapper around whatever real type, like an optional string, optional int, and you have to unwrap it. The way crystal does it is you would say string or nil, and there's a little bit of syntactic sugar.So you can just say string with a question mark at the end. But that gets expanded to string or a nil type. Um, so then within that method, the compiler knows that this could be a string, could be a nil, and there's a little bit of sugar there where the compiler, if you say, if whatever variable you have, it's going to know that within that, if it is not nil and in the else it is.So there's a little bit of sugar there as well. Um, but that's basically how they handle it. And there are ways to force the compiler, uh, just say, Hey, this thing is not nil you can call not nil on it. That's a little, I would avoid that because maybe the compiler's right. And it really is nil. Or maybe you change the method later and then it can become nil and you're going to get a runtime error there.But it does have those escape hatches. Cause sometimes you just need the quick and dirty and you can, if you need to.Jeremy: As long as you don't tell the compiler that, then you will actually have a compiler error. If you have a method that takes in, let's say some type of object or a, a, nil. And then you don't account for the fact that like it could be nil. Then the compiler actually won't let you compile, is that correct?Paul: That is correct. So for example, if you just had a method that's like, print. email and it accepts a user or nil, now, I'm not saying I would do that, but let's say that it does. And you just tried within that method to do user.email to print the user's email. Um, it's going to fail and tell you that nil does not have the method, email.And so you need to handle that. And then, yeah, you're forced to either do an if, or for example, you can use try, which is basically a method that says call call a method on this object. Unless it's nil, if it's nil, just return nil. But yes, it kind of forces you to do that.Jeremy: And in crystal, how do you handle errors? Because a lot of different languages, they'll have things like exceptions or they may have result types. What's sort of the the main way in crystal?Paul: I'd say I'd group it into two types of errors where. You have runtime exceptions still because things do break. Not everything is in a perfect world. Inside your type system, databases go down, you know, redis falls over or whatever. So you still have runtime exceptions and then you have the compile time errors, which we kind of just talked about.But in terms of how those runtime exceptions are handled it's I don't want to say exactly the same as Ruby, cause there probably are some subtle differences, but extremely similar to Ruby and that you're not passing around errors. It's so, it's not like go where you are explicitly handling errors at every step.Um, you raise it and you can rescue that error kind of like a try catch in other languages and you can also just let it bubble up and rescue at a higher level, which I personally prefer. Because not every air is something that I care about and kind of forcing me to handle every single error everywhere means that it is harder as a reader of the code to tell which errors I should care about because they're all treated as equal.So I like that in crystal, I can say this particular error, this particular method I want to handle in a special way. And somewhere up above the stack. I can just say anything else. Just print a 500 log it, send it to Sentry.Jeremy: Yeah, so it's very similar to, like you said, Ruby, or any other language that primarily relies on exceptions. Like I think Java for example, probably falls into the same category.Paul: probably. I haven't used it in quite some time, but I imagine it would be similar.Jeremy: You had mentioned that that crystal is like pretty, pretty fast compared to other languages. what are the big. benefits you've gotten from that raw speed? Paul: The biggest benefit I would say is not having to worry so much about rendering times, and rails for example. You can spend a ton of time in the view, even though everyone says databases are slow, they're not that slow in something like rails active record takes a huge amount of time to instantiate every single record.So how does this play out in real life? You could, for example, in lucky if you wanted to load a thousand records and print them on the page and probably do that in. a couple hundred milliseconds maybe, which is a totally reasonable response time. Same thing in rails would be many seconds, which is not reasonable in my opinion.And this can be really helpful, partly because it just means your apps are faster, people are getting the response as quickly. But also because you have a lot more flexibility. I've built internal tools where they want to have the ability to search all of the inventory or products or whatever else and they want to have like a select all or be able to select everything.And in rails, you can't just render all 1000 products cause it basically falls over and you can try and cache stuff. But then that gets complicated. Um, so you kind of have to paginate. But when you paginate that makes it hard to select things across multiple pages, it's then you need some kind of JavaScript to remember which ones you selected across pages, and it just balloons the complexity, right?If you know, Hey, we only have eight or 900 products, we're not going to suddenly have 20,000 in lucky. You just render them all, put them all on the same page, give them all check boxes, and it's. In the user's hands in 200 milliseconds and you're done. You just removed most of that complexity. So those are some of the ways that that speed is playing out. And I think one key difference there is some people think speed is just about scalability. How many people can be using this? The speed improvements I care about are the ones where even if you have one request per day, I want that request to be insanely fast. and so that's kind of what you're getting with lucky and crystal.Jeremy: When you talk about web applications, you know, with lucky being a web. Framework. A lot of people point out that a lot of the work being done is IO, right? It's talking to the database, it's making network calls. But I guess you're saying that rendering that template, those are things that actually having a fast language, it really does make a big difference.Paul: It does. Yeah. I, I think the whole database IO thing, a lot of times that's what people say when they're working with a slow language. If you have a fast one. It's not as big of a deal. Cause this was the same with Phoenix and Elixir. Like I loved, how quickly it could render HTML. That was huge.Jeremy: And like you said, that opens up options in terms of, not having to rely on caching or pagination or things like that.Paul: Yeah. This is huge. I mean, an example from work. We just announced github discussions. Um, and I'm on that team. And one of the big things we were trying to get working was, was performance of the discussions show page. You could have hundreds of comments on that page. And we were finding that most of the time taken was actually spent rendering the views and calling methods on the different objects to render things differently in the seconds. And we can't cache those reliably because there are so many different ways to show that data. If you're a moderator, you get certain buttons. If you're an unverified user, like someone who just signed up, you see a different thing. If you're not signed in and you see a different thing, and so you can't reliably cache those, and we had a lot of cool techniques to kind of get that down, but this is something that if this were written in lucky, it just would not have been an issue.Jeremy: And github in particular is written in Ruby, is that correct?Paul: It is. Yeah. It's using Ruby on rails, and I'm not trying to knock rails. I, I really love rails. I mean, I've been using it for 12 years. Um, I like Ruby. Uh, but Hey, if there's something that could be even better, I'm open to that.Jeremy: For sure. You have used Rails for 12 years. how would you say that your productivity compares in Ruby versus in crystal?Paul: I think that's tricky. It's kind of better and worse. And what I mean by that is. I think crystal, I am. I'm more productive. And crystal, you do have compile times and we can talk about that. They're not the fastest, they're not the slowest, but I do find that I can write more code and then compile once, and it kind of just tells me where the problems are and I have a lot more confidence and I spend a lot less time banging my head on like, why isn't this thing working?And it's because I passed the wrong type somewhere. however, Ruby has a massive ecosystem, so there are things that exist in Ruby that I would have to rewrite and crystal. and so that for sure, no matter how productive I am in crystal, is not as productive as requiring the gem and then just using it.So the hope with lucky though, is that we're building up enough things that. You don't have to be rewriting everything. And the community is also really stepped up and writing a number of, libraries that are super helpful for web development. Um, for example, somebody just wrote web drivers.cr, which makes it so that it can automatically install the version of Chrome driver that matches the version of Chrome that you have installed.So you don't have to manage that at all. That's something that was in Ruby for awhile, and will be in lucky, probably in the next release. So yeah, I think it's better. It's one of those things that will get better with time.Jeremy: So in terms of the actual language, productivity, crystal, it sounds like basically a net positive, but it's more in the the community aspect and how many libraries are available. that's where a lot more time, but it's taken.Paul: I think so. And then just the initial ramping up, uh, it is a new language and so there aren't as many stack overflow questions and answers and there aren't as many tutorials. So there's definitely some things there. But like I said, those are things we're working on, especially for one out of lucky. Try and make sure we have really good guides, uh, really good error messages.We tried to borrow a little bit from Elm. Not specific error messages, but just the idea that an error message should raise something human readable and understandable, and if possible, help guide them in the right direction of what they probably want to do, or at least point them to documentation to make it easier.So we're trying to help with that as much as, as we can.Jeremy: I kind of want to move into next more into your experience. building lucky. you know, you were a rails developer for many years, and are there any like specific major pain points, I guess, in rails or in your previous web development experience that you wanted to address with lucky?Paul: Yeah. There were, um, some more specific than others. Um, some easier to solve. In the sense that the solution is like it works or it doesn't. And others that are a little bit more abstract. So I'll talk about some of the specific things. I often said that I'm into type safety. I don't think that is quite true, and I think it. Especially if you haven't used lucky, it just doesn't click what that means or why it matters. Cause you just think like, Oh, so you know, don't tell me if I pass an integer instead of a string. Like who cares? I'm not seeing those kinds of errors.What I'm most interested in is compile time guarantees, whether that's with a type or some other mechanism. and that's there, not just to prevent bugs, but to help you as a developer to spot problems right away. And give you a nice error so you know what to do about it. So, for example, one of the things that I've seen in basically every framework I've ever used, regardless of whether it is type safe or not, is that you need to use an HTTP method, a verb and a path.So, for example, if you want to delete a user, you would have forward slash users forward slash one to be the ID. The tricky part is you have to have the HTTP method delete for it to do the delete action. But sometimes you forget that you use a regular link and you wonder why the heck it just keeps showing you this thing instead of deleting it or the particularly insidious one is when you have a update and a create. One uses post one uses put, if you have an update form and you forget to put the method put, you get all kinds of routing errors cause it says, Hey, this doesn't exist. And you went, well why? Why doesn't this exist? I can see it right here. I've got the route, I've got everything.Oh it's cause I forgot to put the HTTP method is a PUT. And it just waste time. So that's one of those things where we wanted to compile time guarantee and lucky. And so I don't want to go too in depth here, but basically what we did was we made every controller into a single class that handled the routing and also the response.Jeremy: If I understand correctly, when you have a page. And you want to link to a specific user, on that page. Then you would use this function link to, and you would pass in the class that corresponds to showing a user, and then you would pass parameters into that function. Like, for example, the id of the user.And if you didn't do that. Then you would have an error at compile time, notPaul: correct.Jeremy: you. You wouldn't need to like start the website and then go to the page and have it, basically explode, which I guess is typically what you would expect from most web frameworks.Paul: Or what's worse, it wouldn't explode. It would just generate the wrong link and you would have to remember to click that link or write an automated test that clicks that link. And so it's really easy for bugs to sneak in, and this just completely prevents that class of bug. As well as just makes life easier because if you forget a parameter while you're developing from the start, instead of just generating something with like a nil ID, it's going to say, Hey, you forgot this.It just saves a lot of debugging time, and I think it's also more intuitive if you've ever used rails helpers or Phoenix, help any of these man the conventions. Like it's a singular, isn't plural, is it? Does it have the namespaces and not have the namespace in lucky that it's gone. You just call the action, the one that you created, you call that exactly as is.Jeremy: It sounds like this is maybe a little more explicit, I guess? Paul: Yeah, it's a little more explicit, but I hesitate. I've heard a couple of things in the programming community. Um, one, the rails started as convention over configuration, which that was huge because you had to learn the convention, but at least once you did, you knew how about other rails projects were. And then another one I hear is explicit over implicit.I don't buy into either of those in particular. Um, because sometimes implicit is better, sometimes explicits better. I mean, for example, it was a quick example. I don't hear anyone arguing to bring back the old objective C where you had to manually reference and dereference memory that is technically more explicit.But does anyone want to do that? No. So I don't think explicit over implicit, you have to think about it. Everything needs to be judged, in its own context. And what I think is even better than convention over configuration is intuitive over inventions. Meaning you don't even think about it.You don't even need, there doesn't need to be a convention. Because you're literally just calling the thing that you created like anything else, there's nothing special about that. It's a class just like any other class and you call a method on it, just like any other method.I think it's tricky because I think it's also easy to say explicit over implicit and make your code super hard to follow. And it's like, yes, it's more explicit, but also I just wrote 20 lines of code instead of one. And those 20 lines could differ because I do it differently than the other guy or girl.Jeremy: Another thing about lucky that's a little different is that for templating, instead of having somebody write HTML and embedding language code in it, uh, you instead have people write crystal code.So could you kind of explain sort of why you made that decision and what the benefits are.Paul: Yeah, sure. So a lot of things actually with, lucky. Kind of I did not want to do, or were definitely not how I started doing things. And it just kind of moved in that direction. based on the goals. And I think that's part of what makes lucky, different is that we don't say, here's how I want to do it.We say, here's what I want to do and I want it to be easy, simple, and bug free. So. What we started with was using templating languages, just like you'd use in almost any, anything where you write your HTML and then you interpolate values in it. At the time I wrote lucky, and this may be changed now. you could not use a method that accepted a function or a block is what it would be called and crystal, and have that output correctly in the template. I think it just blew up. I don't remember, this was two years ago, three years ago. The other problem I was having was, it's not just a template. Any bigger size framework also has partials or you know, fragments or includes or whatever you want to call it. It also has layouts where you can inject different HTML in different parts of your HTML layout, and those are all things that a person has to learn when they're learning your framework. What are these methods called for.Generating a partial for calling a partial or injecting stuff in different layers of the layout. And it's also more stuff that I have to write. And with lucky, like there was already a lot to write. They were building the ORM and the automated test drivers and the router and like everything. So I can't afford to just do stuff like everyone else does it if it's not pulling its weight.So eventually. I started experimenting with building HTML, using classes and regular Crystal methods. Some of the requirements, um, for me when I was building it was it had to match the structure of HTML and it had to be very easy to refactor. Meaning I can pull something out into a new method and it just works.So easy refactoring. And then I also need to be able to do layouts with it. The reason for that is Elm also uses, code to generate HTML. However, it is not approachable to a newcomer. if for example, you have a designer and they pull up in an and try and look at what that, what that generates.No way. I mean. I'm a programmer, I still don't know what it generates without really looking through Elm. And that's partly because you are generating data objects. So arrays of arrays. Or maps or whatever else. so I didn't want that. It has to be approachable to people and look and be structured like HTML.And so we were actually able to do that. I don't know if I need to go into huge detail, but basically you can say, Hey, I want to div. Inside of that, I want an H1 underneath that. I want another div. And you're not building arrays and maps and anything else. What that provides is actually a lot of things that I did not think of.One super easy refactoring. If you have a link in a particular page and you don't want to copy that over and over and over, extract a method and you call it like any other method, there's nothing to learn. It's just a method. Like anything else, it can accept arguments just like anything else. Your conditionals work.Um, you can extract that into a component, which is basically another class and it tells you explicitly here's what I need to run. And it renders the thing. Um, you always have the correct closing tag. I have been bitten so many times by shifting stuff around. And forgetting a closing tag and my whole page looks wonky and I have to go through layers of indentation.That just doesn't happen if you forget an end so you would have a do end when you're creating these blocks, it blows up. It's like, Hey, you're missing one. And the coolest part is you just add an end in there and you've run the crystal formatter and it re indents everything perfectly. And then on top of that, it's, if that wasn't enough.Like I just loved how easy it was to refactor and use. you don't have to split up your code from your template. Like in rails, you would have a helper. So you've got like, here's your template, but then you might have a helper, a totally separate file. If you've got something that pertains to just that page, you can just extract a method.It's right there. But this also made it so we can do layout without any special work. Your layout is basically a class. You would say, here's my class with the head. It renders the head renders HTML body or whatever. And then it calls a content method or a sidebar method or whatever else, and your page.So if you wanted to render a list of users inherits from that class and implement a content method or a sidebar method. And so when that's rendered out, it just calls those methods. So we got all of that for free. If you look at our view rendering code, it's 50 lines. because basically we use a macro and give it a list of tags, like, you know, paragraph H1 H2 whatever, and generate a bunch of methods.And that's basically it. So from an implementation perspective, it's extremely simple. Plus, you get all these niceties around refactoring is super easy. It's super easy to tell what a page needs to render at the top of the page. You just say, you know, I need a user. I need a paginator. I need a current user.So you know what that page needs. You don't get that with a template. and you get all the power of crystal for rendering layouts however you want. that all basically came for free. So it was kind of a happenstance that templates weren't working and this has worked out better people, a lot of people when they see this, they're like, what the heck is this?I hate it. And I always just say, just give it a try. Just give it a try for a little bit. So far. One person has said like, okay, I don't like it, and you can use templates if you want. We've actually built that in, but everybody else is like, now that I've used it, I love it.Jeremy: What it sounds like is in a lot of, JavaScript frameworks, for example, like react, there's this concept of components, right? And so you can create, what looks like new HTML tags, but really has. some other HTML in it like let's say you have a a list of, uh, businesses and maybe you have a component that would have, all the business details in it. it sounds like in the case of lucky, you kind of can do the same thing. It's just that your component would be in the form of a crystal class. And so there isn't any new syntax. and you're not mixing, different languages. Like you're not mixing HTML and JavaScript. Instead, everything is just using crystal.Paul: exactly. you have two options. You can extract a private method cause sometimes it's just a small thing you want to extract only used by one page. Just do a method. If not. Uh, extract a class. And the cool part about all of this is that you don't need to restructure anything. Meaning you can start with everything in one method, in your content method, and then you can pull out just a little bit into a private method.And then if that's not enough cool, pull that out into a class so you're not forced into just pulling out classes all over the place if you don't need one.It really worked out kind of really well because it also makes testing easier. You can pull out a class component that just does one thing and you can instantiate just that component and test just that HTML. And once again, this is very easy because it's a class you call it and run it like any other class.And so that's been a big goal of Lucky is try to reduce, and this also comes down to the whole like convention over configuration is how do we just make it so there is no convention. It's just intuitive. Like if you know how to extract and refactor a crystal class, you know how to extract and refactor stuff for a page in lucky automatically. Um, and I mean, of course there's still some degree of learning and experimentation, but it's the same paradigms. if you want to include methods in multiple pages, use a module just like any other module. So that was very much a goal. And that's part of, uh, other parts of lucky, for example, querying in something like rails. The model is for creating, updating, reading, everything. In lucky you do create a model and we use macros to actually generate other classes for you, but you have a query object that is a class. Jeremy: What am I passing into my query object what does that look like? Paul: Let's say you have a user by default, it generates a User::Base query. So basically you have this new object namespace under the model. And by default, the generators generate an another file.And basically what that does is it creates a new class called user query. That inherits from that user based query class. What you would do in your controller action or anywhere, uh, say user query dot new by default. That just gives you a new query that would query everything in the database. Unless of course you overrode initialize and did something else. Then it would use that scope. so if you want him to further filter down, you would call, for example, if you wanted the name to be Paul, it would be user query dot.new.name parens Paul as a string. Because lucky generates methods for every column on the model with compile time guarantees. So if you typo that method, it's going to blow up. If you've renamed the column later, it's going to blow up. if you accidentally give it nil, it's going to blow up and tell you to use something else, but that's how you would do it.You say dot. Name is Paul. Or, uh, we also have type specific criteria and methods. You can do things like dot age. Dot. G T for greater than 30. And so you have this very flexible query language that's all completely type safe. So in your scopes, if you wanted to do something like recently published for a post, inside that method, you would do something like published at dot gt at.gt for greater than one dot week dot ago.And you can chain that. So you could do post query.new dot. Recently published dot, authored by Paul or whatever. So that's basically how it works. Um, you just have these methods that are chained, that you can build upon in pretty much any way you want.Jeremy: In a lot of applications now, people use JavaScript frameworks, whether its react or Vue or angular, what does integrating with JavaScript libraries and frameworks look like in lucky?Paul: I think easier than a lot in the sense that you can generate a lucky project with. different modes. So when you initialize a project, you can use just the command line with some flags, or the default is to walk you through a wizard, which will say, do you want API only? In which case, you know, it won't even have HTML pages or the default, which is a full app.What that does is it generates Webpack config for you. Um, it sets up your public assets and images so that they can be copied and fingerprinted. and so out of the box that already has a basic web pack set up for you that handles CSS. Um, it handles most of your ES6, JavaScript type stuff that people typically like.That's just handled out of the box. if you want to include react or vue. You would include that just like any other Webpack project in terms of building it. Um, and it's actually a little simpler. We use Laravel mix on top of Webpack, which is basically a thin JavaScript layer that calls Webpack underneath the hood.If you want a full single page app. That's also totally supported. Um, you would basically have just one HTML page that, you know, has the basic HTML and body tags and within that Mount to your app. So whatever that is for your language in vue, it might be, just a tag that's like main app.And then in your JS you would initialize, um, that tag with your app. And we have fall back routing so that you can do client side routing if you want. It's not particularly well-documented, which is the biggest problem. Um, some people are helping with that cause a number of people have done react and view.And so, um, hopefully those will be fleshed out a little bit more, but it's totally supported. in the longterm though, we've got plans to make it, so you don't even need those types of frameworks quite as much. since we already have class components and a bunch of other things, uh, I'm working on a way to add type safe interactivity to HTML.So you're not writing the Javascript, you're writing crystal for the most part, and it can interface with Javascript and you can run, you know, use react and vue inside of it. But a lot of your simple open close, if anything like that is going to be handled client side, but written with crystal and server interactions will also, those will be sent over an Ajax request, but will also be typed safe when you call the actions and do all the HTML rendering similar to live wire for Laravel or live view by Phoenix. But with some. Some differences that's not done yet, but it will be, and I think it's going to be really exciting. I've got a proof of concept up locally and uh, it's really awesome.Jeremy: We had a previous episode on live view and I think the possibilities of things like that are really interesting of, of being able to not have to have this sort of separation between your JavaScript front end, and your server backend yet still be able to have, the kind of interactivity people expect.Paul: Yeah, I think it could be cool. and that's also where speed comes into play. When you're doing interactions like that, you don't want to wait even a hundred, even 50 milliseconds. Is noticeable for those types of interactions. And so Phoenix also fast, really fast, template language. Uh, basically it gets compiled down to elixir, and so that helps a lot.Um, I do think there's some big flaws that I've seen in some other implementation. Well, I don't want to say flaws, that sounds a little overly harsh, but things that I personally, are just deal breakers for me. And one of those is some clientside interactions have to be instantaneous. I just have to be, if I click on my avatar on the top right, I expect the menu that has settings and log out to be instant.If there's any kind of latency in the network and it takes 200 milliseconds, even. That's going to be a weird interaction and it's going to feel like your app is broken. And of course that's exacerbated by people, not in your country. This is another problem. People are doing these things, deploying servers in their own country.Put a VPN in front of your computer in Australia or even the UK, 400 milliseconds. That's just, you can't do that for a settings menu or for opening a modal. And so there needs to be some way to do those interactions instantaneously. Live wire by Laravel, the same guy that wrote it, built our Alpine JS.Which is kind of, it looks a little bit like vue, but it doesn't have a virtual, DOM it operates with the Dom that you generate. That's what it uses for client side interactivity. So you can do the server side stuff, which I mean, if latency's there, you're, if you're submitting a comment, look, there's no way around it.You've got to hit the server. But if you're opening and showing something at a menu, a tab, a modal. That's instantaneous and is handled by Alpine. So lucky actually going to use that along with our own server rendered stuff to do client side interactions instantaneously.Jeremy: So Alpine, it's a JavaScript front end framework, you said, similar to vue. without the virtual Dom, and it sounds like what you're planning is to be able to write crystal code and have that generate Alpine code. Is that right? Paul: That's correct. Cause it's mostly in line and it can't do everything. But most of what I want from client side interactions are typically super simple things. I want to open and close something. I want to show tabs. And those are things that Alpine's incredibly good at because you don't need a separate JavaScript file.We can just generate something that says, it uses X as it's kind of modifier X dash click toggle the thing. True or false, toggle open to true or false and X if or X show and then if it's open or not. Those are things that we can very easily generate on the backend and make type safe because we can say, you know, this has to be a boolean and here's the action.And all those things are then type safe, but you can still do JavaScript if you want, so you can still use JavaScript functions in there with your Alpine if you need to.Jeremy: Yeah. That just sounds like the distinction between that and like a live view or a live wire is that my understanding is with those solutions you're shipping over basically diffs in your HTML, and that's how it's determining what to change. Whereas you're saying like, you may still have some of that but there's certain interactions where you just want to run JavaScript locally on the person's client, and you should still be able to do that even if you are doing this sort of sending diffs over the wire, for other things.Paul: Yeah. Exactly. Alpine's made for that. The biggest key differentiator between Livewire live view is the type safety, all those nice things that you get in lucky you're going to get also for your client side interactions. So if you have an action and you have a typo or something, it's going to blow up.It's going to tell you if you forget something, if you've missed the wrong type. I mean, and this is something that's very hard in the front end world because you either have to run an automated test to make sure you catch these or the worst. You have to open up the console. Because like, why isn't this working?I don't know. Now I have to dig into the console. It's not even where you typically want to see logs, and so being able to shift that to where you're used to seeing errors and before you even have to open the browser, I think that's going to be a huge deal.Jeremy: I think on the server side, testing is pretty well understood in terms of, you know, especially if you have API end points, or you have just regular server code, like people know how to test that. But on the client side, there's like so many different ways of doing it.It feels like, and a lot of them involve spinning up browsers and, um, it can get kind of complicated and so, yeah, it'll be interesting to see if you can shift more of that to the, the server environment that a lot of people are used to.Paul: Yeah, I think it will be cool. We'll see how it goes and yeah, I do think there's definitely complexity that comes with moving it to Javascript, especially if you have a single page app cause then you need to spin up an API. You need the the server and an API. When you use your Cypress tests or whatever, or a lot of people mock the API, which sometimes is fast, but can get out of sync, in which case you lose confidence in your tests.So having it in one spot, is I think really great. And we do have the capability to run browser tests that's built into lucky because I think it is still good to have at least a couple smoke tests for your critical paths. To test the happy path. Um, but I mean if you can write fewer of those, that's great cause they take forever to run.Jeremy: For sure. Yeah. Um. In lucky, there's a lot of features that in other frameworks would be not usually be included. Like for example, there's authentication. you have this setup check script to see if your app has all of its dependencies, things like that.I wonder if you could sort of explain sort of how you decided what sorts of features should exist in the framework versus being something that you'd leave to the user to decide.Paul: I think things If there's no downside for one thing, if there's no downside, only upside and almost everyone would benefit from it, I want to include it. So that's, for example, the system checks script. Um, we also have a setup script and that's what we tell people to use. Instead of saying like, first installed yarn and then run your migrations and blah, blah, blah.No our documentations don't even mention that. It's like run script set up. Um, and the idea there is, it serves as kind of a best practice. It kind of pushes you into things to say like, Hey, put stuff that you need in here. Then we lay it on the system check, which also runs before setup. And also every time you boot the development environment, um, where it'll check, Hey, do you have a process manager?Which you need. It'll check whether Postgres is installed and running, because that's required. so if you go back to kind of that criteria, it's useful to pretty much everyone. Meaning like, if Postgres isn't running and the app's not going to work, everyone would need to know that. Um, and it doesn't really have a downside.if you don't want it for whatever reason, you just delete it. Or stop running it, that's not a huge downside. That's like, you know, one click. So that's part of why that's included. I don't like spending time on things that aren't delivering actual real value.So I don't like spending time figuring out why my local environment is not working or why it was working and now suddenly isn't. And with something like a system check that makes teams happier in the sense that, let's say all of a sudden ads somebody adds a new search capability and it requires elastic search, and I do git pull from master, do my feature as soon as I boot the app, if they've added something to system check that says, Hey, you need elastic, it's going to tell me it's not going to just blow up.It's going to be like, Hey, you need elastic search now. Install that and run it. These are the types of things that I really think are gonna save, a lot of time in terms of auth. that's another one of those where it's like, so many people want it and it should be easy and simple and not like five different ways to do it, but not everyone wants it, which is why, we made it optional.You choose in the wizard, like if you don't want auth, fine. I guess that most people generate it with off. I know I do cause I need it. And the thing is, we also changed how auth works in the sense that it's mostly generated code. It's not just a bunch of calls to some third party library. So what that means is it is easy to modify.So if you want to add email confirmations or invitations or anything else like that. You're not mucking around in some third party library. It's code generated in your app that you can see and modify. So it doesn't lock you into anything. It's very flexible and it helps you get off the ground running.And that's why that was included. Uh, and I'm sure we're going to have other stuff that may be included or at least an option of being included in the future.Jeremy: Yeah. I think, one of the conversations that people are having now is particularly in the JavaScript ecosystem.you end up pulling in a lot of different dependencies. You end up having to make a lot of different decisions. And so it's interesting to sort of see, lucky kind of move back and in the direction of say, a rails of trying to kind of include the things that you think probably most people building an app are going to need. Paul: Yeah, it's a little more in that direction. I think on the flip side, rails is starting to include so much that people are starting to get mad almost, and it's like so much that you're like, what is this? What is happening. So we want to strike a balance there. And so part of that is being very careful about what is included.I think some of the things that are included in rails could just as easily be added after the fact, meaning, 20 minutes of work and you can add it. Those are the types of things I probably would not include in lucky if it's 10 20 30 minutes to, you know, add it and modify your app. And only 50% of people even want it.We're probably going to just say, here's a guide on how to do it and make it easy, but not do that as a generator, if that makes sense.Jeremy: What's an example of something like that that would be pretty easy to add in after the fact and doesn't necessarily need to be included?Paul: Um, well, in rail six, it's coming up. They have this action mailbox thing that handles inbound emails. I'm pretty sure by default that is included. I could be wrong, so don't quote me on that. But I've been seeing a lot of Twitter stuff lately of people being super pissed about it, so I think it's there.Um, that's something I definitely wouldn't include because I think I've written one app ever that uses inbound emails. I mean github does too, but I have not written that and a lot just don't have that. So it's odd to include it, especially given the fact that it's not particularly hard to set up yourself. I think based on what I've seen, or action text is another one where it has ways of making rich text editing easier. That might be something too where. It could be added on later that I think, at least as a little bit more merit, because I think it's fairly common for at some point to be like, we need a rich text editor.Um, but those are the kinds of things that I would probably push off. And it's not a best practice either. Meaning I think it's smart that it has active record by default and chooses a database for you. Um, because it's best practice to just use active record. Right. And you're gonna have the best time using active record.Cause that's what everyone uses. So including that makes sense. But yeah, something like action mailbox is like what's the benefit in including itJeremy: Yeah. Just because the majority of people who are writing applications, they'll never need that inbound email feature. Uh, as opposed to, your example of authentication, like probably the majority of applications people are building will have authentication in them.Paul: Exactly. Yeah. And it's something that's hard to add, meaning um it touches so many parts of your application. And because we are generating stuff, it's not easy to add after the fact. but stuff that. Is easy to add and easy to remove that. Another criteria is how easy is it to remove it? So we include a few default CSS styles, but super easy to remove.It's basically like you go to your application dot CSS, it's like delete everything below this line. You delete it and it's like you're done. But it's nice because it makes it look decent and not like a horrific, ugly thing when you start your app, but it's easy to remove. And so that's something, for example, that we also include by default.Jeremy: That's also, I think the distinction between something that's generated code and, or configuration that the user can see. Um, I mean, I think your. Set up scripts and your system checks, scripts. Uh, one of the things that makes those kind of more straightforward is the fact that they are in your code base and they're their bash scripts, right?So, if you want to modify it or you want to remove it, they're, they're kind of right there. Whereas something like a action text or action mailbox is probably in like. the rails gem, right. It's in the library, so you don't even see it in your code base. I guess that would be the distinction there.MaybePaul: Yeah. Or you might, but you don't know why it's there or what it does. Or. Yeah. Another concern is how many things does it hook into? so for example, one of the big things is, like I said, do default styles. How many places does that hook into things? Just one, you go to your, your main CSS file and delete it, but there's a way to do that that I don't particularly like.I've seen some people, for example, use bootstrap or any framework. It doesn't matter what it is. The problem with those is it also modifies the generated HTML and the scaffold. Cause by default it's adding classes like column three, medium button, blah, blah, blah, blah, blah. If you don't want to use bootstrap, you have to remove bootstrap and manually go through all of the generated HTML files to remove the bootstrap classes.And so that's like a key difference too is how easy is it to remove. And we've really want to only add things that are easy to remove or really hard to add.Jeremy: What, what is the, uh, adoption of lucky look like? Do you know of people using it in production currently?Paul: Yeah. I don't have exact numbers. Um, which I think is good because it. Reduces anxiety a lot. Not knowing is like, is it going up? Is it going down? But people are using it in production. a lot from the very early days of crystal, one of our core team members, um, Jeremy, he's been using it at work for two and a half, three years.And they've had great success with it. They replaced some of their rails and microservices with lucky, uh, originally for the performance boost. And I think this is common. They stay for all the nice type safety and the reliability they get. Um, it's hard to explain with just words, but then you use it and you see an error and we try and make them nice.Not all of them are. But we try and make it nice and people go, Oh, this is nice. Or people are annoyed that they see this compiler error and then realize, Oh, wait, actually did catch a block. So, but they're having great success. Um, big performance boost. like something like they reduced their number of servers by like 70%, and their response times got cut down 60 or 70%.So yeah, they're having great success and then a few other people are building client projects using lucky. I don't know what they are. Some people, there's just not, they can't say to the public, unfortunately. But yeah, people are using in production, which is really exciting.Jeremy: Looking at, the crystal community, what does that look like? you know, is a pretty active what are your thoughts on the community?Paul: Yeah, it's quite active. Um, they've got sponsors, quiet, quite a few corporate sponsors, so they're making decent money to help fund development. They're aiming for 1.0, I don't know exactly when, but they did a blog post. I'm saying it's going to be soon, and I've talked to them in person about it, but I don't know how much you know, I was supposed to say, but soon.Um, which is fantastic because then you're not going to have to deal with the breaking changes, which have definitely been happening the last two years. And I think it's good because the language is improving and changing things. But once 1.0 hits, people are going to be able to jump in and they're not going to have to update their apps every 3 months or whatever.Um, but yeah, a lot of participation, and the sponsorship money goes a long way. A lot of the development is based in Argentina and the dollar is super strong over there. so meaning if you've got corporate sponsorship in dollars over here, that goes a really long way towards the development. Um, and they're all super nice.I've talked to a lot of them in person. Um, super nice, super smart guys. The community itself in terms of forums and chats, that's where I'm a little hesitant. It's, it's active, but I think not particularly welcoming for newcomers. just really strong personalities, very smart, but very strong personalities, and I would say.It may be better to come to the lucky chat rooms. We're very strict about our code of conduct and not about nitpicky things, but just in general that, you know, you talk to people with respect and empathize, and we're not the type of people where you come with a question and we're like, well, did you Google it?We're going to try and help you. And so I think it's a very welcoming community. And even if you're not using lucky. Feel free to hop on our chat room. Um, if you go to the lucky website, there's a link.And uh, yeah, we're pretty nice over there. So things are moving forward. We're trying to get to around the same time as crystal. Um, maybe a little after, but I think that'll be a big milestone.Jeremy: it's interesting talking about the community. Because I think when you think about Ruby one of the big parts that attracts people is not just the language or the framework, but it's, you know, having an inclusive community, having people that are really friendly.so it's good to hear that, lucky are striving to do that. Like why is there that divide?Paul: Uh, I'm not entirely sure. I mean, part of it is I am a sensitive person, and so I am kind of trying to create the community that I want, which may be actually way more, upbeat and positive just because, I want new people to feel comfortable. and I think maybe part of it is, with crystal, they don't have that much time, I think, is part of it. And so it's easier to brush stuff off. Some of it could be just that they don't care about the same things that I personally do.There's nothing actively bad going on. It's just I prefer things rather than to just be okay or average. I want it to be exceptional. And a place where it's just like, don't worry, you can say something. If it's, you feel it's dumb. We're not going to be like, pile on.We're going to be like, Hey, it's fine and here's maybe an alternative. so yeah, I mean, go to the crystal rooms. I still do. I still get help. There's a lot of really smart people. Um, you just gotta put on like a little thicker skin and be prepared for like, why do you want to do this? Have you tried this other thing?Have you done this other thing? In a way it's a good thing because they're making sure that like you've tried your different options and you're not just asking to do something that's a horrible idea, but it can make people, I think, feel like their ideas getting attacked or whatever.Um, so that's what I mean by part of it is just like, if you're sensitive, that's gonna come off as probably harsher than it was intended. Um, but you can still get a lot of help.Jeremy: Yeah, I guess it's just trying to find the right level of, yeah. I don't know what the word would be, but, yeah. Making people feel comfortable.Paul: Yeah, I do have a really high bar for that because like I am sensitive and I grew up when I learned to program all online with books and with forums. And I remember how hard it was as a new developer that didn't know best practice and people would be like, why are you even trying to do that?It's so stupid. And it's like, dude, I've been programming for like six months, calm down. And, I think it's common. I think, I mean, that happened in the Ruby forums happened in the rails forums. it's a common thing, I think across the internet and various communities. So it may not even be that Crystal is particularly bad.It's probably a lot like most communities, but we just want ours to be exceptional. Um, in terms of making people feel welcome and you know, if someone has a bad idea and air quotes bad because maybe it's a great idea and we just don't have the context, but if it is a bad idea, we're not going to say, why are you doing that?Blah, blah, blah. First, let's help you solve your problem and then talk about how might this be better? Maybe there's a better way to do it. And. It just feels a lot better. People are more accepting to have your feedback when you're not just immediately jumping on them and say, why are you even trying to do that?Um, and so I think that's important. Uh, yeah.Jeremy: Yeah. I mean, I think that probably applies to really all projects, right? Like they could all kind of stand to learn from some of that. And Kind of see it from the other person's perspective who doesn't have sort of all the. The same knowledge that you know you've been building up and maybe they can bring you a new perspective as well that you didn't, you didn't even think about.Yeah.Paul: Yeah, totally. I mean, we've changed stuff in lucky a lot of stuff that I was pretty sure about and they asked if it could be done differently, shared their use case, and it's like, Oh yeah, I made a mistake. And so it's good for everyone. Like if you show a little bit of vulnerability and openness, you're much more likely to learn.And you're much more likely to learn new and novel things because the people with, the strongest opinions are often the ones that have that opinion based on some principle they read about or a talk or something else. It's the quiet people that are like, Hey, can we try doing this like a little differently?And you're like, Whoa, I've never thought of this. Because no one else has, but you're new. You came up with this great new, innovative idea and he felt comfortable sharing because we're not just shooting people down constantly and so yeah, I wish more communities did that in general because it's mutually beneficial.Jeremy: That's kind of a good place to start wrapping up, but where should people go if they want to learn more about lucky?Paul: First place, luckyframework.org. That's the main website. It has guides. it has blog posts that you can follow or subscribe to with new announcements. Uh, and it has a link to our chat room as well as the github. So that's where I'd go. Feel free to hop on the chat room anytime. Um, we're all really helpful, and try and be nice. And so like, people shouldn't hesitate to run in there if they have problems. Um. If there's stuff that's confusing. Um, feel free to open an issue on lucky. We have a tag that's like, improve error experience. So we're, we have dedicated stuff just to do that. Yeah. In fact, if you start a lucky project, then you get a compile time error when you first start or are fresh on a project, it says, Hey, if you're still stuck, go to our chat room and ask for help. Everyone should feel free to do that.Jeremy: Very cool. and how can people follow you and see what you're up to.Paul: @PaulCSmith on Twitter. Probably the best way to do it right now. Maybe one day I'll have a blog or something, but right now it's Twitter.Jeremy: Cool. Well, uh, Paul, thank you so much for coming on the show.Paul: Yeah. Thanks for having me. I really enjoyed it.
undefined
Aug 26, 2020 • 1h 14min

Life after JPEG with Henri Helvetica

Henri is a frequent conference speaker and organizer of the Toronto Web Performance and JAMStack meetups. We discuss: Managing images with features like lazy loading and the picture tag Handling varying network conditions on mobileMaking designers a part of the performance conversationThe WebP image format that could replace JPEG and PNGWays the GIF can be an MP4 in disguiseHow lighthouse has given websites a visible target for performanceWhat we can learn from "lite" news sites Conference Talks A Decade of Disciplined DeliveryShape Of The WebMoving Pictures: A Snapshot At the Future of Web Media Related Links @HenriHelveticaCloudinaryPicture tagNetwork Information APIWebPResponsive images done right: a guide to picture and srcsetUse srcset to automatically choose the right imageMozilla: Improving JPEG Image Encoding (Mozilla explains why they want to stick with JPEG in 2014)The Great JPEG 2000 Debate: Analyzing the Pros and Cons to Widespread AdoptionHow JPEG XL compares to other image codecsServe images in next-gen formatsJPEG XRHigh Efficiency Image File FormatUsing HEIF or HEVC media on Apple devicesAVIF for Next-Generation Image CodingAV1 & Media CodecsWebP is now supported on Safari 10 (WebP support was added in a Safari beta but then removed)Safari 14 Beta Release Notes (4 years later, WebP is officially added to Safari)HTTP Archive Almanac: Image format usage (Shows the relatively small footprint of WebP)Native image lazy-loading for the webHow To Defer, Lazy-Load And Act With IntersectionObserverHow Medium does progressive image loadingAbove the fold in web designAn update on mobile CPUs and the Performance Inequality GapWeb Page TestLighthouseCNN liteNPR text onlyUser Timing API - Measuring User Experience Performance People mentioned during this episode @burkeholland@kornelski@andydavies@patmeenan@slightlylate@souders Music by Crystal Cola: 12:30 AM / Orion Transcript You can help edit this transcript on GitHub. Jeremy: [00:00:00] Hey, this is Jeremy Jung and you are listening to software sessions. This episode, I'm talking to Henri Helvetica. He's a freelance developer with a focus on performance engineering. He's also involved in the Toronto web performance and JAMstack meetup groups. And we discuss why images and performance are so tightly tied together. We also went deep into what life after JPEG might look like with the introduction of formats, like web P. And we talk about tools that can help you during your web performance journey. Henri is a big runner so i asked him if he started his day with a run. Henri: [00:00:36] Thank you for the introduction. And good morning. And with regards to a run, I wanted to first thing in the morning, and as we were talking about, getting up early just moments ago, I have my alarm set for 6:30. I tend to sort of open my eyes up around quarter to six, and figure out like how this run is going to go, but it was raining this morning. so I was a little upset, went and looked outside. The rain stopped by 7:00 AM. I was thinking, okay, maybe I should head out now. And as I was getting ready to head out, the rain started again and I thought to myself, okay, it's not going to happen simply because I knew I was doing this podcast. So I want to be back in time and fresh. And, and afterwards, I think I'm going to watch, this, Microsoft event they have online, MS create. yeah, run does not look good today. And it's funny. I was speaking to Burke, Holland from Microsoft. He said that he sent the clouds my way, knowing that it would force me to stay in. Jeremy: [00:01:34] They are a big cloud company, right? Henri: [00:01:36] That's a good one. I like that. That was good. I'm going to have to keep that one. Jeremy: [00:01:42] You're, you're pretty deep into the web performance space. What are some of the biggest mistakes you see people making on the front end? Henri: [00:01:52] I mean, web performance I, I consider it a bit of a dark art. There's lots involved and, and much of it may not seem, very clear to the sort of like average developer at times. but with any auditing that takes place, whether it be web performance or accessibility or, UX, overall, you're always going to have some low hanging fruit, and, One of those fruits is, image management. and I think that, you tend to find a lot of people sort of disregarding the importance of making sure that images are set properly, as a resource loading on your page. and. It's important for a number of reasons. most notably is the fact that it's always, absolutely going to be the heaviest resource on your page. Okay. Barring video. and you know, video in the last say couple of years, specially, especially this year has become a lot more prominent. So I mean, that's a bit of a different conversation, because, you know, you could quite often find pages with no videos. So I didn't want to go too deeply into that topic, but, you know, you will find images 99.9% of the time and images are challenging. Image management has become a lot more complicated, for a number of reasons. Retina screens brought in a particular challenge with regards to how to select the right image. and then you also have more than ever people are really paying attention to, connectivity, understanding that, connectivity may vary along like a five minute period, what was 4G at the start of your walk might suddenly downgrade to like very poor 4G or even like moderate 3g. Then you might go into your home and back out. And so you have varying conductivity that, ultimately the site doesn't care about. It's like, Hey, just load this image. You have these things to take in consideration and you luckily have some very brilliant engineers out there that are trying to make these accommodations. So, I would certainly say, images, Have been, are, and potentially will continue to be, one of the bigger challenges in terms of a low hanging fruit. Jeremy: [00:04:15] I want to go a little more into images. If you have the most basic case. Let's say you're not building a single page application. You're building a traditional, like just a document website. What are some of the ways that you should be treating images? you mentioned retina, how do we ensure we're only sending retina assets to people with retina devices? Should we be loading images in a lazy fashion? Like what are some of the best practices there? Henri: [00:04:47] The ultimate best practice, at this point, and it's a bit of a cop out, but, it would be to, outsource the work. and I say that simply because I think it's, it's become, enough of a challenge that there are some companies out there that are solely set up to do that work for you? Obviously people like Cloudinary and there's a bunch of others as well, that have been very upfront and outspoken in their need to let people know that this is the kind of work that they do. And, that's their specialty now. Barring that, there are a number of ways you can, look at, managing images. Obviously I'd say, one of the earliest revelations that came along the way when dealing with images and dealing with a retina, non retina, when we had and obviously, formats as well. picture tag srcset, and the ability to pinpoint what you wanted to send under what conditions. And, that was fantastic at the time. And, and everyone felt like, okay, we've, we've come up with a great solution, but along the way as well. What ended up happening is that you had these ever growing blocks of code. And I believe it was Brad frost once, posted, I think a screenshot of a code block just to handle, retina, not retina, et cetera. And it was so huge. He just sat there and was like, I'm not going to do this, there's no way, I need this massive block of code just to serve a couple of images under the right conditions. Things like that came about, and obviously as much as they did work, there was a sense that, things need to be a bit more simplified. I mean, people are still working on that, but I mean, what that also didn't do is. take in consideration, things like network conditions. And so you can have like this amazing, beautifully scripted block of code for images, but that still didn't take into consideration whether or not you were getting proper, bandwidth and, good, round trip times and whatnot. And that's where things like, network information API came around and whether or not you want to serve a particular images under particular conditions. And that's where it starts to get pretty complicated. Jeremy: [00:07:23] And this code you were referring to, this is all JavaScript? Henri: [00:07:27] Oh no no. Th th this is all just like classic HTML. I mean, we've not, we've not even gone into the JavaScript element yet, But, no this is all just like straightforward HTML that was there for you to sort of manage, images as best as possible. And, and like I said, the code block could just grow very quickly. If you want to just have like all your options covered, right. or conditions should I say? but again, none of that really delved into the idea that, Oh, we have variant network conditions too. And that sort of threw like an additional curve ball into, what seemed like very simple rudimentary work, you know, loading up an image of a cat, but, it's not the case anymore. Jeremy: [00:08:15] In CSS, for example, there are things like media queries where we could say, the device's screen is this size. So I'm going to send them this type of image. are, are these the types of things you're talking about when you're talking about serving different images to different browsers and devices? Henri: [00:08:34] With regards to different browsers specifically, the picture tag was actually a bit of a revelation there because we had situations where at one point. There was kind of like a fractured, landscape of, image support. you may remember, at the time when web P was starting to, make its way into the conversation. Even though the webp is like 11 years old, but I feel like, you know, even to this day, a lot of people, like webp what's that I'm not sure. And I remember a couple of years ago when I went on this sort of like image, you know, format management discussion, at conferences, there were people who had no idea what the webp was. To go back, in history, when the web P was really started to be introduced by Google and, supporting, browsers with the blink engine, there was a moment in time where, Mozilla felt that, we had not extracted all we could out of a JPEG. So their sentiment was that, a sudden introduction of a web P might've been premature. And in fact, there was a, a blog post that it described their decision into maintaining their sort of support, and additional research into, getting, better compression over the JPEG. A blog post that has since vanished, but I think if you go to the Wayback machine, you might be able to find it. And then on top of that you had the idea that the JPEG and JPEG 2000 and JPEG XR were sort of still out there floating around for people who want to experiment and really, really dive in a little more, because at the time you had CDN companies like say Akamai that we're working with, big retailers and, they had obviously a lot invested in making sure that, they could squeeze all the data they could, out of, certain assets like images. So you could have say a website, like, I remember in one of my talks, I gave this example like forever 21, I could talk about a company that's, gone bankrupt. So it's not like I have stock in that company. Jeremy: [00:10:44] Yeah. They're not going to come get you now. Henri: [00:10:46] Exactly, exactly. Right. It's like, here's a couple of pennies, man. Let me give me some, give me some stock. but, I, I, remember in my talk, I showed that, in devtools in different browsers you saw, you know, the JPEG 2000 being served and you saw the JPEG XR being served. You saw obviously, in Mozilla's case, a JPEG being served. Now I believe in Chrome. you were getting the webp. So the picture tag definitely helped with that, you know, with people who really want to be, very focused and, and trying to serve the best, format and most compressed. option possible, you know, very, disciplined delivery of assets because that's what it is at the end of the day. Trying to be as disciplined as possible and trying to find the absolute best possible, solution to, to sort of lessen the load. Jeremy: [00:11:38] Is WebP kind of the equivalent of a JPEG. It's another lossy, image format, but is perhaps more efficient Henri: [00:11:46] So the WebP is a very interesting format. So, history. WebP came from, a video format. So the WebP is actually a product of the WebM. Some of the more interesting and more, data efficient, image formats are all actually stemming from video, which is really interesting. So, the HEIC, which came from the HEVC, and, and then very future conversation, AV1 birthed the AVIF. AVIF and again, that's video to still image, but let's get back to the WebP it was made for the web, essentially. Visual fidelity it may not be the best format, but in terms of what is best for the web resources being transmitted down the wire, the WebP makes a great case. and I'll list a couple of features real quickly, obviously a very aggressive codec compression is 10, 20, 30%, better than PNGs. And better than JPEGs. Their chroma subsampling, is locked at 420. So for those who may or may not know chroma subsampling, basically it has to do with, it's sort of like the removal of certain colors that, may not be super perceivable to the eye. And so the fidelity remains to an extent, and essentially they removed some data that you probably, you know, yeah the average person wouldn't really catch. And it also had transparency, which made it obviously a lot more attractive because obviously the JPEG didn't have that. And, at one point, actually, and I've mentioned this a few times and you know, I was lucky enough to have a conversation with a Chrome PM. just about four years ago, he had mentioned to me The web P they had specifically the PNG format in their sites, as the felt that, feature for feature, they're aligned well enough that they felt that they could replace every PNG on the net with a webp. I also say that because the WebP came in two flavors lossless and lossy. Obviously the lossy one being the most attractive, but there is also a lossless option. So for those who really want to hang on to that fidelity and they, they refuse to let them go. There's a lossless format as well. So, the WebP on paper was an attractive format. But early on, some of the challenges, was encoding, was support. For people who are just so used to PNGs and JPEGs and, and God forbid GIFs, the majority of the software out there had just endless support for those three, even SVGs, but the WebP not the case. There was some work involved in trying to get the WebP into sort of like the ecosystem, but it wasn't going to happen without some of the more proper software outfits not supporting it. Say for example, Photoshop, there's a bit of a potentially outdated plugin that's been around and now not even supported by the original company that put it out, for Photoshop. And, you can sort of go through the litany of other sort of popular software, outfits that may or may not be supporting it to this day. Jeremy: [00:15:15] In terms of the best practice for images, you said there's a picture tag that will let you use a different format depending on the user's browser. So if you were using Chrome now, I suppose that it could send the user a WebP. but if they were using Safari, maybe it would have to send them a JPEG or PNG. Henri: [00:15:39] Yeah, it's funny you should mention Safari. I can go back and finish up my WebP story. Web P was slowly gaining and I do mean slowly gaining some recognition, I don't want to list it as popularity, but some recognition. Mozilla had doubled down in the JPEG. You had the WebP, JPEG, for sure. And then whatever else you wanted to use that was on the fringes of popularity, like the JPEG 2000 and the JPEG XR, 2000 being supported by Apple, and XR being supported by Microsoft. Then a significant moment in format history, Mozilla had a moment of clarity and they reopen, the bug to provide support to the WebP. And which was a bit of a, you know, I don't want to say a shocker. But, had sort of decided that, Hey, you know what, we probably need to do this. So they reopened, that bug, again in history, and sort of significant in a sense that for like a week, webkit, specifically Safari supported webp. And it was like a very bizarre moment. And he got pulled ASAP. It was like grand opening, grand closing, literally. And I could send you that link and this had an article about it. that was like, you know, no knew what was going on. But at one point we'll say about a year to two years ago, I think. First of all, you had Chrome and Blink engine supporting webP. You had Mozilla who had finally announced that, it was in nightly. Also somewhere in between, Edge had moved over to Chromium and they sort of quietly announced that they had WebP support. So, at the top of 2020, you had three of the four majors all supporting WebP and then hell froze over about a month and a half ago, at WWDC at Apple headquarters. And they announced WebP support for, Safari 14. So basically in about a couple of years, you went from one to four of the majors supporting web P. and it is significant in a sense that, A well known, image researcher, developer, engineer, Cornell. I can never pronounce his last name, but, so I won't, but, I'm a big fan of his work. He's actually the author of imageoptim. image, optimization tool. He put out this tweet saying that he felt some point next year, WebP could essentially be the only format you need. and it actually does make some sense, because if you're going to have the four majors on board and, and all the other browsers who are running off the blink engine, You know, we could, we could see the, the web P format climb significantly in, in presence on the web. Because as of right now, if you go to the HTTP archive, WebP is still in relatively trace amounts, on the web. and again, for a number of reasons from like tooling to developer, knowledge. It's going to be pretty interesting to see what happens. Jeremy: [00:19:09] WebP it sounds like it's going to be able to replace JPEGs and PNGs because it has that lossless option. How about, how about GIFs? Are we going to be able to have animated pictures in WebP? Henri: [00:19:22] The big WebP proponents will tell you yes. And, I mean, I'm going to go back to that statement you made with regards to, why the WebP is going to be, replacing the JPEG and the PNG part of the reason and I think specifically why it feels it can replace a P because I think essentially people will still reach out to the PNG as a lossless format for that transparency element, and no other image format out of the classic four, had that as a lossy option. So now back to what you were saying with respect to the animated GIFs, As much as I'm not a huge fan of the animated GIF, we have to live with the fact that people love them. Jeremy: [00:20:13] and they definitely love them. Henri: [00:20:15] they absolutely like imagine Twitter with no animated gifs. it's almost like it'd be like empty but it's, it is interesting because, For the most part, the animated GIF has been replaced by the MP4. Quite often when you go into dev tools and, and look at, The entrails of this GIF, you believe you sent off a, it's actually an MP4 now, that was done for a number of reasons, specifically, storage because, MP4 versus GIF in terms of storage, huge difference in terms of size and, and some of these GIF farms realized that very early and, you know, you can have, storage costs, balloon out of control, just cause, you know, you want to carry a GIF. That was part of it. I've actually never, I mean, I shouldn't say never, I've seen an animated WebP once. Like I'm assuming it probably loops as well. So on paper, yes. There would be probably an argument for that to take place, but also for that to take place, the services that are out there, giving you these GIFs will have to on their own, do the encoding. So you can sort of like drag this, this animated, image, and hopefully it will be a WebP that'll be the one, early challenge. And, and again, I get back to the idea that we as just everyday persons have the, the tools and the encoding capabilities either on our phones or our computers to just say, Oh yeah, I want WebP. So there is a bit of a developer education that's going to have to take place, let alone consumer education right? You know, the average person knows what a GIF is. Devs don't know what WebPs are. So I don't, I don't imagine, individuals will. So there's going to be that I think hurdle along the way. But on paper yeah, it would be, it would be, probably an apt replacement. Jeremy: [00:22:25] That is interesting about how you mentioned a lot of the times where we would usually use a GIF. We now use an MP4, because that makes me wonder when. Someone is in a discord room, for example, and there's animated images everywhere. even if somebody thinks they're using GIFs, those may actually all be MP4s. Is that, is that right? Henri: [00:22:47] Absolutely. I'll tell you a quick story. I remember, one of my earliest talks on images. I'd mentioned that and the next day. I ran into a developer, and a speaker actually, who was, at my talk. And, he mentioned how they didn't believe me. And they went into dev tools. And, it was like, So, for everyone who didn't see that I just made this sort of like a mind blown face. and, and they told me that like, Hey, I was suspicious of that comment and I went into dev tools and they had no idea. Jeremy: [00:23:27] Yeah. Henri: [00:23:29] I actually felt good for a minute, you know, but, but yeah, that's, what's happening. And, and again, we're talking about, The availability to still have, the animation, but to save on storage, and even though, a gift may have that classic choppy look and feel, and you're like, Oh, that's gotta be a GIF, but it's just the choppy look and feel as an MP4, it's being done, because they do have the capabilities. Now getting back to that WebP conversation, whether or not they'll be able to, get all that encoding done. and, and, and suddenly, you know, their terabytes or petabytes of GIFs are going to be turning into, WebP. I'm not sure, but we'll see. Jeremy: [00:24:15] Yeah, but it sounds like if they're being turned into MP4 files to the end user, it really doesn't matter. Henri: [00:24:24] Ultimately that is, the challenge, right. as developers, we're making sure that, you know, we are disciplined as possible, but the end user doesn't care, is it looping? Does it work? can I post it on my page? That's it, very early in fact, there was a situation where, Facebook who have been, very aggressive in, exploring performance, opportunities and how to save, data, Quite often they're very early adopters. There was a point where they were starting to serve webp and they found out people were often just dragging stuff to the desktop to share either with friends or somehow. And they're realizing that, the WebP was being supported in the browser, but nowhere else. And so there, there were some complaints. And then at one point, I think they were trying to do something a Chrome where when you drag the WebP out of a window, it would be encoded into like PNG a by the time we got to your desktop little things like that. But what that described is the user experience had to be absolutely seamless. People do not care. And they just want to know that the images went from their window to the desktop, or they could just share it with a friend, you know, an iMessage or whatever it is and that's it, that's always part of the challenge, right? Making sure that, that the users can have like a very seamless experience in sharing, in social media, Jeremy: [00:25:54] Yeah, that's interesting because it reminds me of an iPhone in a lot of cases, if you take a photo, it's not. A JPEG, it's a HEIC format and you send that to someone, who can't open it. And you're kind of like what the heck is going on. Henri: [00:26:10] Absolutely. it's funny, you should mention that because I remember when Heath and you know, that whole ecosystem, was being introduced, at this one WWDC. It might've been two years ago and you just saw Twitter, just kind of not explode, but just going through this, like, what's HEIF, what's HEIF or what's going on? Hey, Hey, and I remember I gave a talk within like two weeks about HEIF and. Nothing happened. Even within the, Apple ecosystem, that format wasn't even supported by Safari. I think it may have been supported by, maybe image capture, but it had limited support outside of the iOS ecosystem now. I don't want to sort of like get into like the iOS business, but, if you take your phone and you go into say, a timed shot, like three, 10 seconds, whatever it comes up as a JPEG, which is weird. And I believe the front facing camera, I don't think does a HEIC shot either, but. If you do the, normal sort of like, backside camera shot, that's not burst. I believe it comes up as a HEIC. it's like super weird. and it's very bizarre because now you're talking about, them adopting HEIC or HEIF being the only ones. And now. Providing support for webP, which is super interesting, but that may also have, to do with the fact that, there are patents around, HEIF, and HEIC, and that's something that I've, I've come to sort of, discover. And, and, and why the support for, open source formats like WebP, and a few of the others, like, AVIF that I talked about, are, are significant, because I think the opportunities are there to sort of bypass those royalty payments. Jeremy: [00:28:06] The current encodings that we use now are any of those, patent encumbered, like are the browser vendors paying royalties for those? Henri: [00:28:16] With respect to the WebP certainly not, None of the browsers are supporting HEIC. So there's probably no payments there that's part of the reasons why, I believe, some of the future, formats that are patented maybe challenged, they'll still make some money, but I don't think that they'll see the sort of windfalls that they have in the past. Just cause there's so much support behind, open source, formats, you know, I'll give you a quick example. Let me know if I'm going off course here because I have all this stuff racing through my head. so AV1 it's an open source video format and AVIF is the sort of, the image format that's, born from AV1. AV1 is being supported by two to three dozen companies, all companies that have a very vested interest in video. And, all the browser vendors are in because actually Chrome, Mozilla and Cisco were three of the founding partners in this. And Apple and Safari joined. Does that mean that Apple and Safari specifically is going to support AVIF? Not necessarily, but at least we see the early interest. and I don't see why Safari won't keep a very close eye on that now there are some people out there who would make the argument that they won't. Okay. I get it. but the fact that they joined the consortium very early, I think is, is, Hopefully telling, of their interests. So that being said, there will be support, I think, longterm for open source, formats. although, there's, there's one particular format, sort of like lurking in the background right now, which is called the JPEG XL. And this is another open source format a couple of years ago. when the JPEG was celebrating an anniversary, I think it was the 25th anniversary actually. the, JPEG.org, the organization, put out a call for paper, to sort of see what was out there. See if people were interested in, in, sort of like improving the JPEG, as it stood, because again, 25th anniversary, the JPEG was a little long in the tooth and it's like, what else can we do? so a couple of companies came together. Seven submissions were made actually. Two were picked. And the two companies that, were selected, were Cloudinary and Google. And Cloudinary had played around with this one format called FUIF, which is a free universal image format. And, Google had been toying around with this one format called PIK. PIK was in the background working because they had also believed that the JPEG was getting a little old and can use some updates. Long story short, the two, projects kind of came together into one and it became like the JPEG XL. and it's been moving along. you could actually, play with it right now. but. Again, not to get into the entails. it's not been adopted by a single browser yet, but they're certainly working on it. And in the fact that I think you have two image powerhouses, coming together, I think there may be something bubbling, on that end, and the JPEG is, being touted as like the one format for all your needs. So imagine, potentially, the WebP with the added idea and support that would also be able to replace something like an SVG, which is very interesting because the SVG as a vector format has particular features that a raster format can never have. But the JPEG XL feels like boom. You know, they have that covered. And a few other features that, you know, I don't want to get into the entrails too, too, too much, but, it's, it's kind of fun out there right now, you know? Jeremy: [00:32:16] It's interesting because we've had the same formats in the browser for such a long time. We've had the JPEG the GIF the PNG, SVG, seems after all these decades, we're finally getting to the point where we might start seeing, new, new formats take over. Henri: [00:32:36] Yep. And you know, people have to realize that, there's, you know, there are so many things that, that go into, having, so many updates say in the last three or four years or five, part of it's like computing power, there's so much, that goes into, being able to, have these formats readily available. Early days of, of the web P one of the challenges was the fact that it was CPU intensive. But you know, again, you're talking about the WebP being 11 years old in, you know, in six years, CPU power can change quite a lot from a handheld device to, to your laptop. Right. So who knows what you know is taking place on the enterprise side. but you know, it's funny, you should mention the fact that the formats are all old. In, in my talks, I mentioned this quite often and just want to remind people that, you know, like you said, the GIF, the SVG, the PNG and JPEG. You're looking at easily, like a hundred years, which is crazy, you know, like I, I joke around in my talks that it's like older than the Rolling Stones, but, that's very important. We've had HTTP1 and 1.1 For like almost 30 years almost. And, in the last three or four, we've gone H2 and now we're talking H3. If you're looking at the early days of web, you know, no one knew that the web was going to be consumed, in greatest amounts on handheld devices, and handheld devices with moderate power. I think we're lucky enough that, we have like some iPhones and high end Androids and whatnot, but the average individual is looking for a deal and the deals happen with moderate devices and they have particular, CPU hurdles and, you know, we've needed to make some changes along the way. Formats being one of them and, protocols being the other, but that's a separate story. Jeremy: [00:34:38] I would think that currently a lot of the devices have dedicated hardware to decode specific image formats, like for example, H264 and, and maybe now that things like web P are gonna be in the browser, there would be dedicated hardware to, to decode that as well. Henri: [00:34:56] And, and these are things that will probably come along. but you know, you still have the fact that you have an absolute trusty in what I like to call the workhorse in, in the JPEG that's always going to be there. I mean the JPEG again, by far, the, the, the best support out there, it's the one that's being delivered, by digital cameras by, you know, Our phones. So I mean, there's less of a concern, but eventually yes, you do want some hardware decoding. Like, for example, when I brought up the, the AV1 I remember, a speaker from, a talk from an engineer at YouTube, talking about, you know, them experimenting with the AV1 and then realizing that something about like 10 to 12% of their users run very old devices and they had no idea. And so these are things that you have to take in consideration. And so, if they're finding out that they're on old devices through, you know, trying to serve, you know, next generation video, They're still going to be on these old devices being served next generation, still images. These are, again, part of these curve balls that engineers are being, pitched, quite often Jeremy: [00:36:18] Yeah, so it seems like this, I guess, dream that we had of. Hey, everything is going to be moving to web P or maybe everything will be moving to AV1 is probably not going to become true for actually quite some time, because like you said, there are still going to be a significant percentage of people who they need to use the old formats because of their old devices or because they're, they're buying low cost devices. So it doesn't seem like we're going to be able to get away from. The image tag that goes here, we'll give you the JPEG. We'll give you the webp and so on. Yeah. Henri: [00:36:56] Yeah, that, that, that's definitely going to be, you know, it it's like that rough transitional period right where, everything's being supported by everything you needed. And then now you have to get into that transition where it's like, okay, well, you know, maybe it's the hardware that needs to change. Okay. Now we have to make sure that people know that they should request a web P and things like that. So along the way to nirvana. we have to get a bunch of things sorted out so that, from a user standpoint to a developer standpoint, to like, a hardware, so yeah, I mean, it it goes past browser support. and obviously that's one of the hurdles that's without a doubt. but, once you have the browser support, that you need, you know, you still have this sort of like the few legs to, go down and, and make sure that, you can have that sort of like perfect situation where it's like, okay, boom. Now we have like the support that we want. We have the browsers, and then we have the education and that, that, that needs to take place. Jeremy: [00:38:07] We've been talking a lot about things that are, that are coming and I want to bring us back to some of the things that that developers can do now to improve performance or at least improve, perceived performance. And one of the things we had mentioned earlier, was the concept of image, lazy loading. Like, should we be loading all the images on page load? Or should it be as the user scrolls? And I wonder from your perspective, how should web developers approach that and what are the tools they should use for that? Henri: [00:38:43] I would say you, you actually should be lazy loading. I mean, ideally in an ideal world, you know, you don't, request resources that you're not going to see. and, a while back, I don't know if I could dig this up, but there was a study, sort of indicating that, something like two thirds of, of, of resources were below the fold, on average, and then, on top of that, only about one out of two, users went to the bottom of the page. So yeah, so you ended up having a bunch of resources below the fold that, quite possibly aren't going to be, needed. So the advent of lazy loading came about again, you know, wanting to make sure that you're not hampered by a page having to load say 10 megs of resources, just so that you can look at the, first sort of like page and a half of, of information. So that being said, lazy loading became a bit of a priority. And as you may know, Chrome has natively, added, lazy loading as of right now, I believe in stable, if not for sure canary, and obviously there's some libraries out there that would, that would help you out with, with, that process as well. And again, I get back to the idea of being disciplined as a developer. And making sure that, Hey, did you want to snack because, I can give you a snack or I can take you to all you can eat, you know? And the all you can need is what we don't need since you only want a snack. So, if that analogy made sense, but, but yeah, I mean, it's, it's super important, you know? I mean, for example, I think I tweeted this a couple of days ago. someone who's at an agency, sent me a, a site that they had just, pushed and it'd gone live and I'm like, okay, let me take a quick look. You know, it was like 11 megs on, on first load. it was like 99 images. 89 of them were lostless, and everything loaded in one shot. And it was a fairly long, fairly long page. I mentioned this to them in a quick communication, like, hey, there's a bunch of other issues, but you guys should be using some lazy loading, you know, because again, it's the idea of whether or not, a, a individual is going to go to right down to the bottom of, the page, and, on average that's not the case. And lazy loading is going to help you manage those assets. for the best user experience possible, Jeremy: [00:41:08] So with that particular page, as an example, it sounds like one of the things that they should do is conditionally load. different qualities of images based on the device you're using. And the other thing would be to do some form of lazy loading so that you, you don't load every image on the page, but instead you load them as the user gets close to them. and this is all being done just through HTML tags? Like there's no JavaScript? Henri: [00:41:39] So barring the native, implementation. you could actually there are a couple of ways that you can set that up now, prior to the native implementation of lazy loading, we were using the intersection observer API, and what that was essentially, it was kind of like, I describe it as like fake lazy loading. so you could actually indicate, you actually set up, through intersection observer, sorta like how far from the viewport you want particular assets to load, and that became, native to the browsers before Lazy loading. Now it had uses beyond images, but, people were starting to use that for images specifically. That was certainly available now on, you know, outside of using JavaScript, I can't believe you were able to set that up. however, you had mentioned, potentially using, Images, I think you'd mentioned images in a, low quality. Did you mention that at some point? Jeremy: [00:42:48] Yeah. So if you are on, let's say you're on a phone and the device size is small. Maybe you don't need that full 4K resolution image. so it sounded like there was some way within just using HTML that you would be able to, select different images for different devices. Henri: [00:43:11] Oh, so, we're talking about maybe using the, network information API potentially, we're getting into, I guess, JavaScript, not so much HTML, but, you know, you could actually start to conditionally, provide, particular, image qualities. Depending on the network conditions, obviously now, what that could be is, I don't know, let's talk about baseball, you know, you may have a front page where it's like major league baseball under a crisis, you know, and then you have like a bunch of players, like, on the page in an image, but that might be under ideal conditions, say 4G powerful phone, whatever. But just say they're under less than ideal network conditions instead of, of, of having the image of the players. You might throw up, the MLB logo. Three colors, made it at SVG. It might be two kilobytes instead of like the 43 that it was for the image of the players. So the net info API is going to allow you to do, stuff like that. Certainly. there was a point as well, and this was again, a JavaScript implementation where, I mean, if you do, what sort of people know from medium, where they give you that sort of blurred, image, and then it does this little swap to give you something a bit more, high quality. I mean, people have their say about that. Some love it. Some don't because it's actually an extra request. Et cetera, et cetera, et cetera. Facebook also played, aggressively with that where they're actually, I think their, their, blurred image was like one or two kilobytes or something. And then they would swap in the proper image that was more of a user experience situation because they wanted to let people know that, Hey, this image here coming up. You can stick around, but you know, that sort of blurred image gave the user just the information they needed to know. Hey, there's an image, I can wait the extra, like half second for it to load up. Instead of giving you like this blank page. Jeremy: [00:45:19] This has all been specifically with images. Are there any other, common mistakes, either on that particular page or on other websites that you often see? Henri: [00:45:31] I mean, in terms of, additional sort of challenges, like some of the challenges you'll see, it's sort of like, a bigger user experience, issue. So in terms of, prioritizing what needs to load, on the first load. So above the fold, and these are things that you're pretty much going to see, Once the page loads, but at a particular level so that's why the film strips are very important. So you can see sort of like a frame by frame level of what's loading on the page, and then you can make some adjustments. I mean, in terms of, another low hanging fruit, I'd probably say, the next one might be making sure that you keep your requests down. And again, that partly has to do a bit with some lazy loading, because at that point, you can make sure that, you're not loading, like say, you know, making these 300 requests in one shot when really you can just keep it at like a hundred, 150, I dunno. But you'll know only when you see the page load yourself. I think that's certainly important. And I'm talking about low hanging fruits here. I don't want to get into the deep, entrails that, that, you know the good people like Andy Davies and Patrick Meenan, get into. but yeah, I mean, these, I think are some of the, immediate decisions that you can make. Just making sure that, you know, your, your first load is as quick as possible. And that's by keeping some of the requests down and make sure that you're not, you're not stuffing, that first load with like a bunch of, you know, I shouldn't say needless, resources, but some resources are probably, on the fence on whether or not they should be right there. you could sort of, you know, have them load below the fold, and still deliver the information that you, you, planned on, on providing. Jeremy: [00:47:20] In the case of the first load. Would that usually be because when you first load the page there's scripts that are running, or there's images that are loading, that aren't really content, I guess, that aren't really text somebody needs to read and things it's like that. And those are the things that you would load, later, somehow I guess this is what you're saying? Henri: [00:47:43] So, it's a great question because this is the kind of conversation I've had with, with, designers, and which is why I always believe that designers should certainly be, aware of the performance conversation, because you know, they'll sit there and make this, have this page mock up and they're like, boom, boom, boom, boom, boom, boom. This is going to look amazing. But really and truly, I think part of that push pull is what is the most important here? what. what kind of asset, what information can we have come in below the fold? You know, what, what can come in a little later in a load? because again, all that research, is around the snap of the finger, making sure that page loads up right away, the user can kind of scan to see what's going on. And then at that point start to like make the way down the page. You do not want that page to load and have this sort of like blank space. And then that turns into that blank stare. And then, who knows what happens after that? you really want that information to be like, boom, cattle prod. It's right there. It's like, okay, they're scanning the headline, looking at a couple of photos and then they start to load up the rest. And that little bit of time as seemingly insignificant is sometimes a world to a page and when it comes to loading resources, Jeremy: [00:49:09] So that's almost bringing it to, being a part of the design is can you make a page where. you don't have to load all the sort of surrounding images and surrounding chrome, just to see the content. And even before you talk about performance, it's, it's more of a, when somebody first gets to this page, how do we make sure they see what they need to see? Henri: [00:49:30] Absolutely. Absolutely. And, you know, you bring this up just a week or two or week after, I've I've listened to, or watched, an amazing talk. it was, actually, with regards to cnn.com, during 9/11, you know, and it was phenomenal how, the sort of tech lead, talked about how the stripped the, page bare in order to just keep the details that they needed, because they actually had tried one version of the page, what they felt was, sort of like the essentials. And then they kept stripping. It kept stripping, it kept stripping it, and eventually it just became one image, a logo and text. Now, granted, you know, that's also because you're under so much sort of, sort of stress from all the traffic, but the idea remains that what do you need to deliver in terms of information? What is going to load as quickly as possible? That's what we're going to go with. In essence, almost 20 years later, you know, we're still dealing with that because now, even though we don't get that sort of like 9/11 level stress to a site, but you still have the fact that you have devices and varying networks and you know, it'll never translate again from like, that kind of traffic to, you know, a network being that throttled. But the fact remains that, we can predict that they're going to be on their device. It's going to have various, a network connectivity, varying power as a handheld. And so you try and do the best to manage that element. And the fact that you do have like a team of developers and designers who have sometimes, dueling opinions. Jeremy: [00:51:27] As a user, and maybe this is more common amongst developers, we sometimes like to see the lite pages, you gave the example of CNN or NPR, where they just give you the content. And, those are often in separate versions from the normal site, but you're sort of saying, can we take some of the lessons that we're applying to those, those lite sites and just apply it to our design in general? Henri: [00:51:53] Absolutely. And, Someone who had heard me talk about the idea that lite sites existed. and I brought up the fact that lite.cnn.io, is upright now. And you can go check it out and it'll have the same information as cnn.com, but CNN.com will have all the media images and videos and, and X, Y, and Z and ads. Whereas the lite site is (pssh) text, not a single image. And apparently, this became a bit of a standard after 9/11 and that site is up all the time. And again, I keep mentioning 9/11 because obviously that was like a very unique situation, but the fact remains that in times of hurricanes, any kind of like meteorological, crisis, or anything like that. Storms, whatever you want. These sites are still going to be very important because that infrastructure is going to be down and you won't be able to access sites as, as readily as, as possible. and, the fact that you have a site with just bare bones info is fantastic. Jeremy: [00:53:00] Another thing I want to talk a little bit about is, Chrome being slow or being a memory hog is kind of this running joke. And I'm wondering in your perspective, is that the fault of Chrome or is it the types of applications and websites that developers are putting on it? Henri: [00:53:20] I mean, it, this is mildly above my pay grade. you know, I keep it a number of windows open as well. there's probably a little bit of both going on. and applications have never been more demanding that's without a doubt. and in turn that's a testimony to the fact that browsers have never been more capable with a bunch of features, you know? I've been a big proponent and fan of, browser engineers because, I should pull this up, but someone, on Twitter said that the browser is by far. the greatest piece of software out there. I think there's a strong case for that to be absolutely true. I've gone out and said that, on your phone, through your browser, you could probably please just take care of everything. You need to do banking. you can save money, you can go out and make money. I mentioned that you could find a date, buy clothes for that date, rent the car for the date, order food for the date, all through a browser, and if you had told us that like 10, 15 years ago, people would be like, yeah, whatever. But in 2020, today, July 19th, it's happening. And, browsers have to make these sort of, accommodations, you know, so, whether or not Chrome is to blame in terms of like being a memory hog, I'll let you know the engineers debate that, but I think you must tell the story that the browser is basically a bit of like a Swiss army knife right now. We have made demands, from the browser that have generally been met. And, I think we've benefited from that, 10 fold, and, and again, If you didn't have a laptop or anything like that, the browser on your device would be able to, to save you. And, you know, we could have had this, this conversation on the browser, on the phone, let alone that, you know, the fact that we actually are having it on a desktop right now, you know, so, you know, kudos to the engineers out there. I'm not slamming you guys. Jeremy: [00:55:31] Yeah, it's, it's pretty, it's pretty incredible just what you can do in the browser and the fact that it has become so extensive that now we have electron right. Where people are making websites basically designed to be desktop applications. Henri: [00:55:47] Yeah. I mean, I don't know if you saw someone, yesterday the day before. I think recreated, either MacOS 8 or some Mac application in electron. It was pretty fascinating, you know, so, but yeah, absolutely. Jeremy: [00:56:02] And what makes a lot of this possible is, is JavaScript, right? And there's a lot of different JavaScript frameworks that people use, like React and Vue and so on. And from your perspective, what are the additional things you should be thinking about from a performance perspective when you're working in, in heavy JavaScript, code bases? Henri: [00:56:25] You know, on demand. That's it, you mean the world without JavaScript although it can, it can exist, you know, it would be challenging. and I think we have to accept that it's definitely here to stay and it's here to, I mean, I shouldn't say here it's here to stay. I don't think it was ever not invited. I just think that, there was a sort of like this liberal use of JS and, you know, without really under standing how powerful it was and, how caustic it could be a to the experience, the user experience, Thank the Lord for people like Alex Russell, who are out there to remind us of the fact. But that that's, that's really it. And I think that, as, more and more frameworks come around, the idea is to not really. shun, their availability, because again, you know, people are spending some time to research and create a library, or a framework that they feel that they need. It's when it's being used and deployed. can we use it as efficiently as possible? That's it at the end of the day, name the resource. Someone's out there, using it liberally. Can you make sure that, in employing this, this resource, you could just, send it down the wire. Or make use of it in the most disciplined fashion. And that's what it comes down to in the end. And that is the research that needs to take place. when, using these resources, especially JavaScript, like I said, and like you mentioned, between its availability and everything that it can do for us, Jeremy: [00:58:05] You, you talk about being in dev tools all the time. Are there specific parts of dev tools or, or tools outside of dev tools that you use to try and identify areas where there's performance problems? Henri: [00:58:20] So, Dev tools I mean, auditing a site can seem a little boring at times because. You are essentially looking for very, specific details, I personally love dev tools because, as scary as it can look at times, and there's still many parts of it that, you know, I tend to forget, or I get lost, in, but, It will provide a lot of the information that you need, on a pages, health, and, and what's taking place, on the page now with respect to tools in particular that I like, again, I'll say it, I love dev tools. Obviously, it would be impossible to, do a, proper performance audit without having something like web page test handy, webpagetest.org, a product of, the great mind of Patrick Meenan a tool that's been called the, the Cadillac of performance. And, and again, not the prettiest to the naked eye, but, a treasure trove of, of detail and information. And, Patrick Meenan has done, incredible work in, in making that available. so yeah, webpagetest.org is certainly, something, you need to keep around. I'm going to throw in, lighthouse. And I say that specifically, because, if you take lighthouse version one, two, right through to version six, there's been very considerable amount of, of work and research that has gone into what we are seeing right now in lighthouse, in terms of the information that you provided, the recommendations, and, the links they provide with the recommendations, they'll tell you, it's like, Hey, we see. In these sort of like 10 resources that you might be able to save X amount of, of kilobytes of megabytes, even, if you do this, and that has helped in what I believe, is yeah, a maturation in, understanding performance. And what you're also seeing right now, through lighthouse is people working, to reach the, mythical score of a hundred. you know, you have people sharing. Their performance scores and all the other audits that, it takes in consideration, sharing them on Twitter, you know, saying that, you know, I have 89 right now. I'm totally working on the rest. This was not happening like prior to lighthouse. you know, having a, performance conversation was kind of like speaking to like a wall of bricks, but now, people are just, openly and willingly sharing, you know, their scores, which means that they are looking into ways of improving, their sites performance, whether or not they get into the deep entrails, separate story. But, Google has been able to create this, platform where, Not all the low hanging fruits, but low to mid will sort of give you a good idea of what it's like to look after performance. So, I would say certainly dev tools, certainly webpage And, and again, I think lighthouse has been, very important in sort of raising the bar of performance. Jeremy: [01:01:48] Yeah, I think lighthouse is interesting because it almost gamifies it. You get this score, like you said, you can post it in a tweet and say like, Hey, look at, look at the score I got. There's also a target, for people who make frameworks and things like that. Where, if you have a static site generator, you could say, Oh, the default template that you get from my generator, it's going to get you this score in lighthouse. So I think it's, it is very powerful to have that, that target or that goal that everybody can see. Yeah. Henri: [01:02:23] Yup. Absolutely. And, it has been gamified and, I know that term is used, quite often. and I definitely agree. I mean, I don't want to make it sound like it's. Like a game game game. but it is something that people have been able to sort of like, you know, look at and say, okay, this is where I want to be. You know, like this is the 10 second, hundred meter. That's how I qualify. I've been running 10 fives. We're going to get it down to 10 soon. So, I, I'm seeing these, these conversations come out of, of individuals. I least expected to sort of, talk about, when that kind of adoption, takes place, I think it needs to be sort of, discussed and to an extent, applauded. Jeremy: [01:03:07] I think in, in your work, it seems like you have a pulse on a lot of the different APIs that are available in the browser. Are there any that you think are being under utilized things that people could be using that they aren't? Henri: [01:03:24] The thing about some of these API APIs, quite a few, are likely being underutilized, but what is underutilized ultimately? I think the bigger picture is, can you look at your personal goals with this particular site? are they being met and if they're not, what are the avenues. To you meeting this goal that you have in mind. and then you'll step back and look at all your options. And, you know, for every particular API that's available there's probably a bit of a downside, some of it might be sort of like complexity of use, you know, cause some are just not used quite often because it requires a bit more work and sometimes people are like, Oh, I don't feel like spending an afternoon having to do this. You know, I get back to the idea around animations, right. it could be the animated GIF. It could be potentially an MP4 it could actually be. animated CSS, but that CSS, the SVG animation is going to take a bit more work and you're like, eh, or are you going to just jump in and just say, Oh, F it, I'm just going to use this GIF for this animation, even though by the naked eye, you could tell that it could have been done in CSS, right? So the same thing is happening with particular APIs where it's like, eh, using it on a page. It's probably going to take a little bit of work, which means if there's a mistake, it's like, I don't want to go in there and having to do all this, like, and debugging X, Y, and Z. You start to look at some of these other options that might be a little easier. And then that might even mean that you might rewrite a particular part of the page cause it's like, eh, you've suddenly discovered that, you know, you could kind of strip this out and make it a bit more bare and make it easier overall to sort of manage. so, that's part of the challenge with APIs. In fact, I remember, I think it was Steve Souders who said that user timing API. So, yeah, it was the user timing, API, a great tool. developer adoption, not so great because it just meant that there was a bit more work taking place. And a lot of developers, they found didn't want to get into that. And so that's why you start to have some of these other tools like speed curve. That'll sort of like show you what needs to be improved and where you can sort of, make some of these improvements. and hopefully, without the sort of, challenges of having to sort of like refactor, some code, when most developers don't want to. Jeremy: [01:05:58] Right. I think it's sort of a general belief that you only do the things that you need to, and, if you have a simpler way of doing it, where you import some package from NPM, let some JavaScript library, do it for you, then you'll do it until it becomes a problem. Henri: [01:06:15] Yep. Absolutely. And that's where, you know, the whole idea of, I'm sure you remember this sort of a sudden push for a vanilla JavaScript because people were just jumping in and grabbing all these libraries to do some of the simplest, work and people importing in particular library of, you know, like a hundred, 200k library could have been like 30 K of just vanilla JavaScript. You know, so, as you said people just don't like to do the heavy lifting Jeremy: [01:06:42] Yeah, I mean, I think we can all identify with, if we don't have to do the work, then we don't want to do the work. Henri: [01:06:49] And that's it. I mean, and I never fault developers for that. That's where we're at, you know, trying to make sure that people can sort of do a bit of the work and get some the reward without going heavily into the entrails. Jeremy: [01:07:02] For sure. I think that's a good place to start wrapping up, but are there any other things you, you think I should have asked or anything you thought we should have talked about? Henri: [01:07:13] I mean, not really. It's just, you know, again, this is a, it's just a fun conversation and, hopefully not to the audiences detriment sometimes, you know, my mind starts to run in these different directions the minute, make a one statement and I'm like, Oh, oh, oh. I kind of want to mention this too. I mean, we did spend a bit of time on the images side, but, it, it is something that I've found fascinating, over the last few years. And, you know, I've followed quite a few, developers and engineers who have, been deep into that research. So anyone who wasn't into images, I certainly do want to apologize. But, if you are, I hope you enjoyed it and enjoyed that little, the triggering from my, my Bluetooth. That's telling me that, dude, it's kinda like Oh man, what's his name? the comedian he had the wrap it up box. Jeremy: [01:08:03] Oh, really? No, I hadn't heard of that. Henri: [01:08:06] Oh man. why am I forgetting his name? Oh, anyways, whatever. Jeremy: [01:08:10] Yeah, it's trying to play you off the stage. Henri: [01:08:12] Exactly, exactly. It's like the big hook keeps missing my neck. But you know, again, there's so many things around performance that that, can be discussed. You know, I recently listened to a podcast, which featured an engineer from Facebook and they were talking for an hour about, HTTP3 and QUIC and I thought that was fascinating. And, and that's some of the heavy duty lifting that's taking place, But. For very obvious reasons. And it's certainly going to play well into the future, especially, this future that we have right now, that's going to be heavily laden with media, you know, images and certainly video. For obvious reasons who are working from home now, students are going to be learning from home. So there's going to be a lot of streaming taking place. And that's the, the next hurdle in dealing with media management and dealing with performance. Jeremy: [01:09:02] Yeah, I think my hope as a developer is that these new things, whether they be HTTP3, or they be the new video codecs new formats, I'm hoping that we get APIs or get supporting libraries that makes it almost transparent to us. That makes it so that, I just say, I want to post this video and somebody else, like probably the people who are on that podcast have done the hard work of making sure that goes over the right transport. Gets encoded in the right video format, split up if it needs to get split up that sort of thing. that's, that's my hope anyways. Henri: [01:09:43] Yeah. I mean, it's certainly everything that we'd like to see, because, as it becomes a seamless and like you said, effortless for us, ideally, that can be sort of, also, transmitted down to, the user. I mean, because ultimately they're the ones, having to, you know, follow along, the online class. They're the ones who have to do the, live, remote, you know, yoga, you know, workshop or whatever it is. I think there are a lot of discoveries that are going to take place in the next little while, because, the, the fact that our world is going to be, about us consuming a lot more video content. This was predicted to be happening two to three years from now. Obviously this pandemic, basically, cut that into like a third, we are more than ever dependent on the transmission of video across the net. And I think it's incredible that even our conversation is taking place right now in such a clear and seamless fashion, back to school, is about like a month from now, we're about to see the net just they're going to be pulling carts, you know? And, it's very interesting to see how that's going to take place because you know, we're talking about the pandemic hitting like mid-March and there was still confusion as to how this was going to play off of the rest of the year. Now it's like, we're getting ready for what might be an entire year of remote everything. let's see what happens. Jeremy: [01:11:16] For sure. This is another big test for the web. Henri: [01:11:21] Yep. Totally. You know, the web is just there, like wiping its forehead, like, Oh my God, like what just happened? And how am I going to make it, make my way through this? So, no man, it's, it's super interesting and I'm sure we're going to, you know, maybe a year from now have another discovery, and some engineering feat that's hopefully going to make things better for everyone involved. Jeremy: [01:11:46] For sure. If people want to see what you're doing next, I know you do a lot of conference talks and things like that. Where should they check you out? Henri: [01:11:55] I would say exclusively, Twitter, you know, I'm a big fan of the platform, so, Henri Helvetica. So H E N R I and Helvetica, as you know it to be spelled. I'm planning to have this 1.0, 2.0, release of work that I've wanted to do probably in September. So, I've shared this privately with a few people but, a blog is coming. I'm probably going to do a series of short YouTube videos talking about little things from browsers to a short podcast on performance, but a light conversation again, as we just did, but probably a little shorter. But yeah, Twitter is certainly the place to find me, whether it be, in public or in my DMs as well. Jeremy: [01:12:44] Awesome. I'm looking forward to seeing the YouTube channel and the podcast. Henri: [01:12:49] Absolument and Jeremy, man, I definitely want to thank you for A) Your patience. Because like I said, my AV hurdles, weren't the most fun. So this is a conversation that's long in the making and thanks for having me on the show. Jeremy: [01:13:04] Yeah, thanks for coming on. It's been a real pleasure Henri. Henri: [01:13:07] Merci Merci. And when I get my thing going, I'll be sure to ping you to have you as a guest. Cause I'll tell you about this other little project I have as well. Jeremy: [01:13:17] Nice, I'm looking forward to it. Henri: [01:13:19] Yeah, man, it's going to be fun. Jeremy: [01:13:20] That's going to do it for this episode. If you're interested in learning more about web performance, we have all the tools and APIs we discussed in this episode, in the show notes. And if you have questions, I'm sure that Henri would love to hear from you. The music in this episode was by Crystal Cola. Thanks again for listening and I'll see you next time.

Get the Snipd
podcast app

Unlock the knowledge in podcasts with the podcast player of the future.
App store bannerPlay store banner

AI-powered
podcast player

Listen to all your favourite podcasts with AI-powered features

Discover
highlights

Listen to the best highlights from the podcasts you love and dive into the full episode

Save any
moment

Hear something you like? Tap your headphones to save it with AI-generated key takeaways

Share
& Export

Send highlights to Twitter, WhatsApp or export them to Notion, Readwise & more

AI-powered
podcast player

Listen to all your favourite podcasts with AI-powered features

Discover
highlights

Listen to the best highlights from the podcasts you love and dive into the full episode