Software Sessions

Jeremy Jung

Practical conversations about software development.

Nov 17, 2023 • 49min

David Copeland on Medium Sized Decisions (RubyConf 2023)

David was the chief software architect and director of engineering at Stitch Fix. He's also the author of a number of books including Sustainable Web Development with Ruby on Rails and most recently Ruby on Rails Background Jobs with Sidekiq. He talks about how he made decisions while working with a medium sized team (~200 developers) at Stitch Fix. The audio quality for the first 19 minutes is not great but the correct microphones turn on right after that. Recorded at RubyConf 2023 in San Diego.

A few topics covered:
Ruby's origins at Stitch Fix
Thoughts on Go
Choosing technology and cloud services
Moving off Heroku
Building a platform team
Where Ruby and Rails fit in today
The role of books and how different people learn
Large Language Models' effects on technical content

Related Links
David's Blog
Mastodon

Transcript
You can help correct transcripts on GitHub.

Intro

[00:00:00] Jeremy: Today I want to share another conversation from RubyConf San Diego. This time it's with David Copeland. He was a chief software architect and director of engineering at Stitch Fix. And at the start of the conversation, you're going to hear about why he decided to write the book, Sustainable Web Development with Ruby on Rails. Unfortunately, you're also going to notice the sound quality isn't too good. We had some technical difficulties. But once you hit the 20 minute mark of the recording, the mics are going to kick in. It's going to sound way better. So I hope you stick with it. Enjoy.

Ruby at Stitch Fix

[00:00:35] David: Stitch Fix was a Rails shop. I had done a lot of Rails and learned a lot of things that worked and didn't work, at least in that situation. And so I started writing them down and I was like, I should probably make this more than just a document that I keep, you know, privately on my computer. Uh, so that's, you know, kind of, kind of where the genesis of that came from and just tried to write everything down that I thought what worked, what didn't work.
Uh, if you're in a situation like me, working on a product with a medium sized, uh, team, then I think the lessons in there will be useful, at least some of them. Um, and I've been trying to keep it up over, over the years. I think the first version came out a couple years ago, so I've been trying to make sure it's always up to date with the latest stuff in Rails and based on my experience and all that.

[00:01:20] Jeremy: So it's interesting that you mention medium sized team because during the, the keynote, just a few moments ago, Matz, the creator of Ruby, was talking about how, like, oh, Rails is really suitable for this, this one person team, right? Small, small team. And, uh, he was like, you're not Google. So, like, don't worry about, right, can you scale to that level? Yeah. Um, and, and I wonder, like, when you talk about medium size or medium scale, like what are, what are we talking?

[00:01:49] David: I think probably under 200 developers, I would say, because when I left Stitch Fix, it was closing in on that number of developers. And so it becomes, you know, hard to... You can kind of know who everybody is, or at least the names sound familiar of everybody. But beyond that, it's just, it's just really hard. But a lot of it was like, I don't have experience at like a thousand developer company. I have no idea what that's like, but I definitely know that Rails can work for like... 200-ish people, how you can make it work basically. Yeah.

[00:02:21] Jeremy: The decision to use Rails, I'm assuming that was made before you joined?

[00:02:26] David: Yeah, the, um, the CTO of Stitch Fix, he had come in to clean up a mess made by contractors, as often happens. They had used Django, which is like the Python version of Rails. And he, the CTO, he was more familiar with Rails. So the first two developers he hired, also familiar with Rails. There wasn't a lot to maintain with the Django app, so they were like, let's just start fresh, fresh with Rails.
Yeah, but it's funny because a lot of the code in that Rails app was, like, transliterated from Python. So you could, it would, it looked like the strangest Ruby code in the world because it was basically, there was no test. So they were like, let's just write the Ruby version of this Python just so we know it works. But obviously that didn't, didn't last forever, so.

[00:03:07] Jeremy: So, so what's an example of a, of a tell? Where you're looking at the code and you're like, oh, this is clearly, it came from Python.

[00:03:15] David: You'd see like, very, very explicit, right? Like Python, there's a lot of like single line things. Very like, this sounds like a dig, but it's very simple looking code. Like, like I don't know Python, but I was able to change this Django app. And I had to, I could look at it and you can figure out immediately how it works. Cause there's not much to it. There's nothing fancy. So, like, this, this Ruby code, there was nothing fancy. You'd be like, well, maybe they should have memoized that, or maybe they should have taken that into another class, or you could have done this with a hash or something like that. So there was, like, none of that. It was just, like, really basic, plain code like you would see in any beginning programming language kind of thing. Which is at least nice. You can understand it. But you probably wouldn't have written it that way at first in Ruby.

Thoughts on Go

[00:04:05] Jeremy: Yeah, that's, that's interesting because, uh, people sometimes talk about the Go programming language and how it looks, I don't know if simple is the right word, but it's something where you look at the code and even if you don't necessarily understand Go, it's relatively straightforward. Yeah.
I wonder what your thoughts are on that being a strength versus that being, like,

[00:04:25] David: Yeah, so at Stitch Fix at one point we had a pro, we were moving off of Heroku and we were going to, basically build a deployment platform using ECS on AWS. And so the deployment platform was a Rails app and we built a command line tool using Ruby. And it was fine, but it was a very complicated command line tool and it was very slow. And so one of the developers was like, I'm going to rewrite it in Go. I was like, ugh, you know, because I just was not a big fan. So he rewrote it in Go. It was a bazillion times faster. And then I was like, okay, I'm going to add, I'll add a feature to it. It was extremely easy. Like, it's just like what you said. I looked at it, like, I don't know anything about Go. I know what is happening here. I can copy and paste this and change things and make it work for what I want to do. And it did work. And it was, it was pretty easy. So there's that, I mean, aesthetically it's pretty ugly and it's, I, I, I can't really defend that as a real reason to not use it, but it is kind of gross. I did do Go, I did a small project in Go after Stitch Fix, and there's this vibe in Go about like, don't create abstractions. I don't know where I got that from, but every Go I look at, I'm like we should make an abstraction for this, but it's just not the vibe. They just don't like doing that. They like it all written out. And I see the value because you can look at the code and know what it does and you don't have to chase abstractions anywhere. But I felt like I was copying and pasting a lot of, a lot of things. Um, so I don't know. I mean, the, the team at Stitch Fix that did this like command line app in Go, they're the platform team. And so their job isn't to write like web apps all day, every day. They're kind of in and out of all kinds of things. They have to try to figure out something that they don't understand quickly to debug a problem.
And so I can see the value of something like Go if that's your job, right? You want to go in and see what the issue is. Figure it out and be done and you're not going to necessarily develop deep expertise in whatever that thing is that you're kind of jumping into. Day to day though, I don't know. I think it would make me kind of sad. (laughs)

[00:06:18] Jeremy: So, so when you say it would make you kind of sad, I mean, what, what about it? Is it, I mean, you mentioned that there's a lot of copy and pasting, so maybe there's code duplication, but are there specific things where you're like, oh, I just don't?

[00:06:31] David: Yeah, so I had done a lot of Java in my past life and it felt very much like that. Where like, like the Go library for making an HTTP call for like, I want to call some web service. It's got every feature you could ever want. Everything is tweakable. You can really, you can see why it's designed that way. To dial in some performance issue or solve some really esoteric thing. It's there. But the problem is if you just want to get a JSON, it's just like a huge production. And I felt like that's all I really want to do and it's just not making it very easy. And it just felt very, very cumbersome. I think that having to declare types also is a little bit of a weird mindset because, I mean, I like to make types in Ruby, I like to make classes, but I also like to just use hashes and stuff to figure it out. And then maybe I'll make a class if I figure it out, but Go, you can't. You have to have a class, you have to have a type, you have to think all that ahead of time, and it just, I'm not used to working that way, so it felt, I mean, I guess I could get used to it, but I just didn't warm up to that sort of style of working, so it just felt like I was just kind of fighting with the vibe of the language, kind of.

[00:07:40] Jeremy: Yeah, so it's more of the vibe or the feel where you're writing it and you're like this seems a little too... explicit.
I feel like I have to be too verbose. It just doesn't feel natural for me to write this.

[00:07:53] David: Right, it's not optimized for what in my mind is the obvious case. And maybe that's not the obvious case for the people that write Go programs. But for me, like, I just want to like get this endpoint and get the JSON back as a map. Not any easier than any other case, right? Whereas like in Ruby, right? And you can, I think if you include Net::HTTP, you can just type get. And it will just return whatever that is. Like, that's amazing. It's optimized for what I think is a very common use case. So it makes me feel really productive. It makes me feel pretty good. And if that doesn't work out long term, I can always use something more complicated. But I'm not required to dig into the Net::HTTP library just to do what in my mind is something very simple.

[00:08:37] Jeremy: Yeah, I think that's something I've noticed myself in working with Ruby. I mean, you have the standard library that's very... comprehensive and the API surface is such that, like you said there, when you're trying to do common tasks, a lot of times they have a call you make and it kind of does the thing you expected or hoped for.

[00:08:56] David: Yeah, yeah. It's kind of, I mean, it's that whole optimized for programmer happiness thing. Like it does. That is the vibe of Ruby and it seems like that is still the way things are. And, you know, I, I suppose if I had a different mindset, I mean, because I work with developers who did not like using Ruby or Rails. They loved using Go or Java. And I, I guess there's probably some psychological analysis we could do about their background and history and mindset that makes that make sense. But, to me, I don't know. It's, it's nice when it's pleasant. And Ruby seems pleasant. (laughs)

Choosing Technology

[00:09:27] Jeremy: As a...
software architect, or as a CTO, when, when you're choosing technology, what are some of the things you look at in terms of, you know?

[00:09:38] David: Yeah, I mean, I think, like, it's a weird criteria, but I think what is something that the team is capable of executing with? Because, like, most, right, most programming languages all kind of do the same thing. Like, you can kind of get most stuff done in most common popular programming languages. So, it's probably not... It's not true that if you pick the wrong language, you can't build the app. Like, that's probably not really the case. At least for like a web app or something. So it's more like, what is the team that's here to do it? What are they comfortable and capable of doing? I worked on a project with... It was a mix of like junior engineers who knew JavaScript, and then some senior engineers from Google. And for whatever reason someone had chosen a Rails app and none of them were comfortable or really yet competent with doing Ruby on Rails and they just all hated it and like it didn't work very well. Um, and so even though, yes, Rails is a good choice for doing stuff, for that team at that moment, not a good choice. Right. So I think you have to go in and like, what, what are we going to be able to execute on so that when the business wants us to do something, we just do it. And we don't complain and we don't say, oh, well we can't because this technology that we chose, blah, blah, blah. Like you don't ever want to say that if possible. So I think that's, that's kind of the, the top thing. I think second would be how widely supported is it? Like you don't want to be the cutting edge user that's finding all the bugs in something really. Like you want to use something that's stable. Postgres, MySQL, like those work, those are fine. The bugs have been sorted out for most common use cases. Some super fancy edge database, I don't know if I'd want to be doing, doing that, you know?

Choosing cloud services

[00:11:15] Jeremy: How do you feel about the cloud specific services and databases? Like are you comfortable saying like, oh, I'm going to use... Google Cloud, BigQuery. Yeah.

[00:11:27] David: That sort of thing. I think it would kind of fall under the same criteria that I was just, just saying like, so with AWS it's interesting 'cause when we moved from Heroku to AWS, EC2, RDS, their database thing, uh, S3, those have been around for years, probably those are gonna work, but they always introduce new things. Like we, we use RabbitMQ and AWS came out with some, I forget what it was, it was a queuing service similar to Rabbit. We were like, oh, maybe we should switch to that. But it was clear that they weren't really ready to support it. So, yeah, so we didn't, we didn't switch to that. So I, you gotta try to read the tea leaves of the provider to see are they committed to, to supporting this thing or is this there to get some enterprise client to move into the cloud. And then the idea is to move off of that transitional thing into what they do support. And it's hard to get a clear answer from them too. So it takes a little bit of research to figure out, are they going to support this or not? Because that's what you don't want, to move everything into some very proprietary cloud system and have them sunset it and say, oh yeah, now you've got to switch again. Uh, that kind of sucks. So, it's a little trickier.

[00:12:41] Jeremy: And what kind of questions or research do you do? Is it purely a function of this thing has existed for X number of years so I feel okay?

[00:12:52] David: I mean, it's kind of similar to looking at like some gem you're going to add to your project, right? So you'll, you'll look at how often does it change? Is it being updated? Uh, what is the documentation? Does it look like someone really cared about the documentation? Does the documentation look updated?
Are there issues with it that are being addressed or, or not? Um, so those are good signals. I think, talking to other practitioners too can be good. Like if you've got someone who's experienced, you can say, hey, do you know anybody? Back channeling through, like, everybody knows somebody that works at AWS, you can probably try to get something there. At Stitch Fix, we had an enterprise support contract, and so your account manager will sometimes give you good information if you ask. Again, it's a, they're not going to come out and say, don't use this product that we have, but they might communicate that in a subtle way. So you have to triangulate from all these sources to try to, to try to figure out what, what you want to do.

[00:13:50] Jeremy: Yeah, it kind of makes me wish that there was a, a site like, maybe not quite like Can I Use, right? Can I Use, you can see like, oh, can I use this in my browser? Is there, uh, like an AWS or a Google Cloud? Can I trust this? Can I trust this? Yeah. Is this, is this solid or not?

[00:14:04] David: Right, totally. It's like, there's that, that site where you, it has all the Apple products and it says whether or not you should buy it because one may or may not be coming out or they may be getting rid of it. Like, yeah, that would... For cloud services, that would be, that would be nice.

[00:14:16] Jeremy: Yeah, yeah. That's like the Mac Buyer's Guide. And then we, we need the, uh, the technology. Yeah. Maybe not buyers. Cloud Provider Buyer's Guide, yeah. I guess we are buyers.

[00:14:25] David: Yeah, yeah, totally, totally.

[00:14:27] Jeremy: It's interesting that you, you mentioned how you want to see that, okay, this thing is mature. I think it's going to stick around because I interviewed someone who worked on, I believe it was the CloudWatch team. Okay. Daniel Vassallo, yeah. So he left AWS, uh, after I think about 10 years, and then he wrote a book called, uh, The Good Parts of AWS. Oh!
And, if you read his book, most of the services he says to use are the ones that are, like, old. Yeah. He's, he's basically saying, like, S3, you know you're good. Yeah. Right? But then all these, if you look at the AWS webpage, they have who knows, I don't know how many hundreds of services. Yeah. He's, he's kind of like I worked there and I would not use, you know, all these new services. 'Cause I myself, I don't trust it yet.

[00:15:14] David: Right. And so, and they're working there? Yeah, they're working there. Yeah. No. One of the VPs at Stitch Fix had worked on Google Cloud and so when we were doing this transition from Heroku, he was like, we are not using Google Cloud. I was like, really? He's like AWS is far ahead of the game. Do not use Google Cloud. I was like, all right, I don't need any more info. You work there. You said don't. I'm gonna believe you.

[00:15:36] Jeremy: So what, what was his, did he have like a core point?

[00:15:39] David: Um, so he never really had anything bad to say about Google per se. Like I think he enjoyed his time there and I think he thought highly of who he worked with and what he worked on and that sort of thing. But his, where he was coming from was like AWS was so far ahead of Google on anything that we would use, he was like, there's, there's really no advantage to, to doing it. AWS is a known quantity, right? It's probably still the case. It's like, you know, you've heard the nobody ever got fired for using IBM or using Microsoft or whatever the thing is. Like, I think that's, that was kind of the vibe. And he was like, moving all of our infrastructure right before we're going to go public. This is a serious business. We should just use something that we know will work. And he was like, I know this will work. I'm not confident about Google, uh, for our use case. So we shouldn't, we shouldn't risk it. So I was like, okay, I trust you because I didn't know anything about any of that stuff at the time.
I knew Heroku and that was it. So, yeah.

[00:16:34] Jeremy: I don't know if it's good or bad, but like you said, AWS seems to be the default choice. Yeah. And I mean, there's people who use Azure. I assume it's mostly primarily Microsoft. Yeah. And then there's Google Cloud. It's not really clear why you would pick it, unless there was a specific service or something that only they had.

[00:16:55] David: Yeah, yeah. Or you're invested in Google, you know, you want to keep everything there. I mean, I don't know. I haven't really been at that level to make that kind of decision, and I would probably choose AWS for the reasons discussed, but, yeah.

Moving off Heroku

[00:17:10] Jeremy: And then, so at Stitch Fix, you said you moved off of Heroku.

[00:17:16] David: Yeah. Yeah, so we were heavy into Heroku. I think that we were told that at one point we had the biggest Heroku Postgres database on their platform. Not a good place to be, right? You never want to be the biggest customer person, usually. But the problem we were facing was essentially we were going to go public. And to do that, you're under all the scrutiny about many things, including the IT systems and the security around there. So, like, by default, a Postgres, a Heroku Postgres database is, like, on the internet. It's only secured by the password. All their services are on the internet. So, not, not ideal. They were developing their private cloud service at that time. And so that would have given us, in theory, on paper, it would have solved all of our problems. And we liked Heroku and we liked the developer experience. It was great. But... Heroku Private Spaces, it was still early. There's a lot of limitations that when they explained why those limitations, they were reasonable. And if we had started from scratch on Heroku Private Spaces, it probably would have worked great, but we hadn't. So we just couldn't make it work.
So we were like, okay, we're going to have to move to AWS so that everything can be basically off the internet. Like our public website needs to be on the internet and that's kind of it. So we need to, so that's basically was the, was the impetus for that. But it's too bad because I love Heroku. It was great. I mean, they were, they were a great partner. They were great. I think if Stitch Fix had started life a year later, Private Spaces... Now it's, it's, it's way different than it was then. Cause it's been, it's a mature product now, so we could have easily done that, but you know, the timing didn't work out, unfortunately.

[00:18:50] Jeremy: And that was a compliance thing to,

[00:18:53] David: Yeah. And compliance is weird cause they don't tell you what to do, but they give you some parameters that you need to meet. And so one of them is like how you control access. So, so going public, the compliance is around the financial data and ensuring that the financial data is accurate. So a lot of the systems at Stitch Fix were storing the financial data. We, you know, the warehouse management system was custom made. Uh, all the credit card processing was all done, like it was all in some databases that we had running in Heroku. And so those needed to be subject to stricter security than we could achieve with just a single password that we just had to remember to rotate when someone like left the team. So that was, you know, the kind of, the kind of impetus for, for all of that.

[00:19:35] Jeremy: When you were using Heroku, Salesforce would have already owned it then. Did you, did you get any sense that you weren't really sure about the future of the platform while you're on it or,

[00:19:45] David: At that time, no, it seemed like they were still innovating. So like, Heroku has a Redis product now. They didn't at the time, we wish that they did. They told us they're working on it, but it wasn't ready. We didn't like using the third parties. Kafka was not a thing.
We very much were interested in that. We would have totally used it if it was there. So they were still, like, doing bigger innovations then than it seems like they are now. I don't know. It's weird. Like they're still there. They still make money, I assume for Salesforce. So it doesn't feel like they're going away, but they're not innovating at the pace that they were kind of back in the day.

[00:20:20] Jeremy: It used to feel like when somebody's asking, I want to host a Rails app. Then you would say like, well, use Heroku because it's basically the easiest to get started. It's a known quantity and it's, it's expensive, but, it seemed for, for most people, it was worth it. And then now if I talk to people, it's like, not what people suggest anymore.

[00:20:40] David: Yeah, because there's, there's actual competitors. It's crazy to me that there was no competitors for years, and now there's like, Render and Fly.io seem to be the two popular alternatives. Um, I doubt they're any cheaper, honestly, but... You get a sense, right, that they're still innovating, still building those platforms, and they can build with, you know, all of the knowledge of what has come before them, and do things differently that might, that might help. So, I still use Heroku for personal things just because I know it, and I, you know, sometimes you don't feel like learning a new thing when you just want to get something done, but, yeah, I, I don't know if we were starting again, I don't know, maybe I'd look into those things. They, they seem like they're getting pretty mature and Heroku's resting on its laurels, still.

[00:21:26] Jeremy: I guess I never quite got the mindset, right? Where you, you have a platform that's doing really well and people really like it and you acquire it and then it just... It seems like you would want to keep it rolling, right? (laughs)

[00:21:38] David: Yeah, it's, it is wild, I mean, I guess... Why did you, what was Salesforce thinking they were going to get?
Uh, who knows, maybe the person at Salesforce that really wanted to purchase it isn't there. And so no one at Salesforce cares about it. I mean, there's all these weird company politics that like, who knows what's going on and you could speculate all day. What's interesting is like, there's definitely some people in the Ruby community who work there and still are working there. And that's like a little bit of a canary for me. I'm like, all right, well, if that person's still working there, that person seems like they're on the level and, and, and, and seems pretty good. They're still working there. It, it's gotta be still a cool place to be or still doing something, something good. But, yeah, I don't know. I would, I would love to know what was going on in all the Salesforce meetings about acquiring that, how to manage it. What are their plans for it? I would love to know that stuff.

[00:22:29] Jeremy: Maybe you had some experience with this at Stitch Fix, but I've heard with Heroku some of their support staff at least in the past they would, to some extent, actually help you troubleshoot, like, what's going on with your app. Like, if your app is, like, using a whole bunch of memory, and you're out of memory, um, they would actually kind of look into that, for you, which is interesting, because it's like, that's almost like a services thing than it is just a platform.

[00:22:50] David: Yeah. I mean, they, their support, you would get, you would get escalated to like an engineer sometimes, like who worked on that stuff and they would help figure out what the problem was. Like you got the sense that everybody there really wanted the platform to be good and that they were all sort of motivated to make sure that everybody, you know, did well and used the platform. And they also were good at, like a thing that trips everybody up about Heroku is that your app restarts every day. And if you don't know anything about anything, you might think that is stupid.
Why, why would I want that? That's annoying. And I definitely went through that and I complained to them a lot. And I'm like, if you only could not restart. And they very patiently and politely explained to me why it needed to do that, they weren't going to remove that, and how to think about my app given that reality, right? Which is great because like, what company does that, right? From the engineers that are working on it, like, no, nobody does that. So, yeah, no, I haven't escalated anything to support at Heroku in quite some time, so I don't know if it's still like that. I hope it is, but I'm not really, not really sure.

Building a platform team

[00:23:55] Jeremy: Yeah, that, uh, that reminds me a little bit of, I think it's Rackspace? There's, there's, like, another hosting provider that was pretty popular before, and they... used to be famous for that type of support, where like your, your app's having issues and somebody's actually, uh, SSHing into your box and trying to figure out like, okay, what's going on? Which if, if that's happening, then I, I can totally see where the, the price is justified. But if the support is kind of like dropping off to where it's just, they don't do that kind of thing, then yeah, I can see why it's not so much of a, yeah,

[00:24:27] David: We used to think of Heroku as like they were the platform team before we had our own platform team and they, they acted like it, which was great.

[00:24:35] Jeremy: Yeah, I don't have, um, experience with Render, but I, I, I did talk to someone from there, and it does seem like they're, they're trying to fill that role, um, so, yeah, hopefully, they and, and other companies, I guess like Vercel and things like that, um, they're, they're all trying to fill that space,

[00:24:55] David: Yeah, cause, cause building our own internal platform, I mean it was the right thing to do, but it's, it's a, you can't just, you have to have a team on it, it's complicated, getting all the stuff in AWS to work the way you want it to work, to have it be kind of like Heroku, like it's not trivial. If I'm a one person company, I don't want to be messing around with that particularly. I want to just have it, you know, push it up and have it go and I'm willing to pay for that. So it seems logical that there would be competitors in that space. I'm glad there are. Hopefully that'll light a fire under, under everybody.

[00:25:26] Jeremy: So in your case, it sounds like you moved to having your own platform team and stuff like that, uh, partly because of the compliance thing where you're like, we need our, we need to be isolated from the internet. We're going to go to AWS. If you didn't have that requirement, do you still think like that would have been the time to, to have your own platform team and manage that all yourself?

[00:25:46] David: I don't know. We, we were thinking an issue that we were running into when we got bigger, um, was that, I mean, Heroku, it, it's obviously not as flexible as AWS, but it is still very flexible. And so we had a lot of internal documentation about this is how you use Heroku to do X, Y, and Z. This is how you set up a Stitch Fix app for Heroku. Like there was just the way that we wanted it to be used to sort of just make it all manageable. And so we were considering having a team spun up to sort of add some tooling around that to sort of make that a little bit easier for everybody.
So I think there may have been something around there. I don't know if it would have been called a platform team. Maybe we call, we thought about calling it like developer happiness or because you got developer experience or something. We, we probably would have had something there, but I do wonder how easy it would have been to fund that team with developers if we hadn't had these sort of business constraints around there. Yeah, um, I don't know. You get to a certain size, you need some kind of manageability and consistency no matter what you're using underneath. So you've got to have, somebody has to own it to make sure that it's, that it's happening.

[00:26:50] Jeremy: So even at your, your architect level, you still think it would have been a challenge to, to come to the executive team and go like, I need funding to build this team.

[00:27:00] David: You know, certainly it's a challenge because everybody, you know, right? Nobody wants to put developers in anything, right? They're, they're a commodity and I mean, that is kind of the job of like, you know, the staff engineer or the architect at a company is you don't have, you don't have the power to put anybody on anything you, you have the power to schedule a meeting with a VP or the CTO and they will listen to you. And that's basically, you've got to use that power to convince them of what you want done. And they're all reasonable people, but they're balancing 20 other priorities. So it would, I would have had to, it would have been a harder case to make that, hey, I want to take three engineers and have them write tooling to make Heroku easier to use. What? Heroku is not easy to use. Why aren't, you know, so you really, I would, it would be a little bit more of a stretch to walk them through it. I think a case could be made, but, definitely would take some more, more convincing than, than what was needed in our case.

[00:27:53] Jeremy: Yeah.
And I guess if you're able to contrast that with, you were saying, Oh, I need three people to help me make Heroku easier. Your actual platform team on AWS, I imagine was much larger, right? [00:28:03] David: Initially it was, there was, it was three people did the initial move over. And so by the time we went public, we'd been on this new system for, I don't know, six to nine months. I can't remember exactly. And so at that time the platform team was four or five people, and I, I mean, so percentage wise, right, the engineering team was maybe almost 200, 150, 200. So percentage wise, maybe a little small, I don't know. But it kind of gets back to the power of like Rails and the one-person framework. Like everything we did was very much the same. And so the Rails app that managed the deployment was very simple. The, the command line app, even the Go one with all of its verbosity was very, very simple. So it was pretty easy for that small team to manage. But, yeah, so it was sort of like for redundancy, we probably needed more than three or four people because you know, somebody goes out sick or takes a vacation. That's a significant part of the team. But in terms of like just managing the complexity and building it and maintaining it, like it worked pretty well with, you know, four or five people. Where Rails fits in vs other technology [00:29:09] Jeremy: So during the keynote today, they were talking about how companies like GitHub and Shopify and so on, they're, they're using Rails and they're, they're successful and they're fairly large. But I think the thing that was sort of unsaid was the fact that these companies, while they use Rails, they use a lot of other technology as well. And, and, and kind of increasing amounts as well. So, I wonder from your perspective, either from your experience at Stitch Fix or maybe going forward, what is the role that, that Ruby and Rails plays? 
Like, where does it make sense for that to be used versus like, Okay, we need to go and build something in Java or, you know, or Go, that sort of thing? [00:29:51] David: right. I mean, I think for like your standard database backed web app, it's obviously great. especially if your sort of mindset bought into server side rendering, it's going to be great at that. so like internal tools, like the customer service dashboard or... You know, something for like somebody who works at a company to use. Like, it's really great because you can go super fast. You're not going to be under a lot of performance constraints. So you kind of don't even have to think about it. Don't even have to solve it. You can, but you don't have to, where it wouldn't work, I guess, you know, if you have really strict performance. Requirements, you know, like a, a Go version of some API server is going to use like percentages of what, of what Rails would use. If that's meaningful, if what you're spending on memory or compute is, is meaningful, then, then yeah. That, that becomes worthy of consideration. I guess if you're, you know, if you're making a mobile app, you probably need to make a mobile app and use those platforms. I mean, I guess you can wrap a Rails app sort of, but you're still making, you still need to make a mobile app, that does something. yeah. And then, you know, interestingly, the data science part of Stitch Fix was not part of the engineering team. They were kind of a separate org. I think Ruby and Rails was probably the only thing they didn't use over there. Like all the ML stuff, everything is either Java or Scala or Python. They use all that stuff. And so, yeah, if you want to do AI and ML with Ruby, you, it's, it's hard cause there's just not a lot there. You really probably should use Python. It'll make your life easier. so yeah, those would be some of the considerations, I guess. 
[00:31:31] Jeremy: Yeah, so I guess in the case of, ML, Python, certainly, just because of the, the ecosystem, for maybe making a command line application, maybe Go, um, Go or Rust, perhaps, [00:31:44] David: Right. Cause you just get a single binary. Like the problem, I mean, I wrote this book on Ruby command line apps and the biggest problem is like, how do I get the Ruby VM to be anywhere so that it can then run my like awesome scripts? Like that's kind of a huge pain. (laughs) So [00:31:59] Jeremy: and then you said, like, if it's Very performance sensitive, which I am kind of curious in, in your experience with the companies you've worked at, when you're taking on a project like that, do you know up front where you're like, Oh, the CPU and memory usage is going to be a problem, or is it's like you build it and you're like, Oh, this isn't working. So now I know. [00:32:18] David: yeah, I mean, I, I don't have a ton of great experience there at Stitch Fix. The biggest expense the company had was the inventory. So like the, the cost of AWS was just de minimis compared to all that. So nobody ever came and said, Hey, you've got to like really save costs on, on that stuff. Cause it just didn't really matter. at the, the mental health startup I was at, it was too early. But again, the labor costs were just far, far exceeded the amount of money I was spending on, on, um, you know, compute and infrastructure and stuff like that. So, Not knowing anything, I would probably just sort of wait and see if it's a problem. But I suppose you always take into account, like, what am I actually building? And like, what does this business have to scale to, to make it worthwhile? And therefore you can kind of do a little bit of planning ahead there. But, I dunno, I think it would kind of have to depend. [00:33:07] Jeremy: There's a sort of, I guess you could call it a meme, where people say like, Oh, it's, it's not, it's not Rails that's slow, it's the, the database that's slow. 
And, uh, I wonder, is that, is that accurate in your experience, or, [00:33:20] David: I mean, most of the stuff that we had that was slow was the database, because like, it's really easy to write a crappy query in Rails if you're not, if you're not careful, and then it's really easy to design a database that doesn't have any indexes if you're not careful. Like, you, you kind of need to know that. But of course, those are easy to fix too, because you just add the index, especially if it's before the database gets too big, where adding indexes is problematic. But, I think those are just easy performance mistakes to make. Uh, especially with Rails because you're not, I mean, a lot of the Rails developers at Stitch Fix did not know SQL at all. I mean, they had to learn it eventually, but they didn't know it at all. So they're not even knowing that what they're writing could possibly be problematic. It's just, you're writing it the Rails way and it just kind of works. And at a small scale, it does. And it doesn't matter until, until one day it does. [00:34:06] Jeremy: And then in, in the context of, let's say, using ActiveRecord and instantiating the objects, or, uh, the time it takes to render templates, those kinds of things, at least in your experience, that wasn't so much of an issue. [00:34:20] David: No, and it was always, I mean, whenever we looked at why something was slow, it was always the database and like, you know, you're iterating over some active records and then, and then, you know, you're going into there and you're just following this object graph. A lot of the, a lot of the software at Stitch Fix was like internal stuff and it was visualizing complicated data out of the database. And so if you didn't think about it, you would just start dereferencing and following those relationships and you have this just massive view and like the HTML is fine. It's just that to render this div, you're digging into some active record super deep. 
and so, you know, that was usually the, the, the problems that we would see and they're usually easy enough to fix by making an index or. Sometimes you do some caching or something like that. and that solved most of the, most of the issues [00:35:09] Jeremy: The different ways people learn [00:35:09] Jeremy: so you're also the author of the book, Sustainable Web Development with Ruby on Rails. And when you talk to people about like how they learn things, a lot of them are going on YouTube, they're going on, uh, you know, looking for blogs and things like that. And so as an author, what do you think the role is of, of books now? Yeah, [00:35:29] David: I have thought about this a lot, because I, when I first got started, I'm pretty old, so books were all you had, really. Um, so they seem very normal and natural to me, but... does someone want to sit down and read a 400 page technical book? I don't know. so Dave Thomas who runs Pragmatic Bookshelf, he was on a podcast and was asked the same question and basically his answer, which is my answer, is like a long form book is where you can really lay out your thinking, really clarify what you mean, really take the time to develop sometimes nuanced, examples or nuanced takes on something that are Pretty hard to do in a short form video or in a blog post. Because the expectation is, you know, someone sends you an hour long YouTube video, you're probably not going to watch that. Two minute YouTube video is sure, but you can't, you can't get into so much, kind of nuanced detail. And so I thought that was, was right. And that was kind of my motivation for writing. I've got some thoughts. They're too detailed. It's, it's too much set up for a blog post. There's too much of a nuanced element to like, really get across. So I need to like, write more. And that means that someone's going to have to read more to kind of get to it. But hopefully it'll be, it'll be valuable. 
one of the sessions that we're doing later today is Ruby content creators, where it's going to be me and Noel Rappin and Dave Thomas representing the old school dudes that write books and probably a bunch of other people that do, you know, podcasts videos. It'd be interesting to see, I really want to know how do people learn stuff? Because if no one reads books to learn things, then there's not a lot of point in doing it. But if there is value, then, you know. It should be good and should be accessible to people. So, that's why I do it. But I definitely recognize maybe I'm too old and, uh, I'm not hip with the kids or, or whatever, whatever the case is. I don't know. [00:37:20] Jeremy: it's tricky because, I think it depends on where you are in the process of learning that thing. Because, let's say, you know a fair amount about the technology already. And you look at a book, in a lot of cases it's, it's sort of like taking you from nothing to something. And so you're like, well, maybe half of this isn't relevant to me, but then if I don't read it, then I'm probably missing a lot still. And so you're in this weird in be in between zone. Another thing is that a lot of times when people are trying to learn something, they have a specific problem. And, um, I guess with, with books, it's, you kind of don't know for sure if the thing you're looking for is going to be in the book. [00:38:13] David: I mean, so my, so my book, I would not say as a beginner, it's not a book to learn how to do Rails. It's like you already kind of know Rails and you want to like learn some comprehensive practices. That's what my book is for. And so sometimes people will ask me, I don't know Rails, should I get your book? And I'm like, no, you should not. but then you have the opposite thing where like the agile web development with Rails is like the beginner version. And some people are like, Oh, it's being updated for Rails 7. Should I get it? 
I'm like, probably not, because how to go from zero to Rails hasn't changed a lot in years. There's not that much that's going to be new. But, how do you know that, right? Hopefully the table of contents tells you. I mean, the first book I wrote with Pragmatic, they basically were like, the table of contents is the only thing the reader, potential reader is going to have to have any idea what's in the book. So, you need to write the table of contents with that in mind, which may not be how you'd write the subsections of a book, but since you know that it's going to serve these dual purposes of organizing the book, but also being promotional material that people can read, you've got to keep that in mind, because otherwise, how does anybody, like you said, how does anybody know what's, what's going to be in there? And they're not cheap, I mean, these books are 50 bucks sometimes, and that's a lot of money for people in the U.S. For people outside the U.S., that's a ton of money. So you want to make sure that they know what they're getting and don't feel ripped off. [00:39:33] Jeremy: Yeah, I think the other challenge is, at least what I've heard, is that... When people see a video course, for whatever reason, they, they set, like, a higher value to it. They go, like, oh, this video course is 200 dollars and it's, like, seems like a lot of money, but for some people it's, like, okay, I can do that. But then if you say, like, oh, this, this book I've been researching for five years, uh, I want to sell it for a hundred bucks, people are going to be, like, no. No way. [00:40:00] David: Yeah. Right. A hundred bucks for a book. There's no way. That's a, that's a lot. Yeah. I mean, producing video, I've thought about doing video content, but it seems so labor intensive. Um, and it's kind of like, it's sort of like a performance. Like I was mentioning before we started that I used to play in bands and like, there's a lot to go into making an even mediocre performance. 
And so I feel like, you know, video content is the same way. So I get that it like, it does cost more to produce, but, are you getting more information out of it? I, that, I don't know, like maybe not, but who knows? I mean, people learn things in different ways. So, [00:40:35] Jeremy: It's just like this perception thing, I think. And, uh, I'm not sure why that is. Um, [00:40:40] David: Yeah, maybe it's newer, right? Maybe books feel older so they're easier to make and video seems newer. I mean, I don't know. I would love to talk to engineers who are like... young out of college, a few years into their career to see what their perception of this stuff is. Cause I mean, there was no, I mean, like I said, I read books cause that's all there was. There was no, no videos. You, you go to a conference and you read a book and that was, that was all you had. so I get it. It seems a whole video. It's fancier. It's newer. yeah, I don't know. I would love to hear a wide variety of takes on it to see what's actually the, the future, you know? [00:41:15] Jeremy: sure, yeah. I mean, I think it probably can't just be one or the other, right? Like, I think there are... Benefits of each way. Like, if you have the book, you can read it at your own pace without having to, like, scroll through the video, and you can easily copy and paste the, the code segments, [00:41:35] David: Search it. Go back and forth. [00:41:36] Jeremy: yeah, search it. So, I think there's a place for it, but yeah, I think it would be very interesting, like you said, to, to see, like, how are people learning, [00:41:45] David: Right. Right. Yeah. Well, it's the same with blogs and podcasts. Like I, a lot of podcasters I think used to be bloggers and they realized that like they can get out what they need by doing a podcast. And it's way easier because it's more conversational. You don't have to do a bunch of research. You don't have to do a bunch of editing. 
As long as you're semi coherent, you can just have a conversation with somebody and sort of get at some sort of thing that you want to talk about or have an opinion about. And. So you, you, you see a lot more podcasts and a lot less blogs out there because of that. So it's, that's kind of like the creators I think are kind of driving that a little bit. yeah. So I don't know. [00:42:22] Jeremy: Yeah, I mean, I can, I can say for myself, the thing about podcasts is that it's something that I can listen to while I'm doing something else. And so you sort of passively can hopefully pick something up out of that conversation, but... Like, I think it's maybe not so good at the details, right? Like, if you're talking code, you can talk about it over voice, but can you really visualize it? Yeah, yeah, yeah. I think if you sit down and you try to implement something somebody talked about, you're gonna be like, I don't know what's happening. [00:42:51] David: Yeah. [00:42:52] Jeremy: So, uh, so, so I think there's like these, these different roles I think almost for so like maybe you know the podcast is for you to Maybe get some ideas or get some familiarity with a thing and then when you're ready to go deeper You can go look at a blog post or read a book I think video kind of straddles those two where sometimes video is good if you want to just see, the general concept of a thing, and have somebody explain it to you, maybe do some visuals. that's really good. but then it can also be kind of detailed, where, especially like the people who stream their process, right, you can see them, Oh, let's, let's build this thing together. You can ask me questions, you can see how I think. I think that can be really powerful. at the same time, like you said, it can be hard to say, like, you know, I look at some of the streams and it's like, oh, this is a three hour stream and like, well, I mean, I'm interested. 
I'm interested, but yeah, it's hard enough for me to sit through a, uh, a three hour movie, [00:43:52] David: Well, then that, and that gets into like, I mean, we're, you know, we're at a conference and they, they're doing something a little, like, there are conference talks at this conference, but there's also like. sort of less defined activities that aren't a conference talk. And I think that could be a reaction to some of this too. It's like I could watch a conference talk on, on video. How different is that going to be than being there in person? maybe it's not that different. Maybe, maybe I don't need to like travel across the country to go. Do something that I could see on video. So there's gotta be something here that, that, that meets that need that I can't meet any other way. So it's all these different, like, I would like to think that's how it is, right? All this media all is a part to play and it's all going to kind of continue and thrive and it's not going to be like, Oh, remember books? Like maybe, but hopefully not. Hopefully it's like, like what you're saying. Like it's all kind of serving different purposes that all kind of work together. Yeah. [00:44:43] Jeremy: I hope that's the case, because, um, I don't want to have to scroll through too many videos. [00:44:48] David: Yeah. The video's not for me. Large Language Models [00:44:50] Jeremy: I, I like, I actually do find it helpful, like, like I said, for the high level thing, or just to see someone's thought process, but it's like, if you want to know a thing, and you have a short amount of time, maybe not the best, um, of course, now you have all the large language model stuff where you like, you feed the video in like, Hey, tell, tell, tell me, uh, what this video is about and give me the code snippets and all that stuff. I don't know how well it works, but it seems [00:45:14] David: It's gotta get better. 
Cause you go to a support site and they're like, here's how to fix your problem, and it's a video. And I'm like, can you just tell me? But I'd never thought about asking the AI to just look at the video and tell me. So yeah, it's not bad. [00:45:25] Jeremy: I think, that's probably where we're going. So it's, uh, it's a little weird to think about, but, [00:45:29] David: yeah, yeah. I was just updating, uh, you know, like I said, I try to keep the book updated when new versions of Rails come out, so I'm getting ready to update it for Rails 7.1, and Amazon's Kindle Direct Publishing is their sort of backend for where you, you know, publish like a Kindle book and stuff, and so they added a new question, was AI used in the production of this thing or not? And if you answer yes, they want you to say how much. And I don't know what they're gonna do with that exactly, but I thought it was pretty interesting, cause I would be very disappointed to pay $50 for a book that the AI wrote, right? So it's good that they're asking that. Yeah. [00:46:02] Jeremy: I think the problem Amazon is facing is where people wholesale have the AI write the book, and the person either doesn't review it at all, or maybe looks at a little, a little bit. And, I mean, the, the large language model stuff is very impressive, but if you have it generate a technical book for you, it's not going to be good. [00:46:22] David: yeah. And I guess, cause cause like Amazon, I mean, think about like Amazon scale, like they're not looking at the book at all. Like I, I can go click a button and have my book available and no person's going to look at it. They might scan it or something, maybe looking for bad words. I don't know, but there's no curation process there. So I could, yeah. I could see where they could have that, that kind of problem. 
And like you as the, as the buyer, you don't necessarily, if you want a book on something really esoteric, there are a lot of topics I wish there was a book on that there isn't. And if someone generated one and put it on Amazon, I could see a lot of people buying it, not realizing what they're getting and feeling ripped off when it was not good. [00:47:00] Jeremy: Yeah, I mean, I, I don't know, if it's an issue with the, the technical stuff. It probably is. But I, I know they've definitely had problems where, fiction, they have people just generating hundreds, thousands of books, submitting them all, just flooding it. [00:47:13] David: Seeing what happens. [00:47:14] Jeremy: And, um, I think that's probably... That's probably the main reason why they ask you, cause they want you to say like, uh, yeah, you said it wasn't. And so now we can remove your book. [00:47:24] David: right. Right. Yeah. Yeah. [00:47:26] Jeremy: I mean, it's, it's not quite the same, but it's similar to, I don't know what Stack Overflow's policy is now, but, when the large language model stuff started getting big, they had a lot of people answering the questions that were just pasting the question into the model [00:47:41] David: Which because they got it from [00:47:42] Jeremy: and then [00:47:43] David: The model got it from Stack Overflow. [00:47:45] Jeremy: and then pasting the answer into Stack Overflow and the person is not checking it. Right. So it's like, could be right, could not be right. Um, cause, cause to me, it's like, if, if you generate it, if you generate the answer and the answer is right, and you checked it, I'm okay with that. [00:48:00] David: Yeah. Yeah. [00:48:01] Jeremy: but if you're just like, I, I need some karma, so I'm gonna, I'm gonna answer these questions with, with this bot, I mean, then maybe [00:48:08] David: I could have done that. You're not adding anything. Yeah, yeah. [00:48:11] Jeremy: it's gonna be a weird, weird world, I think. 
[00:48:12] David: Yeah, no kidding. No kidding. [00:48:15] Jeremy: That's a, a good place to end it on, but is there anything else you want to mention, [00:48:19] David: No, I think we covered it all. Just, yeah, you could find me online. I'm Davetron5000 on ruby.social Mastodon. I occasionally post on Twitter, but not that much anymore. So Mastodon's a place to go. [00:48:31] Jeremy: David, thank you so much [00:48:32] David: All right. Well, thanks for having me.

Nov 15, 2023 • 44min

ChaelCodes on The Joy of Programming Games and Streaming (RubyConf 2023)

Episode Notes Rachael Wright-Munn (ChaelCodes) talks about her love of programming games (games with programming elements in them, not how to make games!), starting her streaming career with regex crosswords, and how streaming games and open source every week led her to a voice acting role in one of her favorite programming games. Recorded at RubyConf 2023 in San Diego. mastodon twitch Personal website Programming Games mentioned: Regex Crossword SHENZHEN I/O EXAPUNKS 7 Billion Humans One Dreamer Code Rom@ntic Bitburner Transcript You can help edit this transcript on GitHub. Jeremy: I'm here at RubyConf San Diego with Rachael Wright-Munn, and she goes by ChaelCodes online. Thanks for joining me today. Rachael: Hi, everyone. Hi, Jeremy. Really excited to be here. Jeremy: So probably the first thing I'll ask about is on your web page, and I've noticed you have streams, you say you have an interest in not just regular games, but programming games, so. Rachael: Oh my gosh, I'm so glad you asked about this. Okay, so I absolutely love programming games. When I first started streaming, I did it with Regex Crossword. What I really like about it is the fact that you have this joyful environment where you can solve puzzles and work with programming, and it's really focused on the experience and the joy. Are you familiar with Zach Barth of Zachtronics? Jeremy: Yeah. So, I've tried, what was it? There's TIS-100. And then there's the, what was the other one? He had one that's... Rachael: Opus Magnum? Shenzhen I/O? Jeremy: Yeah, Shenzhen I/O. Rachael: Oh, my gosh. Shenzhen I/O is fantastic. I absolutely love that. The whole conceit of it, which is basically that you're this electronics engineer who's just moved to Shenzhen because you can't find a job in the States. And you're trying to like build different solutions for these like little puzzles and everything. 
It was literally one of the, I think that was the first programming game that really took off just because of the visuals and everything. And it's one of my absolute favorites. I really like what he says about it in terms of like testing environments and the developer experience. Cause it's built based on assembly, right? He's made a couple of modifications. Like he's talked about it before where it's like The memory allocation is different than what it would actually look like in assembly and the way the registers are handled I believe is different, I wouldn't think of assembly as something that's like fun to write, but somehow in this game it is. How far did you get in it? Jeremy: Uh, so I didn't get too far. So, because like, I really like the vibe and sort of the environment and the whole concept, right, of you being like, oh, you've been shipped off to China because that's the only place that these types of jobs are, and you're working on these problems with bad documentation and stuff like that. And I like the whole concept, but then the actual writing of the software, I was like, I don't know. Rachael: And it's so hard, one of the interesting things about that game is you have components that you drop on the board and you have to connect them together and wire them, but then each component only has a specific number of lines. So like half the time I would be like, oh, I have this solution, but I don't have enough lines to actually run it or I can't fit enough components, then you have to go in and refactor it and everything. And it's just such a, I don't know, it's so much fun for me. I managed to get through all of the bonus levels and actually finish it. Some of them are just real, interesting from both a story perspective and interesting from a puzzle perspective. I don't wanna spoil it too much. You end up outside Shenzhen, I'll just say that. Jeremy: OK. That's some good world building there. Rachael: Yeah. 
Jeremy: Because in your professional life, you do software development work. So I wonder, what is it about being in a game format where you're like, I'm in it. I can do it more. And this time, I'm not even being paid. I'm just doing it for fun. Rachael: I think for me, software development in general is a very joyful experience. I love it. It's a very human thing. If you think about it like math, language, all these things are human concepts and we built upon that in order to build software in our programs and then on top of that, like the entire purpose of everything that we're building is for humans, right? Like they don't have rats running programs, you know what I mean? So when I think about human expression and when I think about programming, these two concepts are really closely linked for me and I do see it as joyful, But there are a lot of things that don't spark joy in our development processes, right? Like lengthy test suites, or this exhausting back and forth, or sometimes the designs, and I just, I don't know how to describe it, but sometimes you're dealing with ugly code, sometimes you're dealing with code smells, and in your professional developer life, sometimes you have to put up with that in order to ship features. But when you're working in a programming game, It's just about the experience. And also there is a correct solution, not necessarily a correct solution, but like there's at least one correct solution. You know for a fact that there's, that it's a solvable problem. And for me, that's really fun. But also the environment and the story and the world building is fun as well, right? So one of my favorite ones, we mentioned Shenzhen, but Zachtronics also has Exapunks. And that one's really fun because you have been infected by a disease. And like a rogue AI is the only one that can provide you with the medicine you need to prevent it. 
And what this disease is doing is it is converting parts of your body into like mechanical components, like wires and everything. So what you have to do as an engineer is you have to write the code to keep your body running. Like at one point, you were literally programming your heart to beat. I don't have problems like that in my day job. In my day job, it's like, hey, can we like charge our customers more? Like, can we put some banners on these pages? Like, I'm not hacking anybody's hearts to keep them alive. Jeremy: The stakes are a little more interesting. Yeah, yeah. Rachael: Yeah, and in general, I'm a gamer. So like having the opportunity to mix two of my passions is really fun. Jeremy: That's awesome. Yeah, because that makes sense where you were saying that there's a lot of things in professional work where it's you do it because you have to do it. Whereas if it's in the context of a game, they can go like, OK, we can take the fun problem solving part. We can bring in the stories. And you don't have to worry about how we're going to wrangle up issue tickets. Rachael: Yeah, there are no Jira tickets in programming games. Jeremy: Yeah, yeah. Rachael: I love what you said there about the problem solving part of it, because I do think that that's an itch that a lot of us as engineers have. It's like we see a problem, and we want to solve it, and we want to play with it, and we want to try and find a way to fix it. And programming games are like this really small, compact way of getting that dopamine hit. Jeremy: For sure. Yeah, it's like. Sometimes when you're doing software for work or for an actual purpose, there may be a feeling where you want to optimize something or make it look really nice or perform really well. And sometimes it just doesn't matter, right? It's just like we need to just put it out and it's good enough. Whereas if it's in the context of a game, you can really focus on like, I want to make this thing look pretty. 
I want to feel good about this thing I'm making. Rachael: You can make it look good, or you can make it look ugly. You don't have to maintain it. After it runs, it's done. Right, right, right. There's this one game. It's 7 Billion Humans. And it's built by the creators of World of Goo. And it's like this drag and drop programming solution. And what you do is you program each worker. And they go solve a puzzle. And they pick up blocks and whatever. But they have these shredders, right? And the thing is, you need to give to the shredder if you have like a, they have these like little data blocks that you're handing them. If you're not holding a data block and you give to the shredder, the worker gives themself to the shredder. Now that's not ideal inside a typical corporate workplace, right? Like we don't want employees shredding themselves. We don't want our workers terminating early or like anything like that. But inside the context of a game, in order to get the most optimal solution, They have like a lines of code versus fastest execution and sometimes in order to win the end like Lines of code. You just kind of have to shred all your workers at the, When I'm on stream and I do that when I'm always like, okay everybody close your eyes That's pretty good it's Yeah, I mean cuz like in the context of the game. Jeremy: I think I've seen where they're like little They're like little gray people with big eyes Yes, yes, yes, yes. Yeah, so it's like, sorry, people. It's for the good of the company, right? Rachael: It's for my optimal lines of code solution. I always draw like a, I always write a humane solution before I shred them. Jeremy: Oh, OK. So it's, you know, I could save you all, but I don't have to. Rachael: I could save you all, but I would really like the trophy for it. There's like a dot that's going to show up in the elevator bay if I shred you. Jeremy: It's always good to know what's important. 
But so at the start, you mentioned there was a regular expression crossword or something like that. Is that how you got started with all this? Rachael: My first programming game was Regex Crossword. I absolutely loved it. That's how I learned Regex. I love it a lot. I will say one thing that's been kind of interesting is I learned Regex through Regex Crossword, which means there's actually these really interesting gaps in my knowledge. What was it? At Link Tech Retreat, they had like a little Regex puzzle, and it was like backslash t and then a plus, right? And I was like, I have no idea what that character is, right? Like, I know all the rest of them. But the problem is that backslash t is tab, and Regex Crossword is a browser game. So you can't have a solution that has tab in it and have that be easy for users. Also, the idea of like greedy evaluation versus lazy evaluation doesn't apply, because you're trying to find a word that satisfies the regex. So it's not necessarily about what the regex is going to take. So it's been interesting finding those gaps, but I really think that some of the value there was around how regex operates and the rules underlying it and building enough experience that I can now use the documentation to fill in any gaps. Jeremy: So the crossword, is it where you know the word and you have to write a regular expression to match it? Or what's the? Rachael: They give you regex. And there's a couple of different versions, right? The first one, you have two regex patterns. There's one going up and down, and there's one going left and right. And you have to fill the crossword block with something that matches both regular expressions. Then we get into hexagonal ones. Yeah, where you have angles and a hexagon, and you end up with like three regular expressions. 
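As an editorial aside, the square-grid rules Rachael describes can be sketched in a few lines of Python. This is only an illustration, not how the Regex Crossword site works internally: the `solves` helper and the 2x2 patterns below are made up for this example, and it assumes each row and column must match its pattern in full.

```python
import re

def solves(grid, row_patterns, col_patterns):
    """Return True if every row and every column fully matches its regex."""
    rows = ["".join(row) for row in grid]
    cols = ["".join(col) for col in zip(*grid)]  # read the grid top to bottom
    rows_ok = all(re.fullmatch(p, s) for p, s in zip(row_patterns, rows))
    cols_ok = all(re.fullmatch(p, s) for p, s in zip(col_patterns, cols))
    return rows_ok and cols_ok

# Illustrative 2x2 puzzle: one pattern per row (left to right)
# and one per column (up and down).
row_patterns = [r"HE|LL|O+", r"[PLEASE]+"]
col_patterns = [r"[^SPEAK]+", r"EP|IP|EF"]

print(solves(["HE", "LP"], row_patterns, col_patterns))  # grid that fits all four
print(solves(["HE", "LL"], row_patterns, col_patterns))  # second column fails

# The pattern from the conference puzzle: backslash t is a tab,
# and the plus means one or more of them.
print(bool(re.fullmatch(r"\t+", "\t\t")))
```

Because the solver fills in letters to satisfy fixed patterns, greedy versus lazy matching never comes up here, which fits the gap Rachael mentions.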
What's kind of interesting about that one is I actually think that the hexagonal regex crosswords are a little bit easier because you have more rules and constraints, which are more hints about what goes in that box. Jeremy: Interesting. OK, so it's the opposite of what I was thinking. They give you the regex rules, and then you put in a word that's going to satisfy all the regex you see. Rachael: Exactly. When I originally did it, they didn't have any sort of hints or anything like that. It was just empty. Now it's like you click a box, and then they've got a suggestion of five possible letters that could go in there. And it just breaks my heart. I liked the old version that was plainer, and didn't have any hints, and was harder. But I acknowledge that the new version is prettier, and probably easier, and more friendly. But I feel like part of the joy that comes from games, that comes from puzzles, It comes from the challenge, and I miss the challenge. Jeremy: I guess someone, it would be interesting to see people who are new to it, if they had tried the old way, if they would have bounced off of it. Rachael: I think you're probably right. I just want them to give me a toggle somewhere. Jeremy: Yeah, oh, so they don't even let you turn off the hints, they're just like, this is how it is. Rachael: Yep. Jeremy: Okay. Well, we know all about feature flags. Rachael: And how difficult they are to maintain in perpetuity. Jeremy: Yeah, but no, that sounds really cool because I think some things, like you can look up a lot of stuff, right? You can look up things about regex or look up how to use them. But I think without the repetition and without the forcing yourself to actually go through the motion, without that it's really hard to like learn and pick it up. Rachael: I completely agree with you. I think the repetition, the practice, and learning the paradigm and patterns is huge. 
Because like even though I didn't know what backslash t plus was, I knew that backslash t was going to be some sort of character type. Jeremy: Yeah, it kind of reminds me of, there was, I'm not sure if you've heard of Vim Adventures, but... Rachael: I did! I went through the free levels. I had a streamerversary and my chat had completed a challenge where I had to go learn Vim. So I played a little bit of Vim Adventures. Jeremy: So I guess it didn't sell you. Rachael: Nope, I got Vim extensions turned on. Jeremy: Oh, you did? Rachael: Yeah, I have the Vim extension turned on in VS Code. So I play with a little bit of sprinkling of Vim in my everyday. Jeremy: It's kind of funny, because I am not a Vim user in the sense that I don't use it as my daily editor or anything like that. But I do the same thing with the extensions in the browser. I like being able to navigate with the keyboard and all that stuff. Rachael: Oh, that is interesting. That's interesting. You have a point, like memorizing all of the different patterns when it comes to keyboard navigation and things like that is very similar to navigating in Vim. I often describe writing code in Vim as kind of like solving a puzzle in order to write your code. So I think that goes back to that puzzle-solving feeling we were talking about before. Jeremy: Yeah, I personally can't remember, but whenever I watch somebody who's really good at using Vim, it is interesting to see them go, oh, yes, I will go to the fifth word, and I will swap out just this part. And it's all just a few keystrokes, yeah. Rachael: Very impressive. Can be done just as well with backspace and, like, little arrow keys and everything. But there is something fun about it and it is... faster-ish. 
Jeremy: Yeah, I think it's like I guess it depends on the person, but for some people it's like they, they can think and do things at the speed that they type, you know, and so for them, I guess the the flow of, I'm doing stuff super fast using all these shortcuts is probably helpful to them. Rachael: I was talking to someone last night who was saying that they don't even think about it in Vim anymore. They just do it. I'm not there yet. (laughs) Jeremy: Yeah, I'll probably never be there (laughs) But yeah, it is something to see when you've got someone who's really good at it. Rachael: Definitely. I'm kind of glad that my chat encouraged and pressured me to work with Vim. One of the really cool things is when I'm working on stuff, I'll sometimes be like, oh, I want to do this. Is there a command in Vim for that? And then I'll get multiple suggestions or what people think, and ideas for how I can handle things better. Someone recently told me that if you want to delete to the end of a line, you can use capital D. And this whole time I was doing lowercase d dollar sign. Jeremy: Oh, right, right, right. Yeah. Yeah, it's like there's so many things there that, I mean, we should probably talk about your experiences streaming. But that seems like a really great benefit that you can be working through a problem or just doing anything, really. And then there's people who they're watching, and they're like, I know how to do it better. And they'll actually tell you, yeah. Rachael: I think that being open to that is one of the things that's most important as a streamer. A lot of people get into this cycle where they're very defensive and where they feel like they have to be the expert. But one of the things that I love about my chat is the fact that they do come to me with these suggestions. And then I can be open to them, and I can learn from them. And what I can do is I can take those learnings from one person and pass it on to the other people in chat. 
I can become a conduit for all of us to learn. Jeremy: So when you first decided to start streaming, I guess what inspired you to give it a shot? Like, what were you thinking? Rachael: That's a great question. It's also kind of a painful question. So the company that I was working for, I found out that there were some pay issues with regards to me being a senior, promotion track, things like that. And it wasn't the first time this had happened, right? Like, I often find that I'm swapping careers every two to three years because of some miserable experience at the company. Like you start and the first year is great. It's fantastic. It's awesome. But at the end of it, you're starting to see the skeletons, and two to three years later you're burnt out. And what I found was that every two to three years I was losing everything, right? Like all of my library of examples, the code that I would reference, like that's in their private repo. When it came to my professional network, the coworkers that liked and respected me, we had always communicated through the workplace Slack. So it's really hard to get people to move from the workplace Slack to like Instagram or Twitter or one of those other places if that's not a place where you're already used to talking to them. And then the other thing is your accomplishments get wiped out, right? Like when you start at the next company and you start talking about promotion and things like that, the work that you did at previous companies doesn't matter. They want you to be a team lead at that company. They want you to lead a massive project at that company, and that takes time. It takes opportunities. And eventually, I decided that I wanted to exist outside my company. Like I wanted to have a reputation that went beyond that, and that's what originally inspired me to stream. And it's pretty hard to jump from, like, oh, 
I got really frustrated and burnt out at my company, to, I've got it, I'm gonna do some Regex Crossword on stream. But honestly, that's what it was, right? I just wanted to slowly build this reputation in this community outside of my company, and it's been enormously valuable in terms of my confidence, in terms of my opportunities. I've been able to pick up some really interesting jobs and I'm able to leverage some of those experiences in really clear professional ways, and it's really driven me to contribute more to open source. I mentioned that I have a lot of people like giving me advice and suggestions and feedback. That's enormously helpful when you're going out there and you're trying to like get started in open source and you're trying to build that confidence and you're trying to build that reputation. I often talk about having a library of examples, right? Like your best code that you reference again and again and again. If I'm streaming on Twitch, everything that I write has to be open source because I'm literally showing it on video, right? So it's really encouraged me to build that out. And now when I'm talking to my coworkers and companies, I can be like, oh, we need to talk about single table inheritance. I did that in Hunter's Keepers. Why don't we go pull that up and we'll take a look at it. Or are we building a Docker image? I did that in Hunter's Keepers and Conf Buddies. Why don't we look at these, compare them, and see if we can get something working here, right? Like I have all of these examples, and I even have examples from other apps as well. Like I added Twitch clips to Forem. So when I want to look at how to build a Liquid tag, because Jekyll uses Liquid tags as well, I can hop to those examples and hop between them, and I'm never going to lose access to them. 
Jeremy: Yeah, I mean, that's a really good point where I think a lot of people, they do their work at their job and it's never going to be seen by anyone, and you can sort of talk about it, but you can't actually show anybody what you did. And I think to that point too, there's some knowledge that is very domain specific or specific to that company. And so when you're actually doing open source work, it's something that anybody can pick up and use and has utility way beyond just your company. And the whole point of creating this record, that makes a lot of sense too, because if I wanna know if you know how to code, I can just see like, wow, she streams every Thursday. She clearly knows what she's doing, and, you know, you also have these open source contributions as well. So it's sort of like it's not this question of, if I interview you, I'm just going off of your word and I believe what you're saying. But rather the proof is all out there. Rachael: Oh, definitely. If I were to think about my goals and aspirations for the future, I've been doing this for four years, still continuing, but I think I would like to get to the point where I don't really have to interview. Where an interview is more of a conversation between me and somebody who already knows they want to hire me. Jeremy: Have you already started seeing a difference? Like you've been streaming for about four years, I think. Rachael: I had a really interesting job for about eight months doing developer relations with New Relic. That was a really interesting experience. And I think it really pushed the boundaries of what I understood myself to be capable of because I was able to spend 40 hours a week really focused on content creation, on blogging, on podcasting, on YouTube videos and things like that. Obviously there was a lot of event organization and things like that as well. 
But a lot of the stuff that came out of that time is some of my best work. Like, I'm trying to remember exactly what I did while I was at New Relic, but I saw a clear decrease afterwards. But yeah, I think that was probably close to the tipping point. I don't for sure know if I'm there yet, right? Like you never know if you're at the point where you don't have to interview anymore until you don't have to interview. But the last two jobs, no, I haven't had to interview. Jeremy: So, doing it full-time, how did you feel about that versus having a more traditional lead or software developer role? Rachael: It was definitely a trade-off. So I spent a lot less time coding and a lot more time with content, and I think a little bit of it was me trying to balance the needs and desires of my audience against the needs and desires of my company. For me, and this is probably going to hurt my chances of getting one of those jobs where I don't have to interview in the future, but my community comes first, right? They're the people who are gonna stick with me when I swap between jobs. But that was definitely something that I constantly had to think about, like, how do I balance what my company wants from me with the responsibility that I have to my community? But also like my first talk, "Your First Open Source Contribution," which was at RubyConf Denver, like, that was written while I was at New Relic. Like, would I have had the time to work on a talk in addition to the streaming schedule and everything else? For a period of time, I was hosting Ruby Galaxy, which was a virtual meetup. It didn't last very long, and we have deprecated it. I deprecated it before I left the company because I wanted to give it, like, a good, clean ending versus having it, like, linger on and be a responsibility for other people. But I don't think I would have done those if I was trying to balance it with my day job. So, I think that that was an incredible experience. 
That said, I'm very glad it's over. I'm very glad that the only people I'm beholden to are my community now. Jeremy: So, is it the sheer amount that you had to do that was the main issue? Or is it more that tension between, like you said, serving your audience and your community versus serving your employer? Rachael: Oh, a lot of it was tension. A lot of it was hectic, event management in general. I think if you're like planning and organizing events, that's a very challenging thing to do. And it's something that kind of like goes down to the deadline, right? And it's something where everybody's trying to like scramble and pull things together and keep things organized. And that was something that I don't think I really enjoyed. I like to have everything like nice and planned out and organized and all that sort of stuff, and I don't think that that's something that happens very often in event management, at least not from my experience. Jeremy: So these were like in-person events, or what types of events? Rachael: I actually skipped out before the in-person events. They would have been in-person events. We had FutureStack at New Relic, which is basically like this big gathering where you talk about things you can do with New Relic and that sort of stuff. We all put together talks for that. We put together an entire, like, oh gosh, I'm trying to remember the tool that we use, but it was something similar to gather round where you like interact with people. And there's just a lot that goes into that from marketing to event planning to coordinating with everyone. I'm grateful for my time at New Relic and I made some incredible friends and some incredible connections and I did a lot, but yeah, I'm very glad I'm not in DevRel anymore. If you ask any DevRel, they'll tell you it's hectic, they'll tell you it's chaotic, and they'll tell you it's a lot of work. Jeremy: Yeah. 
So it sounds like maybe the streaming and podcasting or recording videos, talks, that part you enjoy, but it's the, I'm responsible for planning this event for all these people, you know. That's the part where you're like, OK, maybe not for me. Rachael: Yeah, kind of. I describe myself as like a content creator because I like to just like dabble and make things, right? Like I like to think about like, what is the best possible way to craft this tweet or this post, or to sit there and be like, okay, how can I structure this blog post to really communicate what I want people to understand? When it comes to my streams, what I actually do is I start with the hero's journey as a concept. So every single stream, we start with an issue in the normal world, right? And then what we do is we get drawn into the chaos realm as we're like debugging and trying to build things and going back and forth, and there's code flying everywhere, and the tests are red and then they're green and then they're red and then they're green. And then finally at the end we come back to the normal world as we create this PR and submit it, and either merge it or wait for maintainer feedback. And for me that story arc is really key. I'm a little bit of an artist. I like the artistry of it. I like the artistry of the code, and I like the artistry of creating the content. I think I've had guests on the show before, and sometimes it's hard to explain to them, like, no, no, no, this is a code show. We can write code, and that's great, but that's not what it's about. It's not just about the end product. It's about bringing people along with us on the journey. And sometimes it's been three hours, and I'm not doing a great job of bringing people along on the journey. So, you know, I'm tooting my own horn a little bit here, but that is important to me. 
Jeremy: So when you're working through a problem, when you're doing it on stream versus you're doing it by yourself, what are the key differences in how you approach the problem or how you work through it? Rachael: I think it's largely the same. It's like almost exactly the same. What I always do is, when I'm on stream, I pause, I describe the problem, I build a test for it, and then I start working on trying to fix what's wrong. I'm a huge fan of test-driven development. The way I see it, you want that bug to be reproducible, and a test gives you the easiest way to reproduce it. For me, it's about it being easy as much as it is about it being the right way or not. But yeah, I would say that I approach it largely in the same way. I was in the content creator open space a little bit earlier, and I had to give them a bit of a confession. There is one small difference when I'm doing something on stream versus when I'm doing something alone. I have a lot of incredible senior staff, smart, incredible people in my chat. Sometimes I'll describe the problem in vivid detail, and then I'll take my time writing the test, and by the time I'm done writing the test, somebody will have figured out what the problem is and talked back to me about it. I very rarely do that. It's more often when it's an ops or an infrastructure issue or something like that. A great example of this is like the other day I was having an issue, I mentioned the Vim extensions. If I do command P on the code section, the Vim extension was capturing that, and so it wasn't opening the file. So one of my chatters was like, oh, you know, you can fix that if you Google it. I was like, oh, I don't know. I mean, I could Google it, but it will take so long and distract from the stream. Literally less than 15 minutes later a chatter had replied with like, here's exactly what to add to your VS Code extension, and I knew that was gonna happen. So that's my little secret confession. 
That's the only difference when I'm debugging things on stream is sometimes I'll let chat do it for me. Jeremy: Yeah, that's a superpower right there. Rachael: It is, and I think that happens because I am open to feedback and I want people to engage with me and I support that and encourage that in my community. I think a lot of people sometimes get defensive when it comes to code, right? Like when it comes to the languages or the frameworks that we use, right? There's a little bit of insecurity because you dive so deep and you gain so much knowledge that you're kind of scared that there might be something that's just as good because it means you might not have made the right decision. And I think that affects us when it comes to code reviews. I think it affects us when we're like writing in public. And I think, yeah, and I think it affects a lot of people when they're streaming, where they're like, if I'm not the smartest person in the room, and why am I the one with a camera and a microphone? But I try to set that aside and be like, we're all learning here. Jeremy: And when people give that feedback, and it's good feedback, I think it's really helpful when people are really respectful about it and kind about it. Have you had any issues like having to moderate that or make sure it stays positive in the context of the stream? Rachael: I have had moderation issues before, right? Like, I'm a woman on the internet, I'm going to have moderation issues. But for me, when it comes to feedback and suggestions, I try to be generous with my interpretation and my understanding of what they're going with. Like people pop in and they'll say things like, Ruby is dead, Rails is dead. And I have commands for that to like remind them, no, actually Twitch is a Rails app. So like, no, it's definitely not dead. You just used it to send a message. 
But like, I try to be understanding of where people are coming from and to meet them where they are, even if they're not being the most respectful. And I think what I've actually noticed is that when I do that, their tone tends to change. So I have two honorary trolls in my chat, Kego and John Sugar, and they show up and they troll me pretty frequently. But I think that that openness, that honesty, like that conversation back and forth it tends to defuse any sort of aggressive tension or anything. Jeremy: Yeah, and it's probably partly a function of how you respond, and then maybe the vibe of your stream in general probably brings people that are. Rachael: No, I definitely agree. I think so. Jeremy: Yeah. Rachael: It's the energy, you get a lot of the energy that you put out. Jeremy: And you've been doing this for about four years, and I'm having trouble picturing what it's even like, you know, you've never done a stream and you decide I'm gonna turn on the camera and I'm gonna code live and, you know, like, what was kind of going through your mind? How did you prepare? And like, what did, like, what was that like? Rachael: Thank you so much. That's a great question. So, actually, I started with Regex Crossword because it was structured, right? Like, I didn't necessarily know what I wanted to do and what I wanted to work on, but with Regex Crossword, you have a problem and you're solving it. It felt very structured and like a very controlled environment, and that gave me the confidence to get comfortable with, like, I'm here, I have a moderator, right? Like we're talking back and forth, I'm interacting with chatters, and that allowed me to kind of build up some skills. I'm actually a big fan of Hacktoberfest. I know a lot of people don't like it. I know a lot of people are like, oh, there are all these terrible spam PRs that show up during Hacktoberfest and open source repositories. But I'm a really big fan because I've always used it to push my boundaries, right? 
Like every single year, I've tried to take a new approach on it. So the first year that I did it, I decided that what I wanted to do to push my boundaries was to actually work on an application. So this one was called Hunter's Keepers. It was an app for managing characters in Monster of the Week, and it was a Rails app because that's what I do professionally and that's what I like to work on. So I started just building that for Hacktoberfest and people loved it. It got a ton of engagement, way more than Regex Crossword, and those open source streams continue to do better than the programming games. But I love the programming games so much that I don't wanna lose them. But that's where it kind of started, right? It was me sitting there and saying like, oh, I wanna work on these Rails apps. The Hacktoberfest after that one, I was like, OK, I worked on my own app in the open, and I've been doing that for basically a year. I want to work on somebody else's app. So I pushed myself to contribute to four different open source repositories. One of the ones I pushed myself to work on was Forem. They did not have Twitch clips as embeds. They had YouTube videos and everything else. And I looked into how to do it, and I found out how Liquid tags work, and I had a ton of other examples. I feel like extensions like that are really great contributions to open source because it's an easy way, with a ton of examples, that you can provide value to the project, and it's the sort of thing where, like, if you need it, other people probably need it as well. So I went and I worked on that, and I made some Twitch clips. And that was like one of my first external open source project contributions. And that kind of snowballed, right? Because I now knew how to make a Liquid tag. So when I started working on my Jekyll site, and I found out that they had Liquid tags that were wrapped in gems, I used that as an opportunity to learn how to build a gem. 
And like how to create a gem that's wrapped around a Liquid tag. And that exists now and is a thing that I've done. And so it's all of these little changes and moments that have stacked on top of each other, right? Like it's me going in and saying, OK, today I'd like to customize my alerts. Or like, today I'd like to buy a better microphone and set it up and do these changes. It's not something that changed all at once, right? It's just this small putting in the time day by day, improving. I say like the content gears are always grinding. You always need something new to do, right? And that's basically how my stream has gone for the last four years, is I'm just always looking for something new to do. We haven't talked about this yet, but I'm a voice actress in the programming video game One Dreamer. And I actually collaborated with the creator of another one, Compressor, who like reached out to me about that Steam key. But the reason that I was able to talk to these people and I was able to reach out to them is rooted in Regex Crossword, right? Because I finished Regex Crossword, and Thursday night was like my programming game stream. And I loved them, so I kept doing them. And I kept picking up new games to play, and I kept exploring new things. So at the end of it, I ended up in this place where I had this like backlog in knowledge and history around programming games. So when Compressor was developed, I think the creator, Charlie Bridge, is like a VP at Arm or something. And okay, I should back up a little bit. Compressor is this game where you build CPUs with steam. So it's like steampunk, like, electrical engineering components. Ah, it's so much fun. And like, the characters are all cool, because it's like you're talking to Nikola Tesla, and like Charles Babbage, and Ada Lovelace, and all this sort of stuff. It's just super fun. But the reason he reached out to me was because of that reputation, that backlog, that feedback. 
Like, when you think about how you became a developer, right, it's day by day, right? when you develop your experience. There's a moment where you look back and you're like, I just have all of these tools in my toolkit. I have all of these experiences. I've done all these things, and they just stack to become something meaningful. And that's kind of how it's gone with my stream, is just every single day I was trying to push, do something new. Well, not every day. Sometimes I have a lazy day, but like, but like I am continuously trying to find new ground to tread. Jeremy: Yeah, I mean that's really awesome thinking about how it went from streaming you solving these regex crosswords to all the way to ending up in one of these games that you play. Yeah, that's pretty pretty cool. Rachael: By the way, that is my absolute favorite game. So the whole reason that I'm in the game is because I played the demo on stream. Jeremy: Oh, nice. Rachael: And I loved it. Like I immediately was like, I'm going to go join the creators discord. This is going to be my game of the year. I can't wait to like make a video on this game. What's really cool about this one is that it uses programming as a mechanic and the story is the real driver. It's got this emotional impact and story. The colors are gorgeous and the way you interact with the world, like it is a genuine puzzle game where the puzzles are small, little, simple programming puzzles. And not like I walk up to this and like I solve a puzzle and the door opens. No, it's like you're interacting with different components in the world and wiring them together in order to get the code working. The whole premise is that there's an indie game developer who's gone through this really traumatic experience with his game, and now he's got the broken game, and he's trying to fix it in time for a really important game demo. I think it's like, it's like Vig something. Video game indie gaming. 
But what happened is I started following the creator, and I was super interested in them. And then he actually reached out to me about like the Steam Workshop, and then he was looking for people to voice act, and I was like, me, please, yes. So yeah, that's how I got involved with it. Jeremy: Yeah, that's awesome. It's like everything came full circle, I guess, where you started. Rachael: Yeah, no, absolutely. It's amazing. Jeremy: And so what was that experience like, the voice acting bit? I'm assuming you didn't have professional experience with that before. Rachael: No, no, no, no. I had to do a lot of research into like how to voice act. My original ones were tossed out. OK, so there's one line in it. This is so embarrassing. I can't believe I'm saying this on a podcast. There's one line that's like, it's a beautiful day to code. Because I'm an NPC, right? So like you can keep interacting with me and one of the like cycling ones is like, it's a beautiful day to code. Well, I tried to deliver it wistfully. Like I was staring out a window and I was like, it's a beautiful day to code. And every single person who heard it told me that it sounded like somewhat sensual, sexy. And I was dying because I had just sent this to this like indie game developer that I appreciated, and he replies back and he's like, I'm not sure if there was an audio issue with some of these, but could you like rerecord some of these? So I was very inexperienced. I did a lot of practicing, a lot of vocal exercises, but I think that it turned out well. Jeremy: That's awesome. So you kind of just kept trying and sending samples, or did they have anybody like try and coach you? Rachael: No, I just kept sending samples. I did watch some YouTube videos from like real voice actors to try and like figure out what the vocal exercises were. One of the things that I did at first was I sent him like one audio, like the best one in my opinion. 
And he replied back being like, no, just record this like 10, 20 times. Send it to me and I'll chop the one I want. Jeremy: So anytime you did that, the one they picked, was it ever the one you thought was the best one? Rachael: Oh gosh, I don't think I actually like, wow, I don't think I've gone back over the recordings to figure out which one I thought was the best one. Or like checked which one he picked out of the ones that I recorded. Oh, that's interesting. I'm going to have to do that after this. Jeremy: You're going to listen to all the, it's a beautiful day to code. Rachael: The final version is like a nice, neutral like, it's a beautiful day to code. One of the really cool things about that, though, is my character actually triggers the end of game scene, which is really fun. You know how you get a little hint that's like, oh, this is where the end of the game is? My character gets to do that. Jeremy: That's a big responsibility. Rachael: It is. I was so excited when I found out. Jeremy: That's awesome. Cool. Well, I think that's probably a good place to wrap it up on. But is there anything else you want to mention, or any games you want to recommend? Rachael: Oh, I think I mentioned all of them. I think if you look at Code Romantic, EXAPUNKS, Bitburner, which is an idle JavaScript game that can be played in the browser where you write the custom files and build it and you're going off and hacking servers and stuff like that. It's a little light on story. One Dreamer, yeah. I think if you look at those four to five games, you will find one you like. Oh, it's 7 Billion Humans. Jeremy: Oh, right, yeah. Rachael: I haven't written the blog post yet, but that's my five programming video games that you should try if you've never done one before. 7 Billion Humans is on mobile, so if you've got a long flight back from RubyConf, it might be a great choice. Jeremy: Oh, there you go. Rachael: Yeah. 
Other than that, I can be found at chael.codes, chael.codes/links for the socials, chael.codes/about for more information about me. And yeah, thank you so much for having me. This has been so much fun. Jeremy: Awesome. Well, Rachael, thank you so much for taking the time. Rachael: Thank you.

Sep 20, 2023 • 1h 1min

Daniel Zingaro and Leo Porter on learning to program with LLMs

Dr. Daniel Zingaro and Dr. Leo Porter are co-authors of the book Learn AI-Assisted Python Programming. Leo will teach an introductory computer science course this quarter at UCSD using this book. We discuss how tools like GitHub Copilot let people new to programming focus on breaking down problems instead of language syntax. Dr. Zingaro is an Associate Professor of Computer Science at University of Toronto Mississauga and Dr. Porter is an Associate Professor at University of California San Diego. This episode was originally posted on Software Engineering Radio. Topics covered: Making programming more accessible Teaching problem decomposition instead of language syntax The importance of reading and testing untrusted generated code The rise of throwaway or one-off code Concerns about relying on commercial tools Rethinking how to assess students Related Links Learn AI-Assisted Python Programming Leo Porter Daniel Zingaro GitHub Copilot Transcript You can help edit this transcript on GitHub. Note the timestamps and audio for this transcript will not completely match. Intro [00:00:00] Jeremy: Today I'm talking to Dr. Leo Porter. He's an associate teaching professor of computer science at the University of California San Diego, and he co-founded the computing education research laboratory there. I'm also joined by Dr. Daniel Zingaro, who is an associate teaching professor of computer science at the University of Toronto. And he's also the author of the book Learn to Code by Solving Problems and the book Algorithmic Thinking. They are co-authors of the book Learn AI-Assisted Python Programming. Leo and Dan, welcome to Software Engineering Radio. [00:00:37] Leo: Thank you for having us, Jeremy. I really appreciate your podcast, so thanks. Great to be here. [00:00:41] Dan: Thanks Jeremy. Writing a book for Leo's CS1 class [00:00:43] Jeremy: The first thing we could start with is, is why this book? And, and why now? 
How did you decide on like, okay, this is the thing we need to do now. [00:00:51] Leo: So, uh, this is Dan. Uh, so Dan, um, like really early when LLMs first kind of were coming out and being seen on the scene for programming, uh, he started playing with them, uh, for programming projects. And I think Dan really quickly realized that they'd have this, a big impact on how we teach programming. So he reached out to me, uh, and said, I really need to give 'em a try. And, uh, after I played with them for a little while, I had the exact same realization that this is gonna change, uh, how we teach programming, uh, in a pretty dramatic way. So having realized that, having realized that we had to change our, uh, introductory CS1 courses, we knew we needed to do that, but in order to teach that class, we'd have to have a book that we could assign our students that would go along with the class. And so we knew we had to change the class, but we also knew we had to have a book for it. And given the, the timeline to write books, we started on the book first. Um, and so that's how it got started. LLMs for Syntax, Humans for breaking down problems [00:01:45] Dan: I guess we figured out that our course had to change first, before we knew exactly, um, how it had to change. One thing we, um, learned early on was that the kinds of assignments we give in our introductory courses, they're just solved by, by these tools like ChatGPT and Copilot. So, uh, we knew something had to change, and then it is just a matter of figuring out what. And so we spent, um, quite a bit of time with these tools and we started to realize that what's gonna change is the skills that our students need to learn, uh, to be effective using these tools. So like, before these tools, we would spend a lot of time teaching syntax. Um, and students struggle quite a bit with learning syntax, which I mean, it's very, it's, it's very frustrating, right? Cuz you can't even do anything until you get the syntax right? 
And you're getting all these errors like missing colons and, you know, mismatched braces and stuff like that. Uh, so it's actually good that the LLMs are doing the syntax for the students. But you know, just because that skill's, uh, not needed as much, uh, doesn't mean that there aren't still skills for students to learn. So instead of syntax, other things become more important. Uh, so for example, uh, Leo and I realized that reading code is gonna be extremely important, even more so than before. I think if, if that, if that's even possible. Uh, and that's because sometimes you're gonna get back code that just doesn't work. And so we realized that students are gonna need to be able to read the response that they get to see if the code looks reasonable, or not, right? And then if the code, uh, is unreasonable, then they need to read more code, uh, and look at other solutions, right, that they get from the, uh, LLM. Uh, there are other, uh, things they can do as well, like messing around with the prompt and so on. But they're gonna need to be able to read code, uh, throughout the process. And then, so we just kind of kept on using these tools and documenting the skills that students are gonna need. And we just kinda realized that all the skills students are gonna need are skills we would want to teach anyway. So like, uh, one more example is testing, right? So, students may now not have, uh, an understanding of every last detail of, you know, the Python language like they would before. And so then that makes testing even more important, right, than it was. They need to verify that the code they're getting is correct. And so they have to be very good at writing test cases. And, you know, similar for debugging, we need our students to have strong debugging skills, again, even potentially stronger than before, right? Because if the code isn't working, they need to first determine what the code is doing to be able to fix it. 
And then I guess one more I'll mention is problem decomposition. And this is a big one. I think this is gonna come up a couple times probably in our talk today, but LLMs struggle when you give them tasks that are too large and students need to know how to break problems down into small components so that, that, LLM can solve each one and, you know, have a good chance of getting it right. [00:04:56] Leo: Yeah, I, I think, um, kind of to, to piggyback off of that, you, you may be hearing these skills and saying, oh, these are absolutely essential skills. Every software engineer should know, uh, these are being taught right now. Right? Um, and the answer is not really, like these aren't core topics in a lot of introductory CS classes because so much time is spent on syntax. And so fairly early on when we kind of realized these skills would be so essential, Uh, we got really excited because these are skills we want to teach in our classes, and the LLMs are now giving us the ability to do that more. [00:05:27] Dan: Mm-hmm. [00:05:28] Jeremy: I think that's interesting about the syntax comment because you were saying how reading is gonna be more important than ever because you have LLM generating the code. Um, and you need to understand that code that's being generated and understand that it does what it, uh, you think it does. And so I wonder if when you say you spend less time on syntax, is it because you feel like they're gonna generate this code and they're sort of organically gonna pick up syntax that way versus having to focus on it at the start? I'm just trying to picture what you see changing there. [00:06:05] Dan: Yeah, Jeremy. So, uh, I, I was, I guess speaking specifically about syntax errors, which don't generally happen when you're using LLMs, and I also agree with you, you need to know what the code is doing, but, um, you can do that without worrying about each specific piece of syntax. 
Like, um, you're gonna need to know what the keywords do for sure, but missing, you know, brackets and colons and, uh, oh, there needs to be like a blank line here, indentation, uh, a lot of this kind of thing is done, for the most part, correctly by the LLMs. So yeah, I agree with you. You need to be able to identify the structures. So in our, in our book actually, Leo and I have, um, a couple of chapters on reading code and, I don't think we ever break down a line of code into its individual tokens. We do talk about the main structures, like ifs and loops and functions and all that. But compared to other books, I, I think or other, uh, other ways of teaching where you would focus on the micro level, we try to focus on the line level now, cuz we want our students to be able to grasp what each line is doing, I guess more than each token. [00:07:27] Leo: Yeah, maybe to, to add to that a bit, it's almost, uh, if you think about the advent of block-based languages, it was to make sure that the, essentially the, the author can't make syntax mistakes, right? That's the whole purpose of kind of block-based languages. And they're, they're huge for introductory programming, especially in like K through 12. In a sense, LLMs do this because they'd never give you back wrong syntax, or they almost, almost never give you back wrong syntax. And so it takes away that kind of cognitive burden of making sure you handle the, the token level, as, uh, Dan was saying. LLM generated code needs test cases to catch logical errors [00:08:00] Jeremy: I, I'm curious, so you said the syntax is correct, but what are the, the typical mistakes you see coming back from these LLMs? Is it a, a logical mistake or is it ever something that actually doesn't compile? I'm, I'm kind of curious what your experience has been. [00:08:19] Leo: I think the, uh, more common errors that we've been seeing are logical. So it misinterprets the prompt that you're giving it. 
It essentially tries to solve a problem that's different than what you're trying to solve. It may have bugs in it, so it is in fact trying to solve the right problem, but it, it's off by one, or is maybe replicating some mistake that it found in, in the large code base. And so for most mistakes, you need to write test cases and run them. That mistake is then gonna show up when the test cases catch it, and then you'll have to try to fix it. If the students can read the code, uh, if we train them well to read the code, often you'll look at the response, and if the response is just not even trying to solve the right problem, you can usually pick that up pretty quick. Uh, and I think, I think the students will learn to do that and then they can just say, okay, this is clearly not the right answer. And, and use the different tools in, say, VS Code to find another answer, and then pick one that's right or change their prompt to get a response that's right. Go through that whole flow. But then some point or other it will give an answer that looks right. And then I think all of us as software engineers know that even if the code looks right, it may not be. And so then they have to actually write the test cases, get some level of confidence that it's actually working right before they'll know. And so sometimes, sometimes you know really quick that it's just clearly wrong, solving the wrong problem. And sometimes it looks right, but it actually has some bugs that need to be fixed. [00:09:49] Dan: I guess one thing that struck me is how much a change in the prompt can, can matter. Uh, Leo, you know, um, we've, we've seen this over and over again where we'll write a prompt. It seems fine to us. And then we'll realize, oh, there are actually two different ways of interpreting this. And, uh, the ambiguity of, of English strikes again, right? And so it's just amazing to me how clarifying the prompts, how many times that fixes the code. Not always. 
We definitely have examples where that's not the case, but, um, more, more often than not, in my experience, changing the prompt, uh, appropriately has a bigger than, than, um, anticipated effect on the, on the code. It's amazing. [00:10:36] Leo: And for thinking of the prompt, uh, in terms of like doc strings for functions, uh, adding the test cases certainly helps. Um, it is surprising sometimes that you can add the test cases to the prompt and it'll still give you back code that does not actually pass that test case, because VS Code and Copilot don't actually run the code that comes back from the LLM. Uh, but I do find the test cases do tend to help with the quality of response you get back. [00:11:01] Jeremy: As a part of your prompt, you're asking it to implement some functionality, and you're also asking it to write these tests for that same functionality? [00:11:11] Leo: Oh no, sorry. I, I, it's more the, um, doc test kind of format. So it, it, um, you're writing, let's say you, you've written your function signature and then you have the description of the function in a doc string. And then towards the end of the doc string, I'm articulating the test cases that I intend to use. Um, and articulating the test cases that I intend to use helps it come back with a better response. Um, I haven't found it to be great at writing test cases. I haven't spent a ton of time with this, but the time that I have spent, it tends to want to do almost like a brute force search of all possible inputs, uh, as opposed to doing, okay, well here's a couple common cases. Here are the edge cases. Now I can feel fairly good about it. It doesn't seem to have that, um, intuition yet. [00:11:55] Leo: For the most part, we're writing the test cases ourselves, and we're gonna be teaching the students how to write the test cases themselves. [00:12:01] Dan: Yeah, Yeah. 
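The doc test format Leo describes can be sketched as follows. This is only an illustration under stated assumptions: the function name `count_vowels` and its test cases are hypothetical and not taken from the book, and the function body is hand-written here to stand in for whatever code an LLM might return for the signature and docstring.

```python
import doctest


def count_vowels(text):
    """Return the number of vowels (a, e, i, o, u) in text, ignoring case.

    The examples below are the test cases written by the programmer.
    They double as part of the prompt and as runnable checks.

    >>> count_vowels("Python")
    1
    >>> count_vowels("AEIOU")
    5
    >>> count_vowels("")
    0
    >>> count_vowels("rhythm")
    0
    """
    # Hypothetical body standing in for generated code.
    return sum(1 for ch in text.lower() if ch in "aeiou")


if __name__ == "__main__":
    # Runs every docstring example in this file and reports any mismatch.
    print(doctest.testmod())
```

Because the editor does not run the generated code, the docstring examples only guide the response. Running `doctest.testmod()` afterwards is what actually checks the code against them.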
So Leo and I have actually made a conscious decision to have students write test cases from scratch. Even though you could play around with the LLM and have it, you know, try to generate test cases, whether it's flawed or not, we still want students to do this from scratch. We think that writing test cases is a skill we want our students to have. [00:12:23] Jeremy: Sometimes what these models will generate, like you were saying, has logical errors. And hopefully if you're writing the test cases, you've put some thought into 'em, and your test cases are actually checking the correct behavior. So then you have the LLM generate the implementation. It's running against tests where you know what the correct answer should be. And so if it generates something that's incorrect, you've, you've kind of caught it. You're not totally relying on it. Telling you everything is, is good, you know? Um, It's confidence in something that's like you personally can't see. It's just what the machine gave you. [00:13:05] Dan: Maybe it takes away one layer of uncertainty too, Jeremy, right? Like, so the code could be wrong, right? And then if it generates test cases, okay, the test cases could be wrong too. And maybe you get unlucky and two wrongs make a right and then your test cases pass for the wrong reason. So yeah, we really wanna hone this skill in our students. And, and like Leo said earlier, these intro courses used to be so full of low level syntax concerns that we, we didn't do testing properly. I mean, you know, we all try to cover testing, but I think we're gonna be able to cover it a lot more, detailed now. LLMs could encourage students to test more since their output is untrusted [00:13:41] Leo: And I, I think we're enthusiastic about, uh, how students will approach testing when you're working with the LLM is what we. This is fairly anecdotal, but uh, when they interact with us talking about testing, often students aren't testing their code because they wrote it. 
And so of course it's Right. Right. This is like this really famous, uh, kind of bug in human thinking, right? Is that if you write it, of course the computer's gonna interpret what you're saying, right? Um, and so students tend to trust their code in a way that professional software engineers never would. and I think because it's coming from this third party that you know is wrong, it's coming from the LLM that can, that can often make mistakes. I think they're gonna be more inclined to actually engage in those testing practices. Uh, kind of knowing about the fallibility of the LLM, [00:14:27] Jeremy: You're shifting the order. I mean, there is test driven development that some people practice, but I feel like probably what's most common is you write the implementation yourself and then, then you'll go and see like, oh, did this thing I, I wrote. Did it do what I thought it should do? Um, whereas this is kind of flipping it, where it's the large language model is gonna write my code, so I'm just gonna start with the test and then I'll ask it to, to write me the code. And maybe that will kind of make test driven development be the default. [00:15:02] Leo: So yeah, I, I, I think that students may wanna engage more in kind of test driven development because they wanna think more about, uh, what exactly should this function be doing? Uh, how should behave, what kind of inputs and output should it expect? And then it can kind of write the prompt to co-pilot or whatever LLM is using, uh, to express those inputs and outputs. Well, they're more apt to get good answer from the LLM and they've kind already got their test cases worked out as well, so they can immediately just go right into the testing agency if the prompt came back right. Using LLMs at the function level instead of a broader scope [00:15:35] Jeremy: And you mentioned writing a prompt to implement a specific function. Have you found that they work well at the function level? 
But if you try to ask it to build something more broad, that that's kind of when it has problems? [00:15:53] Dan: So, I think in general, LLMs do work best at the function level. We have tried to get it to generate bigger apps, collections of functions, and it can work, but sometimes it does, uh, it does do worse. But also we want students to do the problem decomposition for themselves and break up the problem into individual functions. Even though maybe the LLM could work, uh, with, uh, bigger chunks of code, we want students to do it. And one reason is so that they can customize what they get from the LLM. So, in the book, we have a bunch of examples where you could probably just throw it at the LLM and get an answer and, you know, eventually get it to work. But I think at that point, making changes to it might be trickier than it would be if you knew, uh, the architecture of what you were, what you were building. So in the book, we have a bunch of top-down design diagrams, and we want students to understand what they're building at that level, like at the function level instead of, like we said earlier, instead of like at the token level or the line level. Potential issues with outsourcing high level design to an LLM [00:17:03] Jeremy: And so like in this example, you're thinking more from a, a learning perspective. You want the student to look at the big picture, figure out, okay, what are all the different functions or parts of my application? Break that down and then feed those individually. To, um, these large language models. I, I'm wondering from like, let's say you're a, a professional software engineer and your interest is more in I want to make the thing and less so, in I want to learn how to make the thing. 
In that case, do you feel like you could feel confident in, in giving the large language model a larger piece of the design, or do you still feel like it's good to have that overall structure done by the, the developer and then just be very targeted about how you use the large language model? [00:18:03] Leo: I think that's a tricky question because we haven't worked with these tools heavily in a professional programming setting. I think often when we're thinking about large design of software, you're gonna be working on teams, talking with other members of the team about the interfaces and things like that. And so I'd be pretty hesitant to, to outsource that, that thinking to the, the LLM, cuz you, the communication between the teams still has to happen. Uh, even if it weren't for that, um, I kinda think of it as probabilities. So essentially whenever you ask Copilot or any of these LLMs to, to do a task, the more it has to get right, the more likely it's gonna make a mistake. Um, and so, uh, that's kind of why I like the function level. Partially because it's not that much code that it tends to write. Um, so you help to avoid kinda the probabilistic problem, but also because it's learned on a huge code base that has lots and lots of functions that have been implemented. It tends to do well at that, that solving the function kind of task. [00:19:10] Jeremy: Yeah. And I, I think the way you put it as outsourcing that design, that decision is, is interesting because yeah, if you are working on a team and whether it's in code review or just in a discussion, often people will ask, well, well, why did you do it this way? Or why, why is this the, you know, the good way to design it? And if you kind of handed that off to an LLM, maybe your answer is, I don't know. It's just what it, it told me, which (laughs) [00:19:39] Dan: Yeah. [00:19:42] Leo: That isn't an answer I want to, uh, use talking to my boss. Right. 
Well the chat GPT told me I should have it this way. That doesn't seem like a good answer. Choosing GitHub Copilot for CS1 [00:19:50] Jeremy: I think we, we've kind of been talking in more a general sense of working with LLMs and you've mentioned how you're gonna be teaching introductory computer science courses this coming, quarter or semester. And so when you teach these classes, what tools are you gonna recommend your students use? And yeah, maybe you could go into that a bit. [00:20:13] Leo: Absolutely. So we're gonna be recommending, um, At least, at least for my class, I'm gonna be recommending that they use, uh, vs code with copilot. Um, I just like the integration of the IDE with the, uh, interactions with the LLM uh, I think it avoids just a whole bunch of copy pasting from another interface into your IDE to then, uh, run it. I think it also reduces the barrier of them kinda immediately getting the code and then testing it right there in the environment. I'm sure any of the other tools would work, it's just, that seems to have worked well for us, uh, when we were writing the book. And that's, that's actually the technique we recommend in the book as well. Um, so that would be the primary tool for the students writing the code. In addition to having them using copilot with, uh, in the IDE for a lot of the code generation, depending on where things are at with copilot x, um, which is right now, um, available through wait list. Uh, if that's, that's available publicly, I think we're gonna be recommending that because it has a copilot chat feature, uh, which can be really nice to interact with. And, uh, the main use that, that we're gonna be encouraging students to use, whether it be co-pilot chat or a ChatGPT is in just a conversation with the LLM about, particularly modules and libraries. 
So if you are diving into, merging PDFs, which, uh, Dan did a great job in one of the chapters in our book talking about, if you wanna dive into that, well, what libraries should we be using in Python for that. Uh, and we found that the LLMs do a really good job at this, of actually saying, here are the different libraries you could use. Here are the pros and cons of them. These are the ones that, uh, need to be actually have additional install done. Or these ones that come in with, vanilla Python. they're actually really good at kind of giving you the what you should use for the various libraries. Um, and so that's, that's one other way that we were gonna be encouraging the students to use the LLM. Types of questions to ask the LLM [00:22:07] Dan: Yeah. So whenever the students or the junior programmer, doesn't know how or doesn't think they can, uh, do something in base Python, we have them interact with the chat and, and ask. So another example that comes to mind from the book is we have a chapter writing some games. And so for most games, including the two that, uh, we've got in the book, you need to be able to generate random numbers, right? So how do you do that? And so in the past you would've used a search engine stack overflow or something, and you would've found, some sample code and you would've pasted it in to your file and made variable name changes and things like that. And so what we do now is we ask chat, okay, I need to generate some random numbers. How do I do it? And then it will come back to you with a few options, and then you can systematically work through those options if you like. Uh, and you can ask, okay, is this one built into Python or not? And then it will tell you, oh, this one's not. We don't need to memorize API docs [00:23:11] Dan: And you say, oh, well, okay, so like, how do I install this? And then no, does it work on all OSS or just Windows? Right? 
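For the random number question, the "is this one built into Python?" answer is yes: the standard library ships a `random` module. A minimal sketch, assuming a dice roll as the game mechanic; the helper name `roll_dice` is hypothetical and not from the book.

```python
import random


def roll_dice(num_dice, sides=6, rng=random):
    """Return a list with one random roll (from 1 to sides) per die."""
    return [rng.randint(1, sides) for _ in range(num_dice)]


if __name__ == "__main__":
    # Two six-sided dice; the values differ from run to run.
    print(roll_dice(2))

    # A privately seeded generator makes the sequence repeatable,
    # which helps when writing test cases for a game.
    seeded = random.Random(42)
    print(roll_dice(2, rng=seeded))
```

Nothing needs to be installed for this, and it behaves the same on Windows, macOS, and Linux, which are the kinds of follow-up questions Dan describes asking the chat.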
So, uh, we guide the reader through these questions that you could have, uh, to help you make a decision. Um, and I think what I like the most about this is not having to learn APIs, like yet another API. Like I don't, I don't think I have room, you know, in my, like, brain for any more APIs. And, and what's cool is I, I've forgotten like every API that, uh, we've used in the book. So we have like examples of merging PDFs and, uh, removing duplicate images from directories, uh, from like people's phones, and, and stuff like that. And I don't know, I don't know which library it's using. Uh, and I'm, I'm totally okay with that, right? Like I just, I, I wanted to get the job done. I wanted to write a tool, and the tool got written and it used some sort of library and it worked great. And I didn't have to look through the documentation for that library and figure out like, which functions do I have to call and things like that. So, I, I know it, it can be fun, you know, it could be fun to really learn an API well, but a lot of people, they don't want to program for programming's sake. Like, they just wanna get work done, right? So, you know, while I, I, I fully admit to enjoying programming just for the sake of programming. I do a lot of competitive programming problems just for fun. You know, it's like Sunday morning and it's like, Hey, yeah, I got like an hour and I got an hour to work on something. Let me work on this little competitive programming problem. But, uh, a lot of people, they're not motivated by that. They're motivated by consequences of code. And this is one thing about LLMs that I'm very excited about, is you can just make a lot more progress without having to learn what these people may believe is just useless knowledge, right? Like, does it really matter how I should invoke this API, right, to merge PDF files? I mean, the answer for many people is no. Like, they just want the result to happen. 
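As an illustration of the kind of small tool Dan mentions, here is a minimal sketch of finding duplicate files in a directory using only the standard library. It assumes "duplicate" means byte-for-byte identical files, which may differ from the approach in the book, and the function name `find_duplicates` is hypothetical.

```python
import hashlib
from pathlib import Path


def find_duplicates(directory):
    """Map each duplicate file to the first file seen with identical bytes."""
    first_seen = {}   # content hash -> first path with that content
    duplicates = {}   # duplicate path -> the earlier path it duplicates
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        # Identical bytes produce identical SHA-256 digests.
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in first_seen:
            duplicates[path] = first_seen[digest]
        else:
            first_seen[digest] = path
    return duplicates
```

The function only reports duplicates. Deleting or moving them is left as a separate step so the result can be read and checked first.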
And I love how we can kinda match what they, uh, deem important, right? With the LLMs, it's like a new level of abstraction for many people. LLMs make building software possible for more people [00:25:28] Leo: There's a couple of audiences that come to our introductory classes, and what Dan's talking about here is one of the things I'm most excited about with this, and that's the students who come and take just one programming class. I know it's probably a different audience than, uh, a lot of the people listening right now. Um, but the people who just take one programming class, it's required for, for their major. They just wanted to explore it a little bit, but they, they don't go into this as a, as a career. I think a lot of those students right now, uh, if you ask them a year later to program something, do any of these tasks that we're talking about right now, I doubt they're able to, even if they did really well in that class. Uh, and that's really disappointing, right? If they've taken a programming class, they should be able to, to do something with that, a year or even five years later. And I really believe that if you teach them the skills of interacting with these LLMs, they'll be able to do these tasks later. They'll be able to come back and go, you know, I don't remember any of the Python syntax. I don't remember, uh, even how to get started with this. But you know what, I'm just gonna ask, uh, Copilot, how do, how do I go about merging these PDFs, having this directory? And then, uh, the Copilot chat comes back and says, oh, you might use this and that. And then they go, oh, I remember, I remember how to, how to write these functions. And I just said, you have to go over a prompt. I think they could really do it. And that, that's a bit of a game changer, right? That means a larger portion of our society will be able to, uh, write code and use it in a useful way. And I'm just really excited about that. 
I think it's gonna be really nice, uh, after the changes happen. More people might stick with Computer Science [00:26:58] Jeremy: I can totally see in the context of someone who's not seeing it as a career, or someone who, like, hasn't done it in a while, these tools can be incredibly useful, right? Or it can even get you interested in this field at all, right? Like a lot of people, they, they struggle through the syntax and then they decide like, oh, this is not for me. Even though like they had something really cool they wanted to build and, and maybe these kind of tools can, can get them over that hump. [00:27:31] Leo: Exactly. I think there's a population of students, um, and it varies a bit by demographics, who come to computer science with really the best motives in mind, right? Their goals in their life are to make the world a better place, and they want to achieve those goals. And if you spend the first three quarters or three semesters working with them and all they're seeing is syntax and they're not actually solving anything meaningful, um, it starts to create this disconnect between what their goals are for their life and what they think the goals of our career are. Of course as, as, as a computer scientist, I wanna say, stick it out. You know, if you, if you go into the fourth, fifth class, you'll start seeing how these are really useful tools that can make society a better place. But it'd be really nice to front load that and have them solving useful problems much earlier and seeing that, uh, computer science, uh, can be used in really nice ways. Efficiency can be taught later [00:28:26] Jeremy: And, and so within the, the context of people who are studying computer science, who may eventually become professional software developers, things like that. Something more long term where it becomes more of a craft, the, the code that comes back from these large language models. 
Sometimes it could be something that's like not maybe the most easy to read or it may be doing something inefficiently. And I'm wondering from your perspective how users of these tools should, should think about that and, and recognize when that's a problem. [00:29:06] Dan: We in, in, in the first couple of courses, typically in the CS program, um, we don't spend much time on efficiency. The reason is that there's just so much to learn early on, and, um, we worry about overwhelming people with, you know, too much, for them to, to process it at once. And we don't wanna prevent students from becoming interested, by giving them all of these requirements early on. So typically we, you know, we push efficiency, down the, down the road into like a data structures course, for example. But your question points to another reason why, we've decided to teach some of the skills we teach early on. So if, if a student, you know, came up to Leo or, or me and said, Hey, you know, like I wanna generate efficient code, how do I do it? My answer would, would be, so like, get, get familiar with programming first, but you are learning the skills necessary where you'll be able to look at code later because you know how to read it still, right? It's not, uh, something that you don't understand. You're gonna, you're gonna know it. We're gonna spend lots of time on code reading, and so later I think we can just teach efficiency the way we always did. Um, so, you know, doing, uh, time complexity analysis on, on the code and they're still gonna understand what the code is doing. So, um, I, I, I don't think this is going to, this is going to change much in, in the earliest courses. LLMs can expose students to different types of code [00:30:35] Leo: To the, to the point about code readability, I might add that, uh, certainly they're gonna get back some, some code that's maybe not the best style and it may not be as readable. 
Uh, but what's kinda interesting is that students aren't exposed to a lot of different styles kind of in our existing courses, right? They, they see the code that they write and they see the code that the professor writes and gives them, and there's not much else. And so, I mean, we're gonna need data and we're gonna need research to, to, to know this for sure, but it, it, I suspect them seeing lots of different code styles and having to read those different code styles may actually inform them better than we do now about what makes code more readable. Uh, and then they might be able to employ that as they go forward. [00:31:21] Jeremy: And, and when you're saying they're gonna read different styles and things like that, are you referring to code they're gonna see from the LLM or are you talking about them reading just other code bases in their classes or their professional work? [00:31:39] Leo: Oh, I'm sorry. Yeah, I was referring to the code they'll see from the LLM, right? [00:31:43] Jeremy: Oh, I see. [00:31:43] Leo: The LLM will come back in all these different ways. They'll have different styles and they'll, uh, have different approaches to solving it. Right? Sometimes they'll, uh, come back with like this one-line lambda expression thing that solves it, and they'll have no idea how that works. And they'll, they'll ask for a different answer and they'll get, uh, a much more, uh, user-friendly first, uh, first programming experience kind of code back. And they'll be able to understand that and go, okay, this is the kind of code that I wanna see. Not this thing that was completely non-readable. [00:32:11] Dan: Yeah, Leo, I just thought of something. So, uh, so you know, by default you can get it to give you 10, uh, code segments to solve the problem, right? So it'd be kind of cool, if we ask students about each of them, right? 
Each of the 10, which ones are right, which ones have bugs, which ones have good style, which ones have bad style, it's like a built-in learning opportunity right there. So yeah. [00:32:34] Leo: Oh, it's true. Yeah. And, and so the 10 things that, uh, Dan was referring to is if you do Control+Enter in VS Code when you're working with Copilot, it'll give you back 10 possible responses. And you're totally right Dan. You could just say of these 10, how readable are they? Are they right? Um, there's lots of fun things you can do to ask students questions. [00:32:51] Dan: And often many of them are right with just subtly different ways of, of, of, of solving the problem. I mean, I'll, I'll admit to having some fun looking through all of the suggestions just to kind of see what the variability is and when there's a lot of variability. I really like it because, uh, like Leo said, it exposes people to different styles they may not have seen before. And, um, it may, it may, um, encourage you to ask questions, right? Like, why does this one work? Right? I've tested it. It doesn't look like it should work. Why does it work? I feel like that's the beginning of a pretty powerful learning experience right there. [00:33:30] Jeremy: Yeah, that makes sense to me because I, I think about how when a lot of people are doing software development before all these LLMs, they will search on the internet and go, okay, what's an existing answer for this thing I'm trying to do? They'll find a post on Stack Overflow and they'll find the accepted answer and it'll be like, okay, this is it. This is the solution. Whereas, at least in this case, it seems like you can go like, okay, well here's, here's 10, 10 potential solutions, and at least you get a little bit more exposure to, um, what are the different ways you could do it. [00:34:06] Leo: Exactly, and, and it's nice for 'em to see these different options. 
And I think there is, for professional software engineers seeing that Stack Overflow post, like, here's the accepted answer, integrating that into your code isn't a big jump for, for a lot of us. Um, but I do wanna stress that for the intro students, it often is a really big jump. Uh, just the, oh, how do I change around this? Oh, this was the interface for this function, but I've been asked to have this other interface with a function and, and they really can struggle in that domain. And so I think Copilot and these LLMs are nice in that they give back answers that are more tailored to the existing code that they're working with, um, and will reduce that barrier of them trying to incorporate the answer. Optimization can come later, most code is straightforward [00:34:50] Jeremy: So it seems kind of overall, when you're talking about people who are using programming in a more professional capacity, the code style and efficiency that will probably be taught very similarly to however it is now, where you basically have to get exposed to different styles and types of code, get exposed to the algorithms and that will allow you to read the answers you get back better. So the answers you get back from the LLM with the knowledge you gain from these later courses, you'll be able to tell like, oh, okay, this is, this level of complexity, or this has like, you know, exponential, performance implications, that kind of thing. [00:35:43] Leo: So I think the performance piece is really important. Um, and I appreciate your, you bringing it up. I think, I'm, I'm kind of curious, uh, uh, what percentage of the time professional programmers are really spent, uh, are spending optimizing, uh, the code that they write? Um, I suspect a lot of the code that's written, uh, is pretty straightforward. Uh, you, you already know how to work with the database you're working with. You already know how to write the queries for that. 
You're, you're, you're just, uh, you're still doing something that, that's certainly thought provoking, but it's not the hard work of, oh, how am I gonna design the right algorithm for this to get the exact best runtime? And so I think there are some times that that does matter, but those may be the times that the LLMs aren't as helpful and there's still gonna be a, a pretty big need for programmers who know how to do that, uh, themselves. [00:36:33] Jeremy: Yeah. I mean, I, I think that of course this is gonna vary from industry to industry, but Dan, you were talking about learning APIs and I feel like a lot of jobs are learning APIs and gluing them together. [00:36:49] Dan: Yeah. Um, I would agree, but I wonder what can happen if some of that's automated. Right? So maybe, people who are gluing APIs together will be able to get even more done, right? Incorporate even more, APIs in the same amount of time that they've been doing it. Now, I don't, I don't know if that job changes as dramatically as it, it seems, um, I guess there's this tension between people, having to change jobs or become more efficient in the current job. And, you know, obviously I, I hope it's the latter and there is some recent evidence that it could end up being, the latter, just more productive people overall, building, you know, bigger software, incorporating more APIs than, than before and, and not overloading yourself. So, we'll, we'll see, you know, how it, how it all, um, how it all turns out. But I'm, I'm hopeful that we'll just be doing our jobs better. Reading code as a skill [00:37:51] Jeremy: In that, that context, sometimes people will say that the, the reading of code and comprehending code can sometimes be more difficult than writing the, the code. And in fact, can sometimes take you more time, like, let's say you've built out a project and now you need to add new features. Well, to add the feature, you have to understand the, the code base that existed before and so. 
When we talk about LLMs and the context of not programming, but just general writing, people talk about the fact that it's easy to generate more writing, right? We can generate more documents, blog posts, more articles, that sort of thing. And with code, it sounds like it'll be similar, right? Where it'll be easier for us to write more code, generate more code. Um, but I wonder if either of you have thought or, or think it's a concern that we'll be generating so much code that now we'll have so much we won't be able to even have the time to understand all of it? [00:38:55] Leo: I haven't thought much about the generating so much code that you can't understand. I mean, I think if, if we're generating code, I, I'm really hoping someone's testing and making sure it works right and stuff. And so I guess it depends on what kind of, uh, what level of the interface are we, we looking at. Um, but I have thought a fair bit about the, the, what you described early on in your question, which was diving into a big code base, figuring out what needs to be changed and changing it, that is a really common task, especially for like new software engineers, uh, in their, their first jobs. Right. And it is also one that's really well documented in the, the education literature, uh, education literature, uh, that we aren't teaching them to do. Like we almost always are giving them, uh, write these functions that are really well defined or, uh, write the code all from yourself, but we rarely ever give them large code bases to learn from. Now I don't think diving into a large code base and trying to understand how it works is the right thing for like an intro class. And then we're mainly talking about, uh, students first learning to program here. Uh, but I am encouraged that we are teaching code reading as kind of a first level skill when I think current programming courses teach code reading, right, in parallel with writing. 
So a lot of the writing's happening very early before they even know how to read well. Um, and so I think there's some optimism here that if we teach code reading first and make it a core skill, they'll be better set up in the later classes to maybe take on those large projects where they tackle the exact problem you're describing, which is also the exact thing they're gonna have to do when they get to, to their jobs. The amount of code we throw away may increase exponentially [00:40:37] Jeremy: Yeah, it, it also kind of, I wonder sometimes when you're writing code, you'll write it in a certain way because it's tedious to write a lot of code, right? Like you'll, you'll make something generic in such a way where you can reuse it, and maybe reduce the amount of lines of code. But then when you have something generate that code, maybe it'll be a solution that is a lot more code than you would've written personally, and it works. But, by nature, the fact that it was easy to generate, you chose that solution versus one that, that maybe was more generic and um, had less code. I, I'm not sure if that makes sense, but I'm kind of curious if the use of these models will sort of change maybe how we write code. [00:41:30] Dan: I'm kind of wondering if the amount of code we throw away is going to increase exponentially. Because, because, um, you spend time working on something, you're probably gonna keep it. But I, I wonder because, uh, Jeremy, like what you said, it's, it's so easy to generate code now. So I, I've had this thought where, what, not sure how, how, um, how much I believe myself here, but, uh, should we be storing the, the prompt, like not the dot py file, right? Like just store the prompt and then if you do have to regenerate the code later, maybe you gotta make some tweaks or something. You just change the prompt and then, and then rerun it. So, because, because, because code is, um, it's not there yet, but it's, it's becoming free, right? 
It's becoming, you can generate as much of it as you want. And so I, I wonder how much, how much of it is, so there's, there's a lot of code already that you write once, and you run it once and then, and then you get rid of it or lose it or whatever. And I wonder if that, that practice will increase. So it's like, okay, you know, I wanna do this data analysis. Okay. So you write a prompt, you get some code, you generate some graph, and then you just don't even think about it. You just get rid of it, and then maybe later you want another similar analysis and you just do it again. Right. So I kind of wonder, because there's maybe less ownership now of code, right? You didn't like sweat as much to write the code. So maybe, maybe more of it gets thrown away. [00:43:03] Leo: I, I completely see what you're saying, Dan. So you have the prompt and you had it perform some form of data analysis and you wanna tweak it to do a slightly different data analysis. Uh, I wouldn't go into the, I mean, right now if I wrote the code from scratch, I would go into the code and find that one spot that I need to change and I would tweak it. But if I'm just generating the code, I would just tweak the prompt and then get a new piece of code that does exactly what I want there without having to, to [00:43:26] Dan: yeah. You know, how, how, it can take a, a long time to re-familiarize yourself with a program that you wrote six months ago. You know, it's like, oh, I, I called this variable temp one. Like, what's this for again? Right. you know, maybe, yeah, [00:43:41] Leo: Wait, I think we've all been there. Keeping the prompt instead of the code [00:43:43] Dan: Uh, but yeah, I don't know. It's just, just a thought I've been having. It's like, it, so, so when, when, now when, when I hear people talking about code maintenance, for example, like using, you know, good variable names and consistent style and stuff, in my head I'm thinking, well, you know, is, is the code the artifact now? 
Is it still the artifact? And right now, you know, of course it is. But, um, but, you know, fast forward a little while, maybe, maybe some of what I just said, uh, sort of becomes true eventually. [00:44:11] Leo: That's getting to perhaps kind a larger issue about what is the interface that we're, we work with as programmers. I've been thinking about this a lot, uh, just because I, I teach my, my background's, I have a PhD in computer architecture, and so I teach the classes that do machine code and assembly code, and they're, they're, they're core classes for computer scientists because you need to know how computers work. And, um, I think that's a core component, understanding that. But we don't start by teaching the students machine code. Like no one wants to learn how to program a machine. Um, at least I can't imagine anyone wanting to learn that. Um, and we've kind of cognitively picked Python or Java right now, the two most common programming languages to learn from. Because they're easy to learn, they're easy to, to read. The code tends to be more understandable when you read it. It tends to be a little bit more forgiving when you write it. Um, and so we picked these because we think they're nice interfaces. They're, they're convenient for programmers and they're convenient for, for new learners. And it just seems to make sense that the LLM may be that next step of interface that we start choosing. The, the catch is because it can be wrong. It's not like a compiler. A compiler is deterministic. It's gonna be, uh, shy of that. Maybe one time in your career you find a compiler bug, like the compiler's always right. This time the LLM isn't always right and so I, I'm not sure how this is all gonna play out. Um, you can imagine the LLM as the new interface and all we ever store is, is code prompts and we don't ever even see the code, perhaps as one scenario. And the other is we, we do in fact still interact with the LLMs and still interact after the code. 
Um, but I think it's too early to kind of know where this is all gonna fall. But, um, we could see some big shifts, I think, in the field over the next few years. [00:45:52] Jeremy: Yeah, I think that's pretty interesting to think about what, what Dan had mentioned where yeah, you could check in your prompt and maybe a set of test cases for the app that's supposed to come out and yeah, maybe that's your alternative to the actual source code. Um, especially for things that, like you were saying, are, are used not that frequently or maybe you only use it once and so the, um, the quality of the actual code is maybe less so important in terms of readability and things like that. And as long as you can reliably reproduce that thing, yeah, maybe, maybe that does make sense. [00:46:39] Leo: The reliable reproduction could be the tricky part. And you there may be even saying that you, you start doing where you tag don't, don't try to reproduce this. Like, we actually spend a whole bunch of time on this. It's super optimized. Like, don't think the LLM's gonna give you this answer again. So, uh, keep the code along with the prompt. Keep the code too. Don't, don't scratch that because the LLM's not gonna do better. Um, and then in some cases you're like, yeah, the LLM's gonna do a pretty good job on this and [00:47:07] Dan: Yeah. Leo, maybe we have to, maybe we have to distinguish between code that you can just get out of an LLM no problem. And code that people have spent time working on. I like that. Yeah. Yeah, [00:47:21] Leo: Some you're like, hashtag don't change. [00:47:23] Dan: Humans were here. [00:47:25] Leo: Exactly. The concerns about relying on commercial tools [00:47:27] Jeremy: Yeah. This is the 30th iteration of this code we generated and we verified that this one's good. So just, just, it's an interesting, interesting future. 
We, we might be heading into, so, so one thing you, you mentioned a little bit earlier is that the tools that you're gonna recommend to your students, it sounds like it's primarily going to be GitHub Copilot and GitHub Copilot X for the, the chat interface. And one thing about these tools is these are tools by commercial companies, right? These are tools by OpenAI and Microsoft. They're tools that you have to pay a subscription fee to use. You have to send your code to a commercial server. And I wonder if that aspect concerns you at all. The, the fact that the foundations that our students are learning on is kind of reliant on these companies and these cloud services. [00:48:31] Leo: I think it's an amazing question. Uh, I think to some degree these are the tools that professional software engineers are using, and so we need, there's, there's a bit of an obligation as instructors to teach them the tools that they're gonna be using as professionals going forward. I think right now they're free, uh, to use for, for education's sake. And so as long as that stays the case, I'm a little, more comfortable with it. If it started to move to a pay model for education, I think there could be some really big problems with equity. And I think it's not just true for, for computer science, but I'll start with computer science. I mean, if it's computer science and we start making it where you would have to pay to get access to these models or use these models, then whether we tell the students they can use it or not, they still can use them. And so there's gonna be some students that, the wealthier students who may have access to these, who are being able to learn better from these, being able to solve better homeworks with these, that's super scary. And you could imagine the same thing for even just K through 12 education, right? 
If you're thinking about them writing essays for homeworks or anything else, if it's a pay model, then the students who have, uh, the money will pay for it and get access to these tools. And the students who don't, won't. You could imagine the, all these kind of socioeconomic, uh, divides that already exist, only being exacerbated by these tools if they switch to this pay model. Um, so that has me very worried. Um, and there's some real ethical issues we have to think about when we're, we're using them. Yeah. Um, the other ethical issue I kinda wanna mention is just the, the copyright and the notion of ownership. Um, and I think it's important for us as instructors to engage students in the conversation about what it means to create content and intellectual property and how these models are built and what they're building off of. Um, and just engage in that ethical conversation with the students. I don't think we as a society have figured this out. I don't, I think there's gonna be some time both legally and ethically before we have the right answers. but at the very least, you need to talk to the students about, uh, these challenges so they know what's going on and they can engage in the debate. [00:50:45] Dan: Yeah, just to underscore that, Leo, this is the reason we're doing research on the first version of the course that Leo's teaching. We need research on the impact of LLMs, on students. especially, we need to know if students benefit from this, in what ways they benefit. How are these benefits distributed across demographic groups? We have a long and sad history in, computer science of inequities, in who takes our courses, who succeeds in our courses. we're very aware of this and it's, uh, unacceptable to make that situation, uh, worse than it already is. So, um, we're, we're gonna be carefully doing our research on this, uh, first offering of the course. 
A downside is students might bypass fundamentals [00:51:30] Jeremy: So we've mostly been talking about the benefits of using these tools in classes and in education. We just mentioned the possible inequities if you don't have access to those things, I, I wonder if from either of you, if there are negatives you see to this technology, whether that's the impact on what people learn or in anything else. Like are there downsides you see to the use of this technology? [00:52:04] Dan: Yeah. So in addition to, uh, the important, uh, inequity concerns that, uh, we just talked about, I have a concern about students using the tools in ways that don't help them learn the skills we think they need. So it's a, it's a, it's a power tool and you can, uh, you can get pretty far, I think with, without, um, being systematic in, in how you work with it and without testing, without debugging, um, it's, you know, it's, it's kind of magic right now. And so I can imagine, a lot of students just taking off at, you know, a hundred miles an hour. And so I'm one, one of, one of, uh, the things we have to worry about in these initial courses is, convincing students that there really are principles to using this technology. You can't just type something and get an answer and then go party. And, and, and so that, that is one of my concerns. That's one of the negatives. It's super powerful. And, like, like, so before you, you can't just type some Python and make it work and, but now you can sort of type in whatever you want and kind of get something back. And so part of our job as educators is to help students use these tools, in a way that will ensure their long-term success with, with these tools, right? So, I, I'm not saying that they can't just do whatever they want and, and make some of their first assignments work. I, I think they could, I think they could be like unprincipled with the prompts and just throw it in there and get code and, you know, submit that, submit that code. 
But, uh, we're, we're going, you know, we're going for longer term, uh, effectiveness here, right? We have students who may not take another CS course. We need to keep them in mind. We have students who are gonna wanna eventually be software engineers, uh, security experts, PhDs in computer science, right? So we have a number of audiences that we're talking about, and we think they all need to know the fundamental skills of programming still. Even though, you know, they have this, this power tool at their expense now. [00:54:07] Leo: Speaking of the fundamental skills for programming, I, because of my, my hardware background, I'm this huge fan of teaching mental models in classes. Like what is the mental model of computation? Like, how, how do you imagine the computer is executing as you write the code? And, uh, ideally a professional computer scientist should be able to take, okay, well this is kind of the, my interpretation, this is my mental model for when I'm working at Python. If I really, really wanna drill this down, I can turn that into assembly. And if I really had to and turn to machine and even think about how this is working within the cache subsystems and virtual memory and all these things, I want 'em to be able to play those things out. We are changing the first class, and I think the first class is gonna be doing some things much better than before, like teaching problem decomposition and things like that. I'll, I'll mention that in a second, but, we are doing some things better. But we may not be teaching how the computer is working as well. And so you can't just change one course and think the rest of the curriculum's gonna work. And so I think the entire curriculum's gonna need to adjust some, um, in, in a way of just adapting to these LLMs. 
Rethinking how to assess students [00:55:10] Leo: Um, the second piece for things getting potentially more challenging, uh, is instructors, we're in a good place right now as instructors, uh, in terms of how we assign and grade homework. Um, so grading, uh, this probably isn't gonna be a shock, is not one of our favorite things to do as faculty. I mean, it's actually really important. Uh, it's, it's central to us understanding how our students have learned, but it's generally not the most favorite thing that we do. And what a lot of instructors have done, myself included, is for much of the introductory sequence, we have created assignments that can be entirely auto graded. So we define functions incredibly well. Like, write a really good description, this is exactly what it needs to do, and the students write that one piece of code and, uh, whether we like it or not, that is exactly where Copilot does very well, and the LLMs do really well. And so the LLMs are gonna solve those very easily already. So we have to fix our assignments just like it, it's a given. Um, but it means that we're probably gonna have to rethink how we do assessment. Um, and so we're probably gonna be writing assignments that are much more open-ended and we're probably gonna have to be grading those, uh, putting more care and time in grading those potentially by hand. Uh, but I think these are all good things for the community and for the field. Um, but you can imagine how it's gonna be a bit of a, a shift for faculty and, and may take some time, uh, to be adopted as a result. [00:56:41] Jeremy: And, and so if you're shifting to homework that is more broad in scope, has more code, needs more human eyes on it, how does that scale within the educators side? Right. You were, you were talking about how you've got, um, things that could be auto graded before and then now you're letting somebody generate this whole project. How does that work from your end? 
[00:57:09] Leo: I, I think there's a few things that are at play. Um, we, at, at large institutions like Dan and I are at, we have kind of armies of, uh, instructor assistants, instructional assistants that help us, uh, and so we can engage 'em in, in various tasks. And so, uh, one of the roles they heavily have now is helping students in the labs solve these auto graded assignments. And so you can imagine they will still be in the labs helping the students with these creative assignments, but now they're gonna have to have potentially a larger role in assessing the success of those. Um, there's been some really creative work, uh, in, in assessment and so I'll, I'll, I'll mention a couple of the ones, but there's, I, I'm sure I'm gonna be omitting some. But, uh, one is, students could complete their project, and then they have to record a short video of them explaining the code that was in their project and how it worked, and you actually assess them on that video and their explanation of the code and how it works. Right? Because those can be perhaps shorter than trying to go through a really big project and, and see how it works. Um, there's a tool out of UIUC, um, called PrairieLearn that helps with, um, uh, these are still auto graded, but uh, it helps with the, the test setting where you can write questions and have them, uh, graded kinda in a, in a exam or homework setting. The, the neat feature of that is that it can be randomized and so you don't have to worry as much about students kind of leaking information to each other about, test content from quarter to quarter. And so, because of the randomization, they have to learn, actually learn the skills, and so you can, um, kinda engage with 'em in these test centers. And so right now a big grading burden on, on faculty is exams. And so you can actually give more exams, give more frequent feedback to the students and with, without the same grading burden. 
and so that, that's the other kinda exciting assessment piece. Current assessment is not effective [00:59:01] Jeremy: In the different types of assessments, like the example of the video you gave, I'm just thinking to myself, well, the person could ask Copilot or ChatGPT to give 'em a script, right? And they can rehearse that when they, um, send you a video. [00:59:18] Leo: I think, but I think that's, um, I think this is a philosophical shift in assessment that's kind of been gaining momentum over the years and that's that the assignments are all formative and they should all be pretty low stakes and the students should be doing them for the process of learning. And then, and, and it's unfair in some ways. There's a, there's a lot of things right now where you kind of grade them on, were you present at this time? Did you, did you meet this deadline at this time? Which if you're thinking about the, a diverse population of students, like you can imagine like a, a working mother who's also trying to do this, grading them on were you here at this time doesn't feel very equitable to me. And so there's this whole movement for grading for equity that shifts much of the assessment onto the exams. And so, yeah, the students could, uh, find multiple ways to cheat on the homeworks, but that's not the point of the homework and the homework's just to learn. It's a small scale, the grade, so. But you still then have those kinda controlled environments where they're taking these tests and that's where the grade actually comes from. Um, it's gonna take some time to make that shift, at, at, at least at a number of schools, my own included, those take-home assignments are a huge portion of the grade. And students will love that because they can get all this help. And they can, especially with the auto graders, that they don't even write their own test cases. They just use the auto grader's test cases. Right. Um, which is really depressing. 
Um, and they go to the, the, the instructional staff. The instructional staff tends to, to give away the answers. That's actually a paper that we, uh, published a few years ago. Um, and so the students love this high stakes, but tons of help version of assessment, but that may not actually measure their, their level of knowledge. And so it's gonna take a little bit of adjustment, for students and for faculty to do the shift, uh, to where the, as assess the, the exams are the Give students something interesting to build and don't worry about cheating [01:01:09] Dan: Yeah. Also, I'm, I'm not convinced that cheating is gonna be a problem here. it's very possible, for example, that students cheat on our previous assignments because the assignments were not authentic. Um, you know, in industry you're never going to, no one's gonna come up to you and say, Hey, like, from scratch, you know, write this exact function, takes two lists and determines, you know, how many values are equal between them. It's like, it's like, that's not gonna happen, right? You're gonna be doing something that has some sort of business purpose. And I kind of wonder, um, and this, this will, you know, this will play out, um, one way or another in the next, in the next, uh, few months. But I kind of wonder if we give students authentic tasks. Now you're cheating yourself right out of doing some, some something of value, right? Like before you were. You were probably cheating yourself out of a learning opportunity, but how, how can, you know, how can students know that? Right. The assignments boring, right? It's like, write all these functions and then something, something happens because of the magic, you know, starter glue code we wrote. So I don't, I don't know. I feel like if you give students opportunities to learn what they want to learn, um, there's, I don't, I don't know. I don't, I just don't think there's a reason to cheat. 
And, and also, I mean, um, I, I've been much happier in my career recently when I don't worry about it. So it's like, okay, I've got a bunch of students, some of them are gonna cheat, some of them are not. And I'm here to talk to the ones who, who wanna learn. So, I don't know. A lot of people were on some email lists, for example, and a lot of people seem to be panicking about it. And I, I kind of think, you know, buddy, you had a huge cheating problem before. I don't think it's gonna become worse now that you're giving students authentic work to do. Right? They, they all want to be using, uh, you know, programming to, you know, to do their jobs better or make their lives better, or the world better. They don't wanna waste their own time. But if you give them a decontextualized task, it's like, it's super tempting to just cheat, right? Because what's the point? Right? And so, um, I, I, I'm, I'm very hopeful. I, I, I am not convinced that that cheating is gonna be a problem. [01:03:23] Jeremy: Yeah, that's a good point, and I think it's very motivating for any student or anybody who's learning a thing to, to be able to see a clear, connection to like an actual thing that I made, versus I'm writing functions to pass these test cases is like not very, not very interesting, uh, intellectually. So I think if you structure the, the projects where it's like, oh, am I gonna actually make this thing that does this thing That seems pretty cool, then yeah, that's definitely more motivating to, to actually go through with it. [01:04:00] Dan: Like, just off the top of my head, imagine if every student had to make a landing page, like a website who's gonna cheat? Like what? I want a landing page. Like, I, I want that. And, and all student and students are gonna want that too. And so it's like, well, okay. Like I, I, I may as well make it right. Like this has a, this has a purpose. So, Leo. Leo, I'm curious, you've been, you've been, uh, patiently listening to, to that. 
I'm curious what you think about [01:04:29] Leo: Oh, I, I, I, I can't agree more. I think the, um, I mean, we can leverage the research, right? Computing in context is kinda this well established thing that if you teach computing in a context that's meaningful to the students, they tend to learn more and engage more, and wanna stay in the major more. Um, and I think we're just gonna be able to do, we do this right? We, for convenience sake, and because of the scale of the number of students that we've had in our classes, we've kind of moved away from that and gone to these auto graded, not so exciting assignments. And I think we're, this is the impetus we need to go back to fun, creative, interesting assignments that the students are gonna put time and care into because they want to, not because they have to. Problem Decomposition [01:05:10] Jeremy: So it, it sounds like through our discussion, you're, you're really excited about, bringing large language models into the classroom and kind of what that means for you and your, your students. And I wonder if there's anything we didn't really touch on or maybe something that was unexpected that you think is gonna make a really big difference, to you and your students. [01:05:33] Leo: I think one of the things that we haven't touched on yet that I'm, I'm really excited about is, the piece of problem decomposition. And so over the years, because of this trend towards auto grading, uh, what's happened is, all the cognitive work of taking a, a big, computing task and breaking into smaller pieces, deciding what classes should exist, what functions should exist, all those interfaces, all that work that I think is really interesting and exciting. It is now done for students because the auto grading structure just makes it so you have to have these functions and they just code the functions. 
and so I think that's really concerning just from a software engineering perspective, that students are, are learning how to program without learning those core abilities as, as software engineers to take a large problem, break it down, figure out what the right interfaces are, and that's a lot of, that's actually more art than science, I'd argue. And so the more time you have to practice it, the better. And I am incredibly excited that LLMs are kind of forcing our hand to make us step back, give larger programming tasks to them, and teach them the process of problem decomposition explicitly in a way that, in a way that we've never really, never done before. Jeremy: I think that's, uh, that's a good place to, to wrap it up on so if people want to hear more about your upcoming book or maybe even enroll in in your class, Leo, where can they get some more information? Leo: Both Dan and I have active LinkedIn pages and we're happy to have folks, uh, follow us there. Manning Publishing is the, publisher for our book. Um, and so we have that book out on early access right now. Um, it should be available, uh, entirely electronically by August in time for the start of the fall quarter. Um, and then it should be out in print, uh, shortly thereafter. [01:07:25] Jeremy: Cool. Well, this has been an interesting discussion. I mean, large language models are kind of that's the thing right now. Everybody's trying to, to stuff it into every single product. And I think getting both of your perspective on where it fits in in education has been, has been very interesting. So thank you. Thank you very much for coming on the show [01:07:46] Leo: Thank you Jeremy, for inviting us and for running such a great podcast. We really appreciate it. [01:07:52] Dan: Thanks Jeremy.

Jun 14, 2023 • 1h 4min

Anita Zhang and Alvaro Leiva on systemd at Meta

systemd is a service manager for Linux. It is the first process that runs on many Linux distributions and manages all other user processes. It includes utilities for logging, process isolation, process dependencies, socket activation, and many other tasks. pystemd is a Python library to communicate with systemd over dbus from Python as an alternative to shelling out from an application to control services. Anita Zhang is an engineerd managerd at Meta and Alvaro Leiva is a production engineer at Meta. I attended their systemd workshop at the Southern California Linux Expo. Topics covered: What's systemd? Giving talks and workshops cgroups and namespaces systemd timers vs cron Migrating from CentOS 6 to 7 Production engineers need to go lower in the stack to debug applications Meta's Linux userspace team Use of public cloud at Meta Meta's bootcamp Pystemd Mastodon Anita Zhang Alvaro Leiva Workshop systemd workshop Conference talks Journey into the Heart of systemd - Scale 19x Systemd: why you should care as a Python developer - PyCon 2018 Move Fast without Breaking things - Scale 18x Solving All the Problems with systemd - LISA18 Using systemd to high level languages - All Systems Go! The Curious Case of Memory Growth - Scale 19x Related Links systemd pystemd systemd-run systemd-timers Transcript You can help edit this transcript on GitHub. Introductions [00:00:00] Jeremy: So today I'm talking to Alvaro Leiva and Anita Zhang. Alvaro is the author of the pystemd library and he's a production engineer at Meta. And Anita is an engineerd managerd at Meta, and I'll let her explain that further. [00:00:19] Jeremy: But thank you both for joining me today. [00:00:21] Anita: Yeah, thanks for having us. [00:00:24] Jeremy: I guess where we could start, Anita, maybe you could explain a little bit your, your title that I just gave you there. 
engineerd managerd [00:00:31] Anita: Yeah, so by default I, I should be a software engineering manager, but when I transitioned to management, I was not, Ready to go public with, um, my transition. So I kind of hid it by, changing the title. we have some weird systems in place that grep on like the word engineer. So I had to keep engineer in there somehow. and so I kind of polled my friends what I should change my title to, and they're like, oh, you're gonna support the systemd team, so you should change it to like managerd. So I was like, sounds good. engineerd, managerd. I didn't wanna get kicked out of any workplace groups, for example, that required me to be an engineer. [00:01:15] Jeremy: Oh, okay. [00:01:17] Anita: Or like engineering function, I guess. [00:01:19] Jeremy: Yeah. Yeah. And you just gotta title it yourself, so as long as you got engineer in it, you're good. [00:01:24] Anita: Yeah, pretty much. Some people have really fun titles like Chief Potato Officer and things like that. [00:01:32] Jeremy: So what groups does the, uh, the potato officer get to go in? [00:01:37] Anita: Yeah. Not the C level ones. (laughs) What's systemd? [00:01:42] Jeremy: I guess maybe to, to start, we should explain to people who aren't familiar, uh, what systemd is. So if either of you wanna wanna take that one. [00:01:52] Alvaro: so people who doesn't know, right? So systemd is today is your init system, right? Is the thing that manage your, your process. and the best way to understand this, it is like when your computer, it needs to execute something. And that's something is what we call pid one. And that pid one is the thing that is gonna manage everything from now from there on, right? Uh, in the most basic level, if you remember how to, how does program start, how does like an idea becomes a program? Uh, you need to fork exec, right? So that means that something has to be at the top of that tree and that is systemd. now that can be anything, right? 
So there was a time where that was like System V and there was also like Upstart, uh, today systemd is the thing that, uh, it's shipped in most distributions. [00:02:37] Jeremy: Yeah, because I, I definitely remember when I first started working with Linux, uh, it was with CentOS 6, and when I would want to run a service, I would have to go and write a bash script and kind of have all these checks for, is this thing running? Does it have permission to these things, which user is it running as, and so there was a lot of stuff that I remember having to do before systemd came out. [00:03:08] Alvaro: The good old days as we call them, [00:03:11] Jeremy: Or the bad old days. [00:03:13] Anita: Yeah. Depending on who you ask. [00:03:15] Alvaro: Yeah. So, so that is super interesting because, um, during those time, like you said, you have to write a bash script. That means that you were basically yourself, your own service manager, right? So ideas as simple as, is my program running? There was no real answer. You have to figure it out, right? So if you run a program, uh, you maybe would create a pid file which hold the p or the pid of the process, of the main process, right? And then something needs to check, oh, is this file exist? Does the file exist and does the content of this file actually match to a process? And then you grab the process. So it was all these ideas that you had to do, and then for, you have to do it for every single software that you would deploy on your machine, right? That also makes really hard to parallelize stuff, right? Because you have no concept of dependencies. So if your computer has to boot, uh, I, I dunno if you remember like long time ago, like Linux machine would, takes like five minutes to boot like your desktop. I remember like openSUSE. I can't remember, like 2008, 2007. Uh, it would take like five minutes to boot and then Ubuntu came and, and it start like immediately. 
And it was because, you can parallelize things, but you cannot do that if all you're running are bash scripts. Why was systemd chosen to be included in Linux distributions? [00:04:26] Jeremy: I remember before the Linux distributions didn't include it. And I wonder if you have any insight into how systemd got chosen to be the thing to manage our processes and basically how we got to where we are today. [00:04:44] Anita: I mean, we can kind of speculate a little bit. At the time when Lennart started systemd, um, with Kay Sievers, probably messed up his name there. Um, they were all at Red Hat and Red Hat manages Fedora these days and I believe Fedora's kind of like the bleeding edge for a lot of the new software ideas. Um, and when they picked up systemd as the default, um, eventually it started to trickle down to the rest of their distributions through RHEL and to CentOS and at the same time, I think other distributions started to see how useful it was in terms of managing all the different processes and services. Um, I know Debian at one point had kind of a vote on like whether they should make systemd either default or like, make it easy to switch between both. And then they decided to just stick with systemd because it's, I mean, the public agrees that it's like easy to use and it's more useful. It abstracts away a lot of things that they had to manually do before Who is interested in systemd? Who comes to your talks and workshops? [00:05:43] Jeremy: Something I've been kind of curious about. So just this year at SCaLE uh, you ran a, a workshop teaching people how to use systemd and, and sort of what it is about. I guess when, when you get people coming to these workshops, what are they typically, where are they typically coming from? Are they like system administrators or are they software developers? Like when you run these workshops, who are you looking for as your audience? 
[00:06:13] Alvaro: To be fair, this was the first time that we actually did a workshop for this. But we have like, talk about this in, in many like conferences. Here's what happened, right? So every time that you put systemd on the title of, uh, of a talk, you are like baiting people into coming in, right? Because you do want to hear like some people who are still like reluctant from that war that happened like a few years ago. Between systemd and Upstart, right? Most of the people who we get are, I would say like, software engineers, people who do software, and at least the question that I always get a lot, it is like, why should I care about systemd um, if I run everything on my containers in my Docker containers, right? The other type of audience that you get, you do get system administrators. Uh, but in general those people only cares about starting and stopping services don't really care about like the, like the nice other features that systemd has to offer. And then you get people who just wanna start like flame wars and I'm here for them. Why give talks and workshops on systemd? [00:07:13] Jeremy: In previous years, you've given conference talks and, and things like that related to systemd. And I wonder for, for both of you where, where the, the interests came from, where this is something that you feel strongly enough about that you wanna give talks about it. Because it's like, a lot of times when people give a conference talk, it's about, like new front end technology or some, you know, new shiny thing. Whereas systemd is like, it's like very valuable, but it's something that I feel like a lot of people don't think about. And so I'm just kind of curious where the interest came for, for both of you. [00:07:52] Anita: I think I just like giving talks and teaching in general. So if I have work that I found really exciting or interesting, then I'd want to like tell people about it and like teach them and like show them something cool. 
I think systemd is kind of a really good topic in that case because a lot of people want to learn more about it. Today there's like lots of new developments going on in systemd. So there's like a lot of basic stuff that you can learn, but also a lot of new advanced topics that are changing every year as well. Aside from that, there's also like more generally applicable things. Like everyone wants to know how to debug something if you're like a software engineer or developer or even a sysadmin. Um, so last year I did a debugging talk. There's a lot of overlap I'd say. How about you Alvaro? [00:08:48] Alvaro: For me, it, my interest in systemd started in, back when I was working on Instagram, we needed to migrate from CentOS 6 to CentOS 7. And that was the transition where you would have like a random init system to systemd, right? So we needed to migrate all of our scripts from like shell script to whatever shell script is going to interact with systemd. And that's when I was like, I don't like this. So I also have a thing where if I find something that doesn't have a Python API for it, I go and create a Python API. So I, I create pystemd like during that time. And I guess for me, the first reaction was when I was digging up systemd was like, whoa, can systemd do that? Like, like really, like I can like manage, network firewalls, right? Can I, can I stop my service from actually accessing the internet without having to deal with iptables at the time? So that's kind of like the feeling that I wanted to show people when I, when we do these these talks and, and these workshops, right? It's why like most of our talks, eh, have live demos in them because we do want to show people like, Hey, like, this is real. You can use it. [00:09:55] Jeremy: I don't know if this was a conscious decision on your part, but the thing about things like systemd is they, they feel like more foundational things that don't change that quickly. 
Like if you look at front end development, for example, at Meta you've got React, and that ecosystem changes so often that it's like there's always this new thing, you learn the way to do it and then it changes, right? Whereas I feel like when you're in the Linux user space and you're with systemd, like they're adding new things, but the, the foundations kind of stay the same. I'm not sure if that sounds accurate to both of you. [00:10:38] Anita: Yeah, I'd say a lot of the, there are a lot of stable building blocks in systemd, but at Meta we also have a kernel team, which is working on like new kernel features all the time. They take years possibly to adopt, but with systemd, if we're able to influence the community and like get those kernel features in earlier, then like we can start to really shape what the future of operating systems look like. So it's not, it's very like not short term, uh, work that we're doing. It's a lot of long term, uh, work. [00:11:11] Jeremy: Yeah, that's, that's interesting in that I didn't even think about the fact that you are sitting at the, the user level with systemd, but you kind of know what you want. And so if there's things that the kernel can do to support that, you're having that involvement. With the open source community, make sure that you have your, your say get put in there. Yeah. [00:11:33] Anita: Mm-hmm. [00:11:35] Alvaro: It, it goes both way, right? So one part it is like, yeah, sure, we want features and we create them. Um, and we actually wanted to those to be upstream because we like, one thing that you should, you should never do is manage internal patches for like, things like the kernel, because that's rebase hell. Um, but you also want to be like part of the community and, and, and, and get the benefit of like, being part of it. Who should care about systemd? 
[00:11:59] Jeremy: And so, like one thing you mentioned ear earlier, Alvaro, is that people will sometimes ask you, I'm running my application in, in Docker containers. Why do I care about systemd? So, so maybe you could explain like, how you would respond to that. Yeah. [00:12:17] Alvaro: Well for more, for most people who actually run their application container I'd say like, no, you probably shouldn't care. Right? Like, you're good where you are. But in general, like, like systemd is foundational in the sense that it is the first thing that your computer boots. Your computer doesn't boot off of Docker or Kubernetes or, or any like that. So like something has to run these applications. There's also like a lot of value is that not all applications exist in the vacuum. Like, uh, like let me give you an example. Like if you have a web server, when people are uploading stuff to the web server, you will upload temporary things and then you have to clean it up after a while. So you may want to take advantage of systemd timers or cron or, or whatever you want, right? While the classical container view is that your pid one of the container is the application that you're running, right? So you do want to have like this whole ecosystem. Not all companies can run on containers. Not everything can run in containers. So that's basically where all the things start to, to getting into shape. There's a lot of value in understanding how programs actually like exist, right? With the thing that I told you at the beginning of how an idea becomes a program understanding like, like you hit, you are in your bash, right? And you hit ls Star full enter, right? What happened in your machine? Understanding all the things, uh, there is a lot of value and understanding how systemd works. It's, it, it provides, uh, like that knowledge for you. 
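As a rough illustration of the fork and exec step Alvaro keeps coming back to, here is a minimal Python sketch of what a shell does, at the most basic level, when you type a command and hit enter. The helper name `run_command` is made up for this example, and the sketch assumes a Unix-like system. A real shell (and systemd itself) does far more, such as glob expansion, signal handling, and job control, none of which is shown here.

```python
import os


def run_command(argv):
    """Fork a child, exec argv in it, and return the child's exit code.

    Hypothetical helper for illustration only; Unix-like systems only.
    """
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            # exec failed (for example, command not found).
            # 127 mirrors the usual shell convention for that case.
            os._exit(127)
    # Parent: wait for the child, like a shell does for a foreground command.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    # Child was killed by a signal; report it as a negative number.
    return -os.WTERMSIG(status)
```

Calling `run_command(["ls", "-l"])` from a script would list the current directory and return the exit code of `ls`. The parent in this sketch plays the role a service manager plays for its services: it is the process at the top of the tree that started the child and learns how it ended.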
[00:13:39] Jeremy: So for the average engineer at Meta who is relying on your team to deploy their, their code, I guess, if that's the right term, do you think that they're ever needing to think about systemd or is that kind of more like the responsibility of your team and they're just worried about like, I put my thing into my container and I don't, I don't worry about it. [00:14:04] Anita: I think there's like a whole level of the stack that ideally should not even care or know that we're running systemd below them. I think that's, say we're doing our job well, cuz then the abstraction is good enough that they don't have to worry about it. But there's like a whole class of engineers below that that have to, you know, support the systems that run on bare metal and infrastructure and make it happen. And those are the people who really care about what we're putting in systemd or like what the corner cases are and things like that. [00:14:37] Jeremy: Yeah, that, that makes sense. I mean, one of the talks that was at SCaLE was, uh, Bryan Cantrill um, he gave a talk about the forgotten operator, and he was talking about how people forget that there are actual servers behind all the things we're deploying to, right? [00:14:55] Anita: Mm-hmm. [00:14:55] Jeremy: There is a person that you're racking the machines and plugging the power, and like, even though there's all these abstractions in front, that still exists. And so it sounds like things happening at the kernel level and the Linux user space and systemd that's also true because all this infrastructure that people are using to deploy their software on your team is the one who has to keep that running and to keep that running, they need to understand, uh, systemd and, and all these foundational Linux pieces. Yeah. [00:15:27] Anita: Mm-hmm. Yeah. [00:15:29] Alvaro: Like with that said um, I, and maybe it's because I'm very close to to, to the source. 
Um, and, and you know, like, like I said, like when, when all your tool is a hammer, everything looks like a nail? Well, that hammer for me, a lot of the times it is like even like cgroups or, or namespaces or even like systemd itself, right? there is a lot of times where, um, like for instance, a few years ago we have not, like, like last year or something, uh, we had an application that was very was very hard to load, right? It used a lot of memory. And so we start with, with a model where we would load like a, like a parent process and then child process would deal with, with, um, with the actual work of the thing, the classical model of our server. Now, the thing is that each of the sub process that would run would need to run, uh, on a separate set of privileges, right? So it would really need to run as different users. And that was like very easy to do. But now we actually wanted to some process to run with a, with only view of the file system while the parent process actually doesn't have to do that, right? Uh, or we want to limit the amount of CPU that a child process would use. So like all of these things, we were able like to, to swap it out uh, with using like systemd and, and, uh, like, like a good, Strategy for like, you create a process, you create a new cgroup, you put that into the cgroup, you create the namespace, uh, you add this process into that namespace, and then you have like all this architecture, and it's pretty free because forking it's free in general. [00:17:01] Anita: Actually, Alvaro's comment reminded me of like why we even ended up building the systemd team in the first place. It's kind of like if we have all these teams trying to touch cgroups on their own or like manage processes on their own, they're all gonna do it a different way and not, all of them will be ideal or like, to put it bluntly, I guess, we're really aiming to try and provide like a unified, really good foundational experience, for the layers above us. 
And so, systemd and the other things that go into the operating system are a step to get there. What are cgroups and namespaces? [00:17:40] Jeremy: And so for someone who's not familiar with the concept of cgroups or of namespaces, could you kind of give like a brief description? [00:17:50] Anita: So namespaces are, uh, we're talking about the kernel feature where, um, there are different ways to isolate, uh, different resources to the process or like, so that they have their own view of certain things, the network or, the processes and things like that. Um, and cgroups stand for control groups. It's, at Meta we only use cgroups v2 which is a way to organize your processes into, kind of like a directory view. But processes will be grouped into different, folders, shall you say, but that allows you to, uh, manage the resources between different groups of processes, which is how systemd does its services. [00:18:33] Alvaro: So a, a control group will allow you to impose restrictions on how each system uses the resources, right? So with a cgroup, you can say, only use 20% of cpu, and the, and the kernel will take care of that. Uh, while namespace it is basically how you view the system around you. So like your mount directory like, like where does your home points to? That's, I would say it's more on the namespace side of things. So one is the view then one is the actual, the restrictions. And like Anita said, like systemd does a very clever thing. It doesn't have to, it's not why cgroups exist, but every time that you start a systemd service, systemd will create a cgroup for that service and will put every process in that cgroup, even though all cgroups would end up being the same, for instance. But eh, you can now like have a consolidated list of what process belong to a service. So a simple question like, like what services has my Apache web service started? That shows you how old I am. 
But yeah, you can answer that now because you just look at the cgroup, you don't look at the process tree. [00:19:42] Jeremy: So it, it sounds like the, the namespacing is maybe more for the purposes of security, like you said, giving you a certain view of your, your system. and the cgroups are more for restricting resources, but also, like you said, being able to see what are all the processes, um, are associated. Um, so that you, you don't have a process that spins up other processes and then you don't know who owns those, and then you don't know how to shut 'em all down. That, that takes care of that for you. [00:20:17] Alvaro: So I, I always reluctant to use the word security or privacy. I would like to use the word isolation. Yeah. And then if people want to impose the idea of security and privacy to those, that's fine, but it's, but it's mostly about isolation. [00:20:32] Anita: Yeah. Namespaces are what back all the container technologies are. Anytime you run things in a container, it's probably using some kind of name spacing. But yeah, you, you kind of hit the nail in the head. Isolation versus like resource control [00:20:46] Alvaro: As Anita just said that's what fits on containers, uh, namespaces and cgroup like a big mix of those. But that doesn't mean that the only reason why those things exist are for containers. You can take advantage of those technologies without actually having to think of a container. systemd timers vs cron [00:21:04] Jeremy: Something you had mentioned a little bit earlier is, is how systemd has other features and one of them was, was timers. And I was kind of curious, cuz you said you could, you wanna schedule a job, you can run it using cron or you can run it using systemd timers. And it, I feel like whenever I see people scheduling jobs, they're always talking about cron but, but not so much about systemd timers. So I was curious if you had any thoughts on that. [00:21:32] Anita: I don't know. 
I feel like it's used pretty interchangeably these days. Um, like even when people say cron they're actually running a systemd timer with the cron format, for their time. [00:21:46] Alvaro: So the, the advantage of of systemd timers over cron is, is basically two, right? The first one it is that, you get more control on the time, right? So you have monotonic and absolute times, right? Which is basically like, you can say like this, start five minutes after the previous run. Or you can say this, start after five minutes after the boot, right? So those are two type of time, that is the first one, uh, which may be irrelevant for most people, but that's it. Uh, the other one is that you actually have advantage over the, you take full advantage of systemd, right? In cron you say run this process, right? And how that process run, it's basically controlled by the process itself, right? So if you, uh, like if the crontab is for the user, that's good for you, but if you want to like nice it or make it use less cpu, that's what it is. Well, with systemd you say, This cron will start the service and the service, you take full fledged advantage of all the things a service can do. [00:22:45] Jeremy: From what I could tell, looking at the, the timers api, it, it felt like it would be a lot easier to kind of see when things ran, get, you know, get a log of, I ran this time job and it, it failed. Um, it seemed like systemd had a lot more kind of built in to, to kind of look into that. But, uh, yeah, like Anita was saying, like when you, you hear kind of cron all the time, but like you said, maybe it's, maybe they're not actually using cron all the time. They're just saying cron [00:23:18] Alvaro: Well, I would say this for cron like the, the time, the time, uh, syntax for it, it's pretty, it's pretty easy to understand, even though I never remember where, I remember where weekday is, right? The fourth, which one is which? [00:23:32] Jeremy: I, I'm with Anita. 
I need to look it up whenever I'm gonna use it. (laughs) [00:23:36] Anita: Yeah. I use a cron translator when I have to use cron format. [00:23:41] Alvaro: This is like, like a flags to tar, right? Like, I never remember which, which flags to put. [00:23:48] Anita: Yeah, that's true. [00:23:50] Alvaro: We didn't talk about this, we haven't talked about systemd-run, but one of the advantages of the, one of the advantages of using timers is that you can schedule them on demand, right? So like cron if you wanna schedule something over time, you need to modify the cron the cron file. Uh, and that's, it's problem right? With systemd, you can have like ephemeral units and so you can say like, just for now, go and run this process five hours from now. Like, and after that, just forget about it. [00:24:21] Jeremy: Yeah, the, during the workshop you mentioned systemd-run and I hadn't even heard of it. And after I saw that I was like, wow, this, this could be really useful. [00:24:32] Alvaro: It is quite useful. How have things changed at meta? [00:24:34] Jeremy: One of the things you had mentioned, I, I guess you've, you've been at Meta for, for quite a while and you were talking about how you started with having all these scripts you were running on CentOS 6 and getting off of that to something more standard. I wonder if you could speak a little bit to that, that process. Like what did things look like then and, and how have they they changed over the years? [00:25:01] Alvaro: I would say the following thing, right? Like Anita said, like for most engineers, the day to day of things don't really change that much, because this is foundational things, right? So if you have to fundamentally change the way that you run applications every couple of years, then you waste a lot of time, right? It's not the same as you say, like react where, or, or in the old days, angular where angular one, angular two, angular three, and then it's gone, right? 
Like, so, so I, I would say, like, for the average engineer things don't change that much. Uh, for the other type of engineers, like, like us, who really care about, like, how things run, like having a, an API where you can like query the state of your service. Like asking, like, is my service running, with an API that returns true or false, that is actually like a boolean value that you can, like, transfer into your application, uh, that, that helps a lot on, on distributed systems. A lot of like our container infrastructure that we use internally at Meta is based on a lot of these ideas and technologies. [00:26:05] Anita: Yeah, thinking back to the CentOS 6 to 7 migration, I wasn't on, like, any operating systems team at the time, but I was working with them and I also was on a team that had to migrate, figure out how to migrate our scripts and things over. So the one thing that did make it easy is that the OS team, uh, we deploy all our things using Chef. Maybe you've heard like Puppet and Ansible, that's our version, the Open Source Chef code. Um, and they wrote some really good documentation on how to migrate from Runit, which is what we were using before, to systemd. It was a very large scale effort across multiple teams to kind of make sure their stuff works, do the OS upgrade and then get used to using systemd. [00:26:54] Jeremy: And so the, the team who is performing this migration, that's not the product team. That would be the, is it production engineering? Is that, is that what you called that? [00:27:09] Alvaro: So, so I was at the other side of, of that, of that table where I, the same as Anita, we were doing the migration. How most things work at Facebook is that it's a combination of the team that is responsible for the technology and the teams who use the technology. Right. So we are a company, so we can, like, move together. It's the same thing when you upgrade kernels.
Most of the time the kernel team will do the effort to upgrade the kernels, and when they hit a roadblock or something, they will call for the owner of the service and the owner of the service can help debug. Uh, for the case of CentOS 6 and CentOS 7, I was the PE at Instagram. PE stands for Production Engineer. I was the PE at Instagram who did most of the migration of our fleet. So I, I rewrote most of the things because I understand how our things work, and the OS team provided like the support to understanding, like, when can I use some things, when can I not use other things. They were the equivalent of ChatGPT in those days, right? I would just ask them how to do stuff. They would gimme recipes. So, so it's kind of like, like a mixed, uh, work, uh, between those two teams. Uh, Anita, maybe you can talk a little bit about when you were upgrading the version of systemd and you found a bug? [00:28:23] Anita: Oh, the, like regular systemd upgrades nowadays? I, I'd say it's a lot easier these days. I mean, at the time when we did the CentOS 6 to 7 migration, it was like, our fleet was a lot more fragmented. I'd say nowadays it's a lot more homogenous, which makes, which makes it easier. Yeah, in the early versions there were some kind of obscure, like, interactions with the kernel or like, um, we, we make pretty heavy use of systemd to run our container system. So, uh, if we run into any corner cases, um, like pretty obscure stuff sometimes, because we make pretty heavy use of the resource control properties. Usually those end up on the GitHub tracker, things like that. [00:29:13] Alvaro: That's the side effect of hiring very smart people. They do very smart things that are very hard to understand.
(laughs) [00:29:21] Jeremy: That's kind of an interesting point about you, you saying you're using these, these features, you know, of the kernel very heavily because, you're kind of running your own infrastructure, I think even your own data centers, so you're kind of forced to go to this level, it sounds like just because of the sheer number of services you're running and the fact that like, you have to find a way to pack 'em all onto the same machine. Does that, does that sound right? [00:29:54] Anita: Yeah, I'd say at, at our scale, like it's more cost effective to actually own the servers and run everything on it ourselves versus like, you know, leasing from, uh, AWS or something, which we've also explored in the past. But that also means we need more engineers to build and run things on our servers. [00:30:16] Jeremy: Yeah. So the, the distinction between, let's say you're a, a small company or a mid-size company and you pay AWS or, or Google to, to do your hosting for you, then you may not necessarily get exposed to a lot of the, the kernel level problems or even the Linux user space problems because you're, you're working at a higher level and that's why you don't necessarily encounter those kinds of things. [00:30:46] Anita: I'd say not, not necessarily. I think, once you get even like slightly lower in the stack where you're just like on your own server, then you will want to start really looking into like what systemd's doing, how does it interact with other, uh, services, um, on your server, and how can you like connect these different features together? [00:31:08] Alvaro: One of the things that every developer has to worry about is logs, right? And that, and that's the first time that you actually start interacting with systemd, right? So you have to understand, like, maybe it's not just tail /var/log/foo, right? Maybe it's just journalctl and it's like, what? But yeah. [00:31:32] Jeremy: Yeah.
That's a good point too about whenever you're working with the operating system, like you're deploying onto a Linux machine. Regardless of the distribution, if you're the person who's responsible for that, you, you need to know this stuff. Right. Otherwise it's kind of like, you're just putting stuff out there and hoping for the best. Yeah. [00:31:54] Alvaro: Yeah. There, there's also another thing that, I dunno if I've said this before, but, a lot of the times you don't have to know these technologies, but knowing them will help you do your work better. [00:32:05] Jeremy: Yeah, totally. I mean, I think that that applies to pretty much anything in, in development, right? I, I've heard often that some people will say, you take the level that you work at currently and then kind of just go down one level. Right. And then, so you can kind of see what's underneath that. And you don't necessarily need to keep digging, cuz eventually if you keep digging, you're getting into, you know, machine instructions and whatnot. But, um, yeah, maybe just one level is, is good to, to give you a better sense of what's happening. Production engineers need to go lower in the stack to be able to debug applications [00:32:36] Alvaro: Um, every time that somebody asks me, like, what is the difference between a PE and a SWE, uh, software engineer, production engineer, typical confusion, uh, one of the biggest differences that I, that I say is that a PE tends to ask a lot of questions going, the same thing that you're saying, trying to go down the stack, right? And I always ask the following question, eh, do you know how time.sleep is implemented? Right? Do you like, like if you, if you were to see time.sleep in your Python program, like do you actually know what it is doing under the hood, right? Is it a while True loop checking the time? Is it doing a signal interrupt? Is it doing a select on a file descriptor with a timeout? Like what is it doing?
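To make two of the options Alvaro lists concrete, here is a rough sketch comparing a busy `while` loop with a blocking `select` on a file descriptor with a timeout. The function names (`busy_sleep`, `blocking_sleep`, `cpu_seconds`) are invented for this illustration, the pipe trick assumes a Unix-like system, and none of this claims to be how CPython actually implements `time.sleep`. It only shows why the choice matters for CPU usage.

```python
import os
import select
import time


def busy_sleep(seconds: float) -> None:
    """Spin until the deadline passes; burns CPU the whole time."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        pass


def blocking_sleep(seconds: float) -> None:
    """Block in select() on a pipe nobody writes to, until the timeout expires."""
    read_fd, write_fd = os.pipe()
    try:
        # Nothing ever arrives on read_fd, so the kernel parks this
        # process until the timeout runs out.
        select.select([read_fd], [], [], seconds)
    finally:
        os.close(read_fd)
        os.close(write_fd)


def cpu_seconds(sleep_fn, seconds: float) -> float:
    """CPU time this process consumed while sleep_fn(seconds) ran."""
    start = time.process_time()
    sleep_fn(seconds)
    return time.process_time() - start


if __name__ == "__main__":
    print("busy loop CPU seconds:      ", cpu_seconds(busy_sleep, 0.2))
    print("blocking select CPU seconds:", cpu_seconds(blocking_sleep, 0.2))
```

On a typical Linux machine the busy loop reports CPU time close to the full sleep duration, while the blocking version reports close to zero, which is the reasoning behind ruling a sleep call out as the cause of a CPU spike.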
Would you be able to implement it? And the reason why I say this is because, like, when you're debugging an application, like something's using your CPU, right? And then you see that line in your code, you can debug every single line of your code. But also there's a lot of value to say, like, no, time.sleep doesn't cause CPU to spike. Right. Because it's implemented in a way that it would not be possible to do that. Meta's linux user space team [00:33:39] Jeremy: Another thing that I think might be kind of interesting to talk about is, so Meta has this Linux user space team. And I, I wonder like including your role in it, but just as a whole, like what does that actually mean day to day? Like, what are the kinds of problems people are facing that, a user space team would be handling? [00:34:04] Anita: Hmm. It's kind of large cuz now the team's grown out to encompass a few other things as well. But I'll focus on the Linux user space part. The team started off, on the software engineering side, as the systemd developer team. So our job was really to contribute to the community, and both, you know, help with problems and bugs that show up in upstream, um, while also bringing in new features that we think would be useful both at Meta and to, like, folks in the Linux community as a whole. So we still play a heavy role in systemd. We also support it, uh, within the fleet, like we roll out new releases and things like that. But we're also working on a few other projects in user space. Um, bpfilter is one of them, which is, uh, how can we convert, like, iptables and network filtering into BPF programs. Um, on the production engineering side, they focus a lot on the community engagements.
So in addition to supporting CentOS they also handle, or they like support several packages in Fedora, Debian and other distributions, really figuring out how we can, be a better member of the open source community, and, you know, make connections there and things like that. [00:35:30] Jeremy: And, and what was your, your process for getting in involved with this team? Because it sounded like maybe it either didn't exist at the start, or it was really small and, and now it's really, really grown. [00:35:44] Anita: So I was kind of the first member of like the systemd team, if you would call it that. Um, it spun out of containers. So my manager at the time, who's now my director, was he kind of made a call out on workplace looking for people who'd be willing to, contribute to systemd. He was, supporting the containers team at the time who after the CentOS 7 migration, they realized the potential that systemd could have, making their jobs a lot easier when it came to developing the container backend. and so along with that, they also needed someone to help, you know, fix bugs, put in new features and things that would, tie into the goals of the containers team. Um, and eventually now our host management team, I was the first person who reached out to him and said, Hey, I wanna give this a try. I was on the security team at the time and I always had dreams of going back into like, operating systems development and getting better at it. So yeah, that's kind of how I ended up in this space. A few years later, he decided, Hey, we should build a team and you should like hire some people who will also do this with you and increase our investments in systemd. so that's how we kind of built out the Linux user space team to encompass systemd and more like operating system, projects. 
Working on the internal security team vs the linux userspace team [00:37:12] Jeremy: And so when you were working on the security team before, was that on software internal to meta or were you also involved with, you know, the open source, user space side as well? [00:37:24] Anita: That was all internal at the time. Which was kind of a regret because there was a lot of stuff that I would've liked to talk about externally. But I think, moving to Linux user space made me realize like, oh, there's so much more potential in open source projects, in security, which is still like very closed source from our side. [00:37:48] Jeremy: And, and so like in your experience, what have been some of the big differences? I mean, definitely getting to talk about it is a big one. but like in terms of your day-to-day, what are the big differences between working on something internal versus something that that's open source? [00:38:04] Anita: I have to talk more with external folks. we're, pretty regular members of like the systemd like conclave sync that we have with the other upstream maintainers. Um, Oh yeah. There's a lot more like cross company or an external open source community building that we have to do. it kind of puts into perspective like how we manage our time and also our relationships versus like internally, like everyone you work with works at Meta. we kind of have, uh, some shared leadership at the top. it is a little faster to turn around, um, because, you know, you can just ping people on work chat. But the, all of the systems there are closed source. So, um, there's not like this swath of people outside that you can ask about when it comes to open source things. [00:38:58] Jeremy: You can't, can't look in, discord or whatever for questions about, internal meta infrastructure to other people. It's gotta be. all in the same place. Yeah. [00:39:10] Anita: Yeah. 
And I'd say with like the open source projects, there's a lot of potential to tap into, expertise and talent that just doesn't exist internally. That's what I found really valuable, cuz people have really great ideas outside as well. Um, and we should like, listen to them and figure out how to build that into their systems and also ours. Alvaro's work at meta [00:39:31] Jeremy: And, Alvaro, I don't know when you first started, was that on internal, infrastructure and tooling as well? [00:39:39] Alvaro: Yeah, so, um, my path is different than Anita's and actually my path and Anita's don't share any common edges. So I, I don't work on the user space or the Linux kernel or anything. I always work in teams adjacent to it. Uh, but it's always been very interesting to know these technologies, right? So I started working on Instagram and then I did a lot of the work in containers, in migrations, where, where we built pystemd, and also like getting to know more about those technologies. We did, uh, a small pilot on using casync, which is a very old tool that like, it's only for the fans, (laughs) it's still in the systemd repository, I dunno if that's used or anything, but it was like a very cool idea of how to distribute images. Uh, and in Instagram we do very fast deployments. So we deploy, or back then we used to deploy the source code of Instagram every seven minutes, right? So every seven minutes, every time that a developer did a commit to master, uh, we pushed that into production in less than an hour and we did that every seven minutes. So we were like planning to, to use those technologies for that. Um, and then I moved to another team inside of Meta, which is called Cloud Foundation, where we do a lot of like cloud infrastructure, uh, like public cloud. Uh, that's the area that is very much not talked much about. But I keep like contributing to, to like this world. I never really worked on, on, on those teams inside of Meta.
[00:41:11] Jeremy: So I guess it's your, your team is responsible for working with the engineers who work on product to be able to take their code and, and deploy it. And it's kind of like you work in combination with the user space team or the systemd team to make sure that what you're doing can be supported by them. Is that kind of an accurate description? [00:41:35] Alvaro: Yeah, that's, that's, that's definitely not an exhaustive description, but yeah, that's the, we, we, we do that. Public cloud at meta [00:41:42] Jeremy: It's interesting that you're, you're talking about public cloud now. So when you move to public cloud, are you using VMs kind of like you would in a data center, or is it, you're actually looking at the more managed services and things like that? [00:41:57] Alvaro: So I'm gonna take a small detour and say, like, something that is funny. When I got hired by Facebook, we were working on Instagram. So Instagram was just an acquisition for, for, for Meta, right. And Instagram ran on AWS. So I was on the original team who were moving stuff from AWS into the internal data centers at Meta. On the team that I work on now, uh, we work to support workloads that cannot run on Meta infrastructure either for legal reasons, or for, for practical reasons. Right, because we don't have the hardware, uh, capability, or legal reasons, because the government asks us, like, this cannot be on, on your data center, or security, right? We don't wanna run this, this binary that we don't understand on our network. We do want to work in isolation. And the same thing that Anita was saying, where their team is building the common ways of using these tools, like systemd, and user space, we do the same thing, but for using cloud technologies. So in a way that is more similar to Meta. So that's the detour. Now, to answer your actual question, uh, we do a potpourri of things, right?
So since we manage infrastructure and then teams deploy their code, they are better suited to know how their code, gets to run. Uh, with that said, we do have our preferred ways of how you would run stuff. And it's a combination of user containers, uh, open source containers, and also like VMs. There's a big difference between VMs at meta and in public cloud [00:43:23] Jeremy: So it, it sounds like in this case, you're, you're still using VMs even in public cloud, so the way that you do deployments, the location is different, but the actual software and infrastructure that you're running is, is similar. [00:43:39] Alvaro: So there's, there's a lot of difference between the two things, right. So, the uniformity of hardware at Facebook, or our data centers, makes deploying things very simple, right? While in, in the cloud, first, you don't get that uniformity because everybody, like, builds their AMIs as, as they want to build them. But also, like, at Meta, we use one operating system. In the cloud, you are a little bit more free of what you want. And one of the reasons why you want to go to the cloud is because you can run stuff not the way that Meta will run it. Right? So, so even though we have some things that are similar, it's not as simple as, like, oh, just change your deployment from, like, this data center to, like, whatever us-east-1 thing you would run. [00:44:28] Jeremy: Can, can you give an example of something where you wouldn't be able to run it on Meta's, image that they would choose to go to public cloud to run a different image for? [00:44:41] Alvaro: So, um, so in, in general, like if the government asks us, like, this is not necessarily like, like the US government, right? So, and like if the government asks us like, hey, like you need to keep this transaction on, on our territory, right? For logs, for all the reasons, for whatever, right? Like, and, and we wanted to be in the place, we would have to comply.
And that's where we will probably use this, this kind of technology. Security is another one that is pretty good. And the other one is, in general, like, like, uh, like disaster recovery, right? If, if Meta is down in a way where we cannot communicate with each other using Meta's technologies, right? Like you would need to have like a bootstrap point. [00:45:23] Jeremy: Is, is it the case where you are not able to put, uh, Meta's image up into public cloud? Because the examples you gave were more about location, right? Where you're saying we need to host in public cloud because it needs to be in this country, but then I think you were also saying the, the actual images you would use on AWS, right, would be, I don't know, maybe you'd be using Amazon Linux or maybe you'd be using a different OS entirely. And is that mainly because you're just not able to deploy the same images you have, uh, in-house? [00:46:03] Alvaro: So in, in, in general, uh, this is kind of like very hard to to explain, but, but, uh, if, if we would have to deploy code to a, machine and that machine would, would, would be accessed by people who are not like Meta employees and we have no way of getting them to sign NDAs then we would not deploy Meta code onto that machine. Uh, because that's... Sorry. No, not PII, PII is personal information. I was, uh, IP, sorry, that's, that's the word. Yeah. Yeah. [00:46:31] Jeremy: So, okay. So if there's, so if you're in public cloud, there's certain things that you just won't put there just because those are only allowed to run on Meta's own infrastructure. [00:46:44] Alvaro: Yeah. Meta's bootcamp [00:46:44] Jeremy: Earlier you were talking about Instagram was an acquisition and they were in AWS. Were, were you there at the time or you joined, after? [00:46:54] Alvaro: No, I joined. I joined after I joined to, to Meta.
The way that Meta does hiring, at least for my area, is that you get hired as a production engineer, but you don't get assigned to a team. So you go through a process called boot camp where you get to try different teams and figure out what things you like. I try a couple of different teams, turns out that I like it to work at the Instagram. [00:47:15] Jeremy: And so at that time they were already running on Facebook's internal infrastructure and they had migrated off of AWS [00:47:24] Alvaro: We were on the process of finishing that migration. [00:47:28] Jeremy: So by the time you were there, yeah. Basically get, getting everything out of AWS and then into meta's internal. [00:47:35] Alvaro: Yeah. And, and, and everything is, is a very hard terms to, to define. Uh, I would say like, like most of all, like the bulk of things we were putting it in inside, like, at least what we call our Django servers. Like they were all just moving into internal infrastructure. How Anita started [00:47:52] Jeremy: This kind of touches on the, the whole boot camp thing, but, Anita, I saw that you, you interned at Facebook and then you took a position there, when you ended up taking a position, I'm kind of curious what were the different projects you looked at or, or how did you end up settling on the one you chose? [00:48:11] Anita: Yeah, I interned, um, and I joined straight out of university. I went into bootcamp similar to Alvaro and I got the chance to explore several different teams. I knew I was never gonna do UI that was just like not my thing. Um, so I focused, uh, my search on all like backend infrastructure teams. Um, obviously security, uh, was one of them because that's the team I was in interning on. Um, I also explored, the kind of testing infra team. we call it sandcastle. It runs our internal like unit tests and things. and I also explored one of the, ads infrastructure backend teams. 
So it was mainly just, you know, getting to know the people, um, seeing which projects appealed to me the most. Um, and then, you know, I kind of chose based on that. I, I think I've always chosen my work based on how interesting the project sounded, uh, which has worked out in my favor as far as I could tell. How Alvaro started [00:49:14] Jeremy: How, how about, you Alvaro what were the, the different projects you looked at when you first started? [00:49:20] Alvaro: So, as a PE you do have a more restrictive, uh, number of teams that you can, that you can join. Uh, like I don't get an option to work in UI. Not that I wanted, but, (laughs) I, I, it's, it's so long ago. Uh, I remember I did look at, um, at MySQL as a team, uh, that was also one of the cool teams. Uh, we had at that time, uh, a distributed, uh, engine, uh, to, to run work, like Celery or something like that, but internal. I really like the concept of distributed, like, workloads, um, and I can't remember. I think I did bootcamp with the Messenger team, that I, I ended up having like a good relationship with their TL, their tech lead, uh, but never actually like joined that team. And I believe because she gave me a, a PHP task and it was like, no, I'm not down for doing PHP. [00:50:20] Jeremy: Only Python. Huh? [00:50:21] Alvaro: Exactly. Python. Python. Because it's just above C level. pystemd [00:50:27] Jeremy: I mean related to that, you, you started the, the pystemd project. And so I wonder if you could explain what the context behind that was. Like what sparked, I need to make this, this library? [00:50:41] Alvaro: So it's, it's a confluence of two things. The first one is, like, again, if I see something that doesn't have a Python API for it, I feel the strong urge to create one. I have done this a couple of times, mostly internally, but also externally. That was one. And while we were doing the migration, I, I, I honestly, I really hate text processing.
So the classical thing was like, if you wanna know if your application's running, you do systemctl, you shell out to systemctl status, then parse the output, find the, find the status column. Okay. And I didn't like that. And I started reading about, like, systemd, uh, and I got in contact with the, or I saw, like, the D-Bus implementation of systemd. And that was, I thought that was a very interesting idea, how that opened all the doors. Right? Uh, so I got a demo working like in a couple of hours. And then I said like, okay, now how do we make this pythonic? And then I created that and I just created it, again, just for migrating Instagram. That was the idea. Then, uh, one of the team members who works with Anita, but also one who doesn't work with us anymore, they saw this and said like, hey, like this looks like a good thing to open source. So it was like, sure, like I'm happy to open source it. So we open sourced it and then we went to All Systems Go, which is a very nice, interesting conference that happens in Berlin where, like, all the heads for, like, user space get together. And, and I talked about it and people seemed to like it, and that's the story of that. [00:52:15] Jeremy: And so this was replacing, I guess, like you were saying, a lot of people were shelling out and running cat commands and things like that from their Python scripts. And this was meant to be a layer on top of that. [00:52:30] Alvaro: Yes. So it, it does a couple of things. So first of all, inspecting the processes or, or like the services, getting that information out. That's one of the main usages. But also like starting or stopping or like doing all the operations that you want to do. Uh, knowing the state of, of, of services, uh, that's also another thing that people take advantage of. The other thing that people take advantage of is to modify the status of the, of the processes at runtime, like changing properties, like increasing or decreasing the CPU threshold.
because systemd provides a very nice API or interface to modify the cgroups properties that otherwise you would need to kind of understand the tree structure that, uh, that, that whatever. So that's what people tend to use this for, mostly internally. [00:53:23] Jeremy: And so it, it sounds like at least on the production engineering side, you're primarily working in, in Python. Is that something that's, the teams before were using Python and so everybody just continues using Python? Or is there kind of like more structure or thought put into that? [00:53:41] Alvaro: I would say the following thing about it, um, like in in general, uh, there's, there's not a direction on which language you should use. It's pretty natural which language you should use, but with that said, there's not a potpourri of languages inside of, of Meta. Most teams use C, C++, Python and Rust and that's it. There's Go, that appears every once in a while there. Sorry, I should not talk about this like, like, or talk like this about this, but eh, there are teams who are actually like very fond of Go and they use it and they contribute a lot to that space. It's just not that much, uh, used internally. I have always gravitated towards Python. That has been the language that taught me how to do real coding, and that's the language that got me a job at Meta. So I tend to work mostly on that. Yeah. [00:54:31] Anita: Hey, you forgot Hack, Alvaro. Our web services. (laughs) [00:54:37] Alvaro: Yes. Yes. Uh, so I would say like, the most used language at Meta is actually PHP, it's just like used by, by one particular product. That, that is the Facebook product. Yes. So our, our entire web interface, eh, or web stack uses a combination of Hack, which is a compiled PHP, which is better than uncompiled PHP, also known as vanilla PHP. Uh, there is a lot of like GraphQL, React, and, I think that's it.
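Going back to Alvaro's point about hating text processing: the sketch below is not pystemd (which, as he describes, talks to systemd over D-Bus). It is only an illustration of why structured properties are easier to consume than scraping the human-oriented output of `systemctl status`. It uses `systemctl show --property=...`, which prints `KEY=VALUE` lines. The helper names `parse_show_output` and `is_active` are made up for this example, and `is_active` assumes a host where `systemctl` is available.

```python
import subprocess


def parse_show_output(text: str) -> dict[str, str]:
    """Parse the KEY=VALUE lines printed by `systemctl show`."""
    properties = {}
    for line in text.splitlines():
        # Split on the first "=" only, since values may contain "=" too.
        key, sep, value = line.partition("=")
        if sep:
            properties[key] = value
    return properties


def is_active(unit: str) -> bool:
    """Return True if systemd reports the unit as active (needs systemctl)."""
    result = subprocess.run(
        ["systemctl", "show", "--property=ActiveState", unit],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_show_output(result.stdout).get("ActiveState") == "active"


if __name__ == "__main__":
    sample = "ActiveState=active\nSubState=running\n"
    print(parse_show_output(sample))
```

Even this still shells out and parses text. The appeal of a D-Bus based library is that the "is my service running" question comes back as a typed value with no parsing step at all.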
[00:55:07] Anita: Infrastructure is largely like c plus plus Python, and now Rust is getting a huge following as well. [00:55:15] Alvaro: Yeah. Like, like Rust. Rust is, I I would say it's the fastest growing language inside, inside of Meta. And the thing is that there is also what you call like the bootstrap problem. Um, there's like today, if I wanted do my python program and I have a function that fails one every three times, I can add a decorator that is retry, that retries every time that something fails for a timeout, right? And that's built in and it's there used and it's documented. And I can look at source code that uses this to understand how, how works. When you start with a new language, you don't get the things. So people have to build them. So there's the bootstrap problem. [00:55:55] Jeremy: That's also an opportunity as well, right? Like if you are the ones building sort of the foundations, then you, you have an opportunity to be the ones who have the core libraries that people are, are using every day. Whereas if a language has been around a while, it's kind of, some of that stuff is already set, right? And you may or may not like the APIs, but that's what people use. So that's what we, that's what we do. One of the last things I'd kind of like to ask, so Anita, you moved into management in just the last year or two or so, and I'm kind of curious what your experience has. Been like, was that a conscious decision where you wanted to go from engineering, uh, software engineering to management? Or maybe you could talk a little bit to that. [00:56:50] Anita: Oh man, it hasn't even been a year yet. I feel like so much time has passed already. Uh, no, I never had any plans to go into management. I love being an engineer. I love being in the code. but, I'd say my, my current manager and uh, my director, you know, who hired me into the Linux user space team, kind of. 
sold me a little bit on the idea of like, hey, if you wanna keep pushing more projects, you wanna build out the team that you wanna see working on these things, um, you can consider going into management, taking it slow in what we call a TLM role, which is like a tech lead manager role, where you kind of spend some time doing development and leading the team while also supporting the engineers as a manager, doing the hiring and the relationship building and things that you do in management. So that actually worked out quite well for me, despite Alvaro shaking his head at first. I really enjoyed being able to split my time into kind of the key projects that I really wanted to work on, um, while also supporting the engineers and having them build out, um, new features in systemd and kind of getting their own foothold in the community as well. But I'd say like in the past few months, it's been pretty crazy. I probably naively thought that I'd have a little more control over, I don't know, my destiny as a manager, and that's like a hundred percent not true. (laughs) Um, you are kind of at the whims of both your engineers and also the people above you. And you kind of have to strike that balance. But, um, my favorite part is still just being able to hide the nasty stuff away from the engineers, let them focus on their work and enjoy what engineers wanna do best, which is just like coding, designing, and like, you know, doing fun, open source stuff. [00:58:56] Alvaro: I will say like, Anita may laugh at me because she's on the other side, but one thing that at least I find very cool at Meta is that managers are not seen as your boss. Right? They're still like a teammate who just basically has a different role. This is why, when you're an engineer, you can transition to be a manager and it's not considered a promotion, it's considered like a horizontal step, and vice versa, you can come back, right,
from a manager into, like, an engineer. Yeah. [00:59:25] Jeremy: That was what I would say. And, uh, I guess when you were shaking your head, I'm guessing this means you don't wanna become a manager anytime soon. [00:59:35] Alvaro: So I never closed the door on that, but I was shaking my head at the work of a TLM. Right. Uh, so in TLM, TL stands for tech lead and M stands for manager. So you're basically both, but with the time of only one. So, uh, Anita was able to pull it off. I don't think I would be able to pull off, like, double duty on that. [00:59:56] Anita: Yeah. Unfortunately I support too many people now to do the TL stuff as deeply as I used to, but I still find some time to code a little bit here and there. [01:00:09] Jeremy: So you were talking a little bit about how things have been crazy the last few months. If someone is making the transition into management, like what are the kinds of things that you would tell them to look out for or to be aware that's coming? [01:00:27] Anita: Um, before I transitioned, I talked to a lot of managers about like, oh, what was, you know, the hardest part about management. And they all have kind of their own horror story about what happened to them when they transitioned or even like, difficult things that happened to them during management. I'd say don't expect it to be easy. You're gonna make a lot of mistakes, usually on like the interpersonal relationship side, and it's really just about learning how to learn from your mistakes, pick back up and do better next time. I think, um, you know, if people like books, The Making of a Manager by Julie Zhuo. She was a designer, and also a manager, at then Facebook. She's no longer here. But she has a really good book on like what you can expect when you transition into management. The other thing I'd say is don't go into management without having a management chain that you can really trust.
I'd say that can kind of make or break your first few years as a manager, whether you'll enjoy it or not, or even like whether you'll be able to get through the hard times. [01:01:42] Jeremy: Good point. Yeah. I mean, I think whenever you take on anything new, right? Having the support of the people above you or just around you as well, that makes such a big difference, right? Even like the situation can be bad, but if everyone is supportive, then you can get through it. [01:02:02] Anita: Yeah, that's absolutely right. [01:02:04] Jeremy: I think that's a good place to wrap up unless either of you have anything else that you thought we should have talked about. So if people want to check out what you're working on, what you're up to, um, how can they find you? [01:02:20] Anita: Well, I guess we're both on Matrix now. Uh, I'm Anita Zha on Matrix, a-n-i-t-a-z-h-a. We both have Twitters as well, if you just search up our names. Nope? Yeah, you're on Twitter. Yeah. [01:02:36] Alvaro: There is an impostor with my name, right? Actually it's not an impostor. It's just me. I just never log into Twitter anymore. [01:02:40] Anita: We both have Mastodon now as well? Yes. Fosstodon. We're both frequently at conferences as well. What's coming up next? I think it's, uh, DevConf.cz in the Czech Republic, and then, uh, All Systems Go in September. [01:02:57] Alvaro: You said something in Canada? [01:03:01] Anita: Oh, yeah. LSFMM+BPF is coming up. That's more of a kernel conference, though. [01:03:09] Alvaro: An acronym that is longer than the actual word. Yes. Yeah. [01:03:12] Jeremy: That's a lot. That's a lot of letters. [01:03:14] Anita: It's a mouthful. (laughs) [01:03:18] Jeremy: That's very neat that you get to kind of go to all these different conferences and actually get to meet the people in person that are, you know, working with the same things you are and get to be in the same room.
I think that's a, that's a real privilege. Yeah. [01:03:35] Anita: Yeah, for sure. [01:03:38] Jeremy: All right. Well, Anita and Alvaro, thank you so much for chatting with me today. [01:03:43] Alvaro: Thank you for hosting. [01:03:45] Anita: Yeah. Thanks for the opportunity. This is a lot of fun.

Jun 14, 2023 • 1h 16min

David Cramer on Application Monitoring with Sentry

David Cramer, Co-founder and CTO of Sentry, discusses application monitoring with Sentry, treating performance problems as errors, identifying common problems in applications, challenges with front-end applications, and the evolution of Sentry's architecture.

Mar 2, 2023 • 1h 20min

Luca Casonato on Deno

Luca Casonato is the tech lead for Deno Deploy and a TC39 delegate. Deno is a JavaScript runtime from the original creator of NodeJS, Ryan Dahl. Topics covered: What's a JavaScript runtime How V8 is used Why Deno was created The W3C WinterCG for server-side JavaScript Why it's difficult to ship new features in Node The benefits of web standards Creating an all-inclusive toolset like Rust and Go Deno's node compatibility layer Use cases for WebAssembly Benefits and implementation of Deno Deploy Reasons to deploy on the edge What's coming next Luca Luca Casonato @lcasdev Deno Homepage Deploy Showcase Subhosting Fresh web framework The anatomy of an Isolate Cloud Deno Users Netlify Edge Functions Deno at Slack GitHub Flat Data Shopify Oxygen Other related links Cache Web API V8 (JavaScript and WebAssembly engine) TC39 (JavaScript specification group) Web-interoperable Runtimes Community Group (WinterCG) Cloudflare Workers (Deno Deploy competitor) How Cloudflare KV works CockroachDB (Distributed database) XKCD Standards Comic Transcript You can help edit this transcript on GitHub. [00:00:07] Jeremy: Today I'm talking to Luca Casonato. He's a member of the Deno Core team and a TC 39 Delegate. [00:00:06] Luca: Hey, thanks for having me. What's a runtime? [00:00:07] Jeremy: So today we're gonna talk about Deno, and on the website it says, Deno is a runtime for JavaScript and TypeScript. So I thought we could start with defining what a runtime is. [00:00:21] Luca: Yeah, that's a great question. I think this question actually comes up a lot. It's, it's like sometimes we also define Deno as a headless browser, or I don't know, a, a JavaScript script execution tool. what actually defines runtime? I, I think what makes a runtime a runtime is that it is a, it's implemented in native code. It cannot be self-hosted. Like you cannot self-host a JavaScript runtime. 
and it executes JavaScript or TypeScript or some other scripting language, without relying on, well, yeah, I guess it's the self-hosting thing. Like it's essentially a JavaScript execution engine, which is not self-hosted. So yeah, it maybe has IO bindings, but it doesn't necessarily need to. Like, maybe it allows you to read from the file system or make network calls, um, but it doesn't necessarily have to. I think the primary definition is something which can execute JavaScript without already being written in JavaScript. How V8 and JavaScript runtimes are related [00:01:20] Jeremy: And when we hear about JavaScript runtimes, whether it's Deno or Node or Bun, or anything else, we also hear about it in the context of V8. Could you explain the relationship between V8 and a JavaScript runtime? [00:01:36] Luca: Yeah. So V8 and JavaScriptCore and SpiderMonkey, these are all JavaScript engines. So these are the low level virtual machines that can execute or that can parse your JavaScript code, turn it into byte code, maybe turn it into compiled machine code, and then execute that code. But these engines do not implement any IO functions. They do not. They implement the JavaScript spec as it is written, and then they provide extension hooks for, they call these host environments, um, like environments that embed these engines, to provide custom functionalities to essentially poke out of the sandbox, out of the virtual machine. Um, and this is used in browsers. Like browsers have these engines built in. This is where they originated from. Um, and then they poke holes into this, um, sandbox virtual machine to do things like, I don't know, writing to the DOM or console logging or making fetch calls and all these kinds of things. And what a runtime essentially does, a JavaScript runtime, is it takes one of these engines and it then provides its own set of host APIs, like essentially its own set of holes
it pokes into the sandbox. And depending on what the runtime is trying to do, um, the way it'll do this is gonna be different and the sort of API that is ultimately exposed to the end user is going to be different. For example, if you compare Deno and Node, like Node is very loosey goosey about how it pokes holes into the sandbox, it sort of just pokes them everywhere. And this makes it difficult to enforce things like runtime permissions, for example. Whereas Deno is much more strict about how it, um, pokes holes into its sandbox. Like everything is either a web API or it's behind this Deno namespace, which means that it's really easy to find, um, places where you're poking out of the sandbox. And really you can also compare these to browsers. Like browsers are also JavaScript runtimes. Um, they're just not headless JavaScript runtimes, but JavaScript runtimes that also have a UI. And yeah, like there's a whole bunch of different kinds of JavaScript runtimes, and I think we're also seeing a lot more like embedded JavaScript runtimes. Like for example, if you've used React Native before, you may be using Hermes as a, um, JavaScript engine in your Android app, which is like a custom JavaScript engine written just for React Native. Um, and this also is embedded within a, like React Native runtime, which is specific to React Native. So it's also possible to have runtimes, for example, where the backing engine can be exchanged, which is kind of cool. [00:04:08] Jeremy: So it sounds like V8's role, one way to look at it is it can execute JavaScript code, but only pure functions. I suppose you [00:04:19] Luca: Pretty much. Yep.
[00:04:21] Jeremy: do anything that doesn't interact with IO. So you think about browsers, you were mentioning you need to interact with a DOM, or if you're writing a server side application, you probably need to receive or make HTTP requests, that sort of thing. And all of that is not handled by V8. That has to be handled by an external runtime. [00:04:43] Luca: Exactly. There's like some exceptions to this. For example, JavaScript technically has some IO built in within its standard library, like Math.random. Like random number generation is technically an IO operation, so technically V8 has some IO built in, right? And like getting the current date from the user, that's also technically IO. So, like there's some very limited edge cases. It's not that it's purely pure, but V8 for example, has a flag to turn it completely deterministic, which means that it really is completely pure. And this is not something which runtimes usually have. This is something like the feature of an engine because the engine is so low level, there's so little IO that it's very easy to make deterministic, whereas a runtime, being higher level, um, has IO, which is much more difficult to make deterministic. [00:05:39] Jeremy: And for things like when you're working with JavaScript, there's, uh, asynchronous programming [00:05:46] Luca: mm-hmm. Concurrent JavaScript execution [00:05:47] Jeremy: So you have concurrency and things like that. Is that a part of V8 or is that the responsibility of the runtime? [00:05:54] Luca: That's a great question. So there's multiple parts to this. There's the part, um, there's JavaScript promises, um, and sort of concurrent Java or well, yes, concurrent JavaScript execution, which is sort of handled by V8. Like in pure V8, you can create a promise, and you can execute some code within that promise.
But without IO there's actually no way to defer time, uh, which means that with pure V8, you can create a promise which executes right now, or you can create a promise that never executes, but you can't create a promise that executes in 10 seconds because there's no way to measure 10 seconds asynchronously. What runtimes do is they add something called an event loop on top of this, um, on top of the base engine, and that event loop, for example, like a very simple event loop, might have a timer in it, which every second looks at if there's a timer scheduled to run within that second. And if that timer exists, it'll go call out to V8 and say, you can now execute that promise. But V8 is still the one that's keeping track of which promises exist, and the code that is meant to be invoked when they resolve, all that kind of thing. Um, but the underlying infrastructure that actually decides which promises get resolved at what point in time, like the asynchronous IO is what this is called, this is driven by the event loop, um, which is implemented by a runtime. So Deno, for example, uses Tokio for its event loop. This is a, um, an event loop written in Rust. It's very popular in the Rust ecosystem. Um, Node uses libuv. This is a relatively popular runtime or, or event loop, um, implementation for C, uh, plus plus. And, uh, libuv was written for Node. Tokio was not written for Deno. But um, yeah, Chrome has its own event loop implementation. Bun has its own event loop implementation. [00:07:50] Jeremy: So we might go a little bit more into that later, but I think what we should probably go into now is why make Deno, because you have Node that's, uh, currently very popular. The co-creator of Deno, to my understanding, actually created Node. So maybe you could explain to our audience what was missing or what was wrong with Node, where they decided I need to create a new runtime.
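As a rough illustration of the point Luca makes above (the engine tracks promises, while the runtime's event loop decides when a timer-backed promise may resolve), here is a toy sketch. It is not how Deno, Tokio, or libuv are implemented; `ToyEventLoop`, `sleep`, `tick`, and `demo` are hypothetical names, and the clock is virtual so the example runs instantly.

```typescript
// A toy event loop. The promise machinery belongs to the engine; the
// "runtime" part is the timer list and the clock that decides when to
// resolve each promise. All names here are illustrative assumptions.
type Timer = { dueAt: number; resolve: () => void };

class ToyEventLoop {
  private timers: Timer[] = [];
  private now = 0; // virtual milliseconds, only advanced by tick()

  // Host API exposed to user code: a promise that resolves after `ms`.
  sleep(ms: number): Promise<void> {
    return new Promise<void>((resolve) => {
      this.timers.push({ dueAt: this.now + ms, resolve });
    });
  }

  // One turn of the loop: advance the clock, then resolve every due timer.
  tick(elapsedMs: number): void {
    this.now += elapsedMs;
    const due = this.timers.filter((t) => t.dueAt <= this.now);
    this.timers = this.timers.filter((t) => t.dueAt > this.now);
    for (const t of due) t.resolve();
  }

  get pending(): number {
    return this.timers.length;
  }
}

async function demo(): Promise<string[]> {
  const loop = new ToyEventLoop();
  const log: string[] = [];
  const task = loop.sleep(10000).then(() => {
    log.push("10s timer fired");
  });

  loop.tick(5000); // only 5 virtual seconds have passed
  await Promise.resolve(); // give any ready promise callbacks a chance to run
  log.push(`pending after 5s: ${loop.pending}`);

  loop.tick(5000); // now 10 virtual seconds have passed
  await task;
  log.push(`pending after 10s: ${loop.pending}`);
  return log;
}
```

Without something playing the role of `tick`, the promise returned by `sleep` would simply never resolve, which is the "promise that never executes" case described in the conversation.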
Why create a new runtime? (standards compliance) [00:08:20] Luca: Yeah. So the primary point of concern here was that Node was slowly diverging from browser standards with no real path to re-converging. Um, like there was nothing that was pushing Node in the direction of standards compliance and there was nothing that was like sort of forcing Node to innovate. And we really saw this because in the time between, I don't know, 2015, 2018, like Node was slowly working on ESM while browsers had already shipped ESM for like three years. Um, Node did not have fetch. Node only got fetch last year. Right? Six, seven years after browsers got fetch. Node's stream implementation is still very divergent from standard web streams. Node was very reliant on callbacks. It still is, um, like promises in many places of the Node API are an afterthought, which makes sense because Node was created in a time before promises existed. Um, but there was really nothing that was pushing Node forward, right? Like nobody was actively investing in improving the API of Node to be more standards compliant. And so what we really needed was a new like greenfield project, which could demonstrate that actually writing a new server-side runtime is, A, viable, and B, is totally doable with an API that is more standards compliant. Like essentially you can write a browser, like a headless browser, and have that be an excellent to use JavaScript runtime, right? And then there were some things that were on top of that, like TypeScript support, because TypeScript was incredibly, or is still incredibly popular, even more so than it was four years ago when Deno was created or envisioned. Um, this permission system, like Node really poked holes into the V8 sandbox very early on with, like, it's gonna be very difficult for Node to ever, uh, reconcile this.
Especially cuz some of the APIs that it exposes are just so incredibly low level that like, I don't know, you can mutate random memory within your process. Um, which like if you want to have a secure sandbox, that just doesn't work. Um, it's not compatible. So there really needed to be a place where you could explore this, um, direction and see if it worked. And Deno was that. Deno still is that, and I think Deno has outgrown that now into something which is much more usable as like a production ready runtime. And many people do use it in production. And now Deno is on the path of slowly converging back with Node, um, from both directions. Like Node is slowly becoming more standards compliant, and depending on who you ask, this was done because of Deno, and some people said it would have already been going on and Deno just accelerated it. But that's not really relevant because the point is that Node is becoming more standards compliant, and the other direction is Deno is becoming more Node compliant. Like Deno is implementing Node compatibility layers that allow you to run code that was originally written for the Node ecosystem in the standards compliant runtime. So through those two directions, the runtimes are sort of, um, going back towards each other. I don't think they'll ever merge. But we're getting to a point here pretty soon, I think, where it doesn't really matter what runtime you write for, um, because you'll be able to run code written for one runtime in the other runtime relatively easily. [00:12:03] Jeremy: If you're saying the two are becoming closer to one another, becoming closer to the web standard that runs in the browser, if you're talking to someone who's currently developing in Node, what's the incentive for them to switch to Deno versus using Node and then hope that eventually they'll kind of meet in the middle.
[00:12:26] Luca: Yeah, so I think, like Deno is a lot more than just a runtime, right? Like a runtime executes JavaScript. Deno executes JavaScript, it executes TypeScript. But Deno is so much more than that. Like Deno has a built-in formatter, it has a built-in linter. It has a built-in testing framework, a built-in benching framework. It has a built-in bundler, it can create self-hosted, um, executables, yeah, like bundle your code and the Deno executable into a single executable that you can ship off to someone. Um, it has a dependency analyzer. It has editor integrations. It has, yeah, like I could go on for hours (laughs) about all of the auxiliary tooling that's inside of Deno that's not a JavaScript runtime. And also Deno as a JavaScript runtime is just more standards compliant than any of the other server-side runtimes right now. So if you're really looking for something which is standards compliant, which is gonna like live on forever, then it's, you know, like you cannot kill off the Fetch API ever. The Fetch API is going to live forever because Chrome supports it. Um, and the same goes for local storage and, like, I don't know, the Blob API and all these other web APIs, like they have shipped in browsers, which means that they will be supported until the end of time. And yeah, maybe Node has also reached that with its API, probably to some extent. But yeah, don't underestimate the power of like 3 billion Chrome users that would scream immediately if the Fetch API stopped working. Right? [00:13:50] Jeremy: Yeah, I think maybe what it sounds like also is that because you're using the API that's used in the browser, places where you deploy JavaScript applications in the future, you would hope that those would all settle on using that same API so that if you were using Deno, you could host it at different places and not worry about, do I need to use a special API maybe that you would in Node?
WinterCG (W3C group for server side JavaScript) [00:14:21] Luca: Yeah, exactly. And this is actually something which we're specifically working towards. So, I don't know if you've heard of WinterCG? It's a community group at the W3C that, um, Cloudflare and Deno and some others including Shopify have started last year. Um, where essentially we're trying to standardize the concept of what a server side JavaScript runtime is and what APIs it needs to have available to be standards compliant. Um, and essentially making this portability sort of written down somewhere and like write down exactly what code you can write and expect to be portable. And we can see that all of the big players that are involved in, um, building JavaScript runtimes right now are actively engaged with us at WinterCG and are actively building towards this future. So I would expect that any code that you write today, which runs in Deno, runs in Cloudflare Workers, runs on Netlify Edge Functions, runs on Vercel's Edge Runtime, runs on Shopify Oxygen, is going to run on the other four, um, of those within the next couple years here. Like I think the APIs of these are gonna converge to be essentially the same. There's obviously gonna always be some nuances. Um, like, I don't know, Chrome and Firefox and Safari don't perfectly have the same API everywhere, right? Like Chrome has some Web Bluetooth capabilities that Safari doesn't, or Firefox has some, I don't know, non-standard extensions to the error object, which none of the other runtimes do. But overall you can expect these runtimes to mostly be aligned.
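To make the portability idea above concrete, here is a minimal sketch of a request handler written only against web-standard globals (`Request`, `Response`, `URL`). The function name `handle`, the `/hello` route, and the `name` query parameter are assumptions made up for illustration; they are not from the conversation or from any WinterCG document.

```typescript
// A handler that touches no runtime-specific API: it takes a standard
// Request and returns a standard Response. `handle`, the route, and the
// query parameter are hypothetical.
async function handle(request: Request): Promise<Response> {
  const url = new URL(request.url);

  if (url.pathname === "/hello") {
    const name = url.searchParams.get("name") ?? "world";
    return new Response(JSON.stringify({ greeting: `Hello, ${name}!` }), {
      status: 200,
      headers: { "content-type": "application/json" },
    });
  }

  return new Response("Not found", { status: 404 });
}
```

How a function like this gets wired up still differs per platform. As far as I know, recent Deno versions accept it via `Deno.serve(handle)` and Cloudflare Workers expects a default export with a `fetch` method, but the handler body itself stays the same, which is the part the group is trying to pin down.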
Yeah, and I think that's really, really excellent, and that's I think really one of the reasons why one should really consider building for this standard runtime, because it just guarantees that you'll be able to host this somewhere in five years time and 10 years time with very little effort. Like even if Deno goes under or Cloudflare goes under, or, I don't know, nobody decides to maintain Node anymore, it'll be easy to run somewhere else. And also I expect that the big cloud vendors will ultimately, um, provide managed offerings for the standards compliant JavaScript runtime as well. Is Node part of WinterCG? [00:16:36] Jeremy: And this WinterCG group, is Node a part of that as well? [00:16:41] Luca: Um, yes, we've invited Node, um, to join. Um, due to the complexities of how Node's internal decision making system works, Node is not officially a member of WinterCG. Um, there are some individual members of the Node, um, technical steering committee which are participating. For example, um, James M Snell is my co-chair on WinterCG. He also works at Cloudflare. He's also a Node, um, TSC member. Matteo Collina, who has been, um, instrumental to getting fetch landed in Node, um, is also actively involved. So Node is involved, but because Node is Node and Node's decision making process works the way it does, Node is not officially listed anywhere as a member. But yeah, they're involved and maybe they'll be a member at some point. But, yeah, let's see. (laughs) [00:17:34] Jeremy: Yeah. And so it sounds like you're thinking that's more of a governance or an organizational aspect of Node than it is a technical limitation. Is that right? [00:17:47] Luca: Yeah. I obviously can't speak for the Node technical steering committee, but I know that there's a significant chunk of the Node technical steering committee that is very favorable towards, uh, standards compliance.
But parts of the Node technical steering committee are also not. They are either indifferent or are actively, I dunno if they're still actively working against this, but have actively worked against standards compliance in the past. And because the Node governance structure is very, yeah, is so open and lets all these voices be heard, um, that just means that decision making processes within Node can take so long. This is also why the fetch API took eight years to ship. Like this was not a technical problem. And it is also not a technical problem that Node does not have URLPattern support or the File global or, um, that the Web Crypto API was not on the global object until like late last year, right? Like, these are not technical problems, these are decision making problems. Um, and yeah, that was also part of the reason why we started Deno as like a separate thing, because like you can try to innovate Node from the inside, but innovating Node from the inside is very slow, very tedious, and requires a lot of fighting. And sometimes just showing somebody from the outside like, look, this is the bright future you could have, makes them more inclined to do something. Why it takes so long to ship new features in Node [00:19:17] Jeremy: Do you have a sense for, you gave the example of fetch taking eight years to get into Node. Do you have a sense of what the typical objection is to something like that? Like I understand there's a lot of people involved, but why would somebody say, I don't want this [00:19:35] Luca: Yeah. So for fetch specifically, there were many different kinds of concerns. Um, I can maybe list two of them. One of them was for example, that the fetch API is not a good API and as such, Node should not have it, which is sort of
missing the point, because it's a standard API, how good or bad the API is is much less relevant because if you can share the API, you can also share a wrapper that's written around the API. Right? And then the other concern was, Node doesn't need fetch because Node already has an HTTP API. Um, so these are both kind of examples of concerns that people had for a long time, where it took a long time to either convince these people or to push the change through anyway. And this is also the case for other things like, for example, Web Crypto, um, like why do we need Web Crypto? We already have Node crypto. Or why do we need yet another streams implementation? Node already has four different streams implementations. Like, why do we need web streams? And, like, I don't know if you know this XKCD of, there's 14 competing standards, so let's write a 15th standard to unify them all. And then at the end we just have 15 competing standards. Um, so I think this is also the kind of concern that people had, but I think what we've seen here is that this is really not a concern that one needs to have, because it turns out in the end that if you implement web APIs, people will use web APIs and will use web APIs only for their new code. It takes a while, but we're seeing this with ESM versus require, like new code written with require is much less common than it was two years ago. And new code now using like XHR, whatever it's called, XMLHttpRequest or, you know the one I mean, compared to using fetch, like nobody uses that anymore. Everybody uses fetch. Um, and like in Node, if you write a little script, like you're gonna use fetch, you're not gonna use like Node's http.get API or whatever. And we're gonna see the same thing with ReadableStream. We're gonna see the same thing with Web Crypto. We're gonna see the same thing with Blob.
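As a small sketch of what "using the web API for new code" can look like, the function below hashes a string with Web Crypto and `TextEncoder` rather than a runtime-specific crypto module. The helper name `sha256Hex` is hypothetical, and the example assumes a runtime where `crypto` is available as a global (browsers, Deno, and recent Node releases).

```typescript
// Hash a string using only web-standard globals. `sha256Hex` is an
// illustrative name, not an API of any runtime.
async function sha256Hex(text: string): Promise<string> {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes as a Uint8Array
  const digest = await crypto.subtle.digest("SHA-256", bytes); // ArrayBuffer
  return Array.from(new Uint8Array(digest))
    .map((byte) => byte.toString(16).padStart(2, "0")) // two hex chars per byte
    .join("");
}
```

Because nothing here is specific to one runtime, the same function could be shared between browser code and server code, which is the benefit described in the conversation.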
I think one of the big ones where Node is still, I don't think this is one that's ever gonna get solved, is the Buffer global in Node. Like we have the Uint8Array, this Uint8Array global, um, in like all the runtimes including browsers, um, and Buffer is like a superset of that, but it's in global scope. So it's sort of this non-standard extension of Uint8Array that people in Node like to use and it's not compatible with anything else. Um, but because it's so easy to get at, people use it anyway. So those are also kind of problems that we'll have to deal with eventually. And maybe that means that at some point the Buffer global gets deprecated and, I don't know, probably can never get removed. But, um, yeah, these are kinds of conversations that the Node TSC is going to have to have internally in, I don't know, maybe five years. Write once, have it run on any hosting platform [00:22:37] Jeremy: Yeah, so at a high level, what's shipped in the browser, it went through the ECMAScript approval process. People got it into the browser. Once it's in the browser, probably never going away. And because of that, it's safe to build on top of that for these server runtimes because it's never going away from the browser. And so everybody can kind of use it into the future and not worry about it. Yeah. [00:23:05] Luca: Exactly. Yeah. And that's excluding the benefit that also if you have code that you can write once and use in both the browser and the server-side runtime, like that's really nice. Um, like that's the other benefit. [00:23:18] Jeremy: Yeah. I think that's really powerful. And that right now, when someone's looking at running something in Cloudflare Workers versus running something in the browser versus running something in. It's, I think a lot of people make the assumption it's just JavaScript, so I can use it as is.
But there are, at least currently, differences in what APIs are available to you. [00:23:43] Luca: Yep. Yep. Why bundle so many things into Deno? [00:23:46] Jeremy: Earlier you were talking about how Deno is more than just the runtime. It has a linter, formatter, file watcher, there's all sorts of stuff in there. And I wonder if you could talk a little bit to the reasoning behind that [00:24:00] Luca: Mm-hmm. [00:24:01] Jeremy: Having them all be separate things. [00:24:04] Luca: Yeah, so the reasoning here is essentially if you look at other modern runtimes or other modern languages, like Rust is a great example. Go is a great example. Even though Go was designed around the same time as Node, it has a lot of these same tools built in. And what it really shows is that if the ecosystem converges, like is essentially forced to converge on a single set of built-in tooling, A, that built-in tooling becomes really, really excellent because everybody's using it. And also, it means that if you open any project written by any Go developer, any Rust developer, and you look at the tests, you immediately understand how the test framework works and you immediately understand how the assertions work. Um, and you immediately understand how the build system works and you immediately understand how the dependency imports work. And you immediately understand like, I wanna run this project and I wanna restart it when my file changes. Like, you immediately know how to do that because it's the same everywhere. Um, and this kind of feeling of having to learn one tool and then being able to use all of the projects, like being able to contribute to open source, when you're moving jobs, whatever, like between personal projects that you haven't touched in two years, you know, like being able to learn this once and then use it everywhere is such an incredibly powerful tool.
Like, people don't appreciate this until they've used a runtime or, or, or language which provides this to them. Like, you can go to any Go developer and ask them if they would like. There, there's this, there's this saying in the Go ecosystem, um, that gofmt is nobody's favorite, but, or, uh, wait, no, I don't remember what the, how the saying goes, but the saying essentially implies that the way that gofmt formats code, maybe not everybody likes, but everybody loves gofmt anyway, because it just makes everything look the same. And like, you can read your friend's code, your, your colleagues' code, your new job's code, the same way that you did your code from two years ago. And that's such an incredibly powerful feeling. Especially if it's like well integrated into your IDE, you clone a repository, open that repository, and like your testing panel on the left hand side just populates with all the tests, and you can click on them and run them. And if an assertion fails, it's like the standard output format that you're already familiar with. And it's, it's, it's a really great feeling. And if you don't believe me, just go try it out and, and then you will believe me, (laughs) [00:26:25] Jeremy: Yeah. No, I, I'm totally with you. I, I think it's interesting because with JavaScript in particular, it feels like the default in the community is the opposite, right? There's so many different ways. Uh, there are so many different build tools and testing frameworks and, formatters, and it's very different than, like you were mentioning, a Go or a Rust that are more recent languages where they just include that, all bundled in. Yeah. [00:26:57] Luca: Yeah, and I, I think you can see this as well in, in the time that the average JavaScript developer spends configuring their tooling compared to a Rust developer. Like if I write Rust, I write Rust, like all day, every day. And I spend maybe two, 3% of my time configuring Rust tooling like.
Doing dependency imports, opening a new project, creating a formatter config file, I don't know, deleting the build directory, stuff like that. Like that's, that's essentially what it means for me to configure my Rust tooling. Whereas if you compare this to like a front-end JavaScript project, like you have to deal with making sure that your React version is compatible with your React DOM version, is compatible with your Next version, is compatible with your Vite version, is compatible with your whatever version, right? This, this is all not automatic. Making sure that you use the right, like as, as a front end developer, you don't have just NPM installed, no. You have NPM installed, you have Yarn installed, you have pnpm installed. You probably have like, Bun installed. And, and, and I don't know, to use any of these, you need to have Corepack enabled in Node and like you need to have all of their global bin directories symlinked into your or, or, or, uh, included in your path. And then if you install something and you wanna update it, you don't know, did I install it with Yarn? Did I install it with pnpm? Like this is, uh, significant complexity and you, you tend to spend a lot of time dealing with dependencies and dealing with package management and dealing with like tooling configuration, setting up ESLint, setting up Prettier. And I, I think that like, especially Prettier, for example, really showed, was, was one of the first things in the JavaScript ecosystem, which was like, no, we're not gonna give you a config that you can spend like six hours configuring, it's gonna be like seven options and here you go. And everybody used it because nobody likes configuring things, it turns out. Um, and even though there's always the people that say, oh, well, I won't use your tool unless, like, we, we get this all the time.
Like, I'm not gonna use deno fmt because I can't, I don't know, remove the semicolons or, or use single quotes or change my tab width to 16. Right? Like, wait until all of your coworkers are gonna scream at you because you set the tab width to 16 and then see what they change it to. And then you'll see that it's actually the exact default that everybody uses. So it'll, it'll take a couple more years. But I think we're also gonna get there, uh, like Node is starting to implement a, a test runner. And I, I think over time we're also gonna converge on, on, on, on like some standard build tools. Like I think Vite, for example, is a great example of this, like, doing a front end project nowadays. Um, like building new front end tooling that's not built on Vite? Yeah, don't. Like, Vite, it's become the standard and I think we're gonna see that in a lot more places. We should settle on what tools to use [00:29:52] Jeremy: Yeah, though I, I think it's, it's tricky, right? Because you have so many people with their existing projects. You have people who are starting new projects and they're just searching the internet for what they should use. So you're, you're gonna have people on webpack, you're gonna have people on Vite, I guess now there's gonna be Turbopack, I think is another one that's [00:30:15] Luca: Mm-hmm. [00:30:16] Jeremy: There's, there's, there's all these different choices, right? And I, I think it's, it's hard to, to really settle on one, I guess, [00:30:26] Luca: Yeah, [00:30:27] Jeremy: uh, yeah. [00:30:27] Luca: like I, I, I think this is, this is in my personal opinion also a failure of the Node Technical Steering Committee, for the longest time to not decide that yes, we're going to bless this as the standard formatter for Node, and this is the standard package manager for Node. And they did, they sort of did, like, they, for example, Node blessed NPM as the standard package manager for, for Node. But it didn't innovate on NPM.
Like, Node's technical steering committee did not force NPM to innovate. NPM's a private company, ultimately bought by GitHub, and they had full control over how the NPM CLI, um, evolved and nobody forced NPM to, to make sure that package install times are six times faster than they were three years ago, like nobody did that. So it didn't happen. And I think this is, this is really a failure of, of the, the, the, yeah, the Node technical steering committee and also the wider JavaScript ecosystem of not being persistent enough with, with like focus on performance, focus on user experience, and, and focus on simplicity. Like things got so out of hand and I'm happy we're going in the right direction now, but, yeah, it was terrible for some time. (laughs) Node compatibility layer [00:31:41] Jeremy: I wanna talk a little bit about how we've been talking about Deno in the context of you just using Deno using its own standard library, but just recently, last year, you added a compatibility shim where people are able to use Node libraries in Deno. [00:32:01] Luca: Mm-hmm. [00:32:01] Jeremy: And I wonder if you could talk to, like earlier you had mentioned that Deno has a different permissions model. On the website it mentions that Deno's HTTP server is two times faster than Node in a Hello World example. And I'm wondering what kind of benefits people will still get from Deno if they choose to use packages from Node. [00:32:27] Luca: Yeah, it's a great question. Um, so I think a, again, this is sort of a like, so just to clarify what we actually implemented, like what we have is we have support for you to import NPM packages. Um, so you can import any NPM package from NPM, from your TypeScript or JavaScript ECMAScript module, um, that you have, you already have for your Deno code. Um, and we will under the hood, make sure that is installed somewhere in some directory globally. Like pnpm does. There's no local node_modules folder you have to deal with.
There's no package.json you have to deal with. Um, and there's no, uh, package.json, like, versioning things you need to deal with. Like what you do is you do import cowsay from "npm:cowsay@1", and that will import cowsay with like the semver tag one. Um, and it'll like do the semver resolution the same way Node does, or the same way NPM does rather. And what you get from that is that essentially it gives you like this backdoor to call out to all of the existing Node code that has already been written, right? Like you cannot expect that Deno developers, write like, I don't know. There was this time when Deno did not really have that many third party modules yet. It was very early on, and I don't know, if you wanted to connect to Postgres and there was no Postgres driver available, then the solution was to write your own Postgres driver. And that is obviously not great. Um, (laughs). So the better solution here is to let users, for these packages where there's no Deno native or, or, or web native or standard native, um, package for this yet that is importable with URL specifiers, import this from NPM. Uh, so it's sort of this like backdoor into the existing NPM ecosystem. And we explicitly, for example, don't allow you to create a package.json file or import bare Node specifiers because we don't, we, we want to stay standards compliant here. Um, but to make this work effectively, we need to give you this little back door. Um, and inside of this back door, all hell is like, or like everything is terrible inside there, right? Like inside there you can do bare specifiers and inside there you can like, uh, there's package.json and there's crazy Node resolution and __dirname and CommonJS. And like all of that stuff is supported inside of this backdoor to make all the NPM packages work. But on the outside it's exposed as this nice, ESM only, NPM specifiers.
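The shape of the `npm:` specifier mentioned above (a package name, then an optional `@` and a semver range) can be sketched with a small parser. This is only an illustration: `parseNpmSpecifier` and `NpmSpecifier` are hypothetical names, not part of Deno, and the sketch ignores subpaths such as `npm:pkg@1/sub`.

```typescript
// Hypothetical helper, not Deno's implementation: splits "npm:<name>@<range>".
interface NpmSpecifier {
  name: string;
  versionRange: string | undefined;
}

function parseNpmSpecifier(specifier: string): NpmSpecifier {
  const prefix = "npm:";
  if (!specifier.startsWith(prefix)) {
    throw new Error(`not an npm specifier: ${specifier}`);
  }
  const rest = specifier.slice(prefix.length);
  // Scoped names begin with "@", so look for the version separator after index 0.
  const at = rest.indexOf("@", 1);
  if (at === -1) {
    return { name: rest, versionRange: undefined };
  }
  return { name: rest.slice(0, at), versionRange: rest.slice(at + 1) };
}

const plain = parseNpmSpecifier("npm:cowsay@1");
const scoped = parseNpmSpecifier("npm:@types/node@18");
console.log(plain, scoped);
```

In real Deno code none of this is needed; the specifier string simply goes into the import statement, as Luca describes.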
And the, the reason you would want to use this over, like just using Node directly is because again, like you wanna use TypeScript, no config, like, necessary. You want to use, you wanna have a formatter, you wanna have a linter, you wanna have tooling that like does testing and benchmarking and compiling or whatever. All of that's built in. You wanna run this on the edge, like close to your users, in like 30 different, 35 different, uh, points of presence. Um, it's like, okay, push it to your git repository. Go to this website, click a button two times, and it's running in 35 data centers. Like this is, this is the kind of ex, like developer experience that you can, you do not get. You, I will argue that you cannot get with Node right now. Like even if you're using something like ts-node, it is not possible to get the same level of developer experience that you do with Deno. And the, the, the same like speed at which you can iterate, iterate on your projects, like create new projects, iterate on them is like incredibly fast in Deno. Like, I can open a, a, a folder on my computer, create a single file, main.ts, put some code in there and then call deno run main.ts. And that's it. Like I don't, I did not need to do npm install, I did not need to do npm init -y and remove the license and version fields from, from the generated package.json and like set private to true and whatever else, right? It just all works out of the box. And I think that's, that's what a lot of people come to Deno for and, and then ultimately stay for. And also, yeah, standards compliance. So, um, things you build in Deno now are gonna work in five, 10 years, with no hassle. Node shims and testing [00:36:39] Jeremy: And so with this compatibility layer or this, this shim, is it where the Node code is calling out to Node APIs and you're replacing those with Deno compatible equivalents? [00:36:54] Luca: Yeah, exactly.
Like for example, we have a shim in place that shims out the Node crypto API on top of the Web Crypto API. Like sort of, some, some people may be familiar with this in the form of, um, Browserify shims, if anybody still remembers those. It's essentially, in your front end tooling, you were able to import from like node crypto in your front end projects and then behind the scenes your webpacks or your Browserifys or whatever would take that import from node crypto and would replace it with like the shim that essentially exposed the same APIs as node crypto, but under the hood, wasn't implemented with native calls, but was implemented on top of Web Crypto, or implemented in userland even. And Deno does something similar. There's a couple edge cases of APIs where, where we do not expose the underlying thing that we shim to, to end users, outside of the Node shim. So like there's some, some APIs that, I don't know if I have a good example, like Node's nextTick for example. Um, like to properly be able to shim Node's nextTick, you need to like implement this within the event loop in the runtime. And you don't need this in Deno, because in Deno, you use the web standard queueMicrotask to, to do this kind of thing. But to be able to shim it correctly and run Node applications correctly, we need to have this sort of like backdoor into some ugly APIs, um, which, which natively integrate in the runtime, but, yeah, like allow, allow this Node code to run. [00:38:21] Jeremy: A, anytime you're replacing a component with a, a shim, I think there's concerns about additional bugs or changes in behavior that can be introduced. Is that something that you're seeing and, and how are you accounting for that? [00:38:38] Luca: Yeah, that's, that's an excellent question. So this is actually a, a great concern that we have all the time. And it's not just even introducing bugs, sometimes it's removing bugs.
Like sometimes there's bugs in the Node standard library which are there, and people are relying on these bugs to be there for the applications to function correctly. And we've seen this a lot, and then we implement this and we implement from scratch and we don't make that same bug. And then the test fails or then the application fails. So what we do is, um, we actually run Node's test suite against Deno's shim layer. So Node has a very extensive test suite for its own standard library, and we can run this suite against, against our shims to find things like this. And there's still edge cases, obviously, which Node, like there was, maybe there's a bug which Node was not even aware of existing. Um, where maybe this, like it's, it's now standard, it's now like intended behavior because somebody relies on it, right? Like the second somebody relies on, on some non-standard or some buggy behavior, it becomes intended. Um, but maybe there was no test that explicitly tests for this behavior. Um, so in that case we'll add our own tests to, to ensure that. But overall we can already catch a lot of these by just testing, against, against Node's tests. And then the other thing is we run a lot of real code, like we'll try to run Prisma and we'll try to run Vite and we'll try to run Next.js and we'll try to run like, I don't know, a bunch of other things that people throw at us and check that they work, and if they work and there's no bugs, then we did our job well and our shims are implemented correctly. Um, and then there's obviously always the edge cases where somebody did something absolutely crazy that nobody thought possible. And then they'll open an issue on the Deno repo and we scratch our heads for three days and then we'll fix it. And then in the next release there'll be a new bug that we added to make the compatibility with Node better. So yeah, but I, yeah. Running tests is the, is the main thing, running Node's tests.
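To make the shim idea from this section concrete, here is a deliberately naive sketch of a Node-style `nextTick` layered on the web-standard `queueMicrotask`. `nextTickShim` is a hypothetical name, and this is not how Deno implements it: as Luca notes, matching Node's real ordering requires integration with the runtime's event loop, which this userland version does not attempt.

```typescript
// Naive userland shim: defers the callback to a microtask and forwards arguments.
// It does not reproduce Node's exact scheduling order relative to promises.
function nextTickShim<A extends unknown[]>(
  callback: (...args: A) => void,
  ...args: A
): void {
  queueMicrotask(() => callback(...args));
}

const order: string[] = [];
nextTickShim((label: string) => {
  order.push(label);
}, "deferred");
// This line runs first, because the shimmed callback only runs after the
// current synchronous code has finished.
order.push("sync");
```

The gap between this sketch and real Node behavior is the kind of difference that running Node's own test suite against the shim layer is meant to surface.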
Performance should be equal or better [00:40:32] Jeremy: Are there performance implications? If someone is running an Express app or a Next.js app in Deno, will they get any benefits from the Deno runtime and performance? [00:40:45] Luca: Yeah. It's actually, there are performance implications and they're usually the opposite of what people think they are. Like, usually when you think of performance implications, it's always a negative thing, right? It's always okay. Like you, it's like a compromise. Like the shim layer must be slower than the real Node, right? It's not like we can run Express faster than Node can run Express. And obviously not everything is faster in Deno than it is in Node, and not everything is faster in Node than it is in Deno. It's dependent on the API, dependent on, on what each team decided to optimize. Um, and this also extends to other runtimes. Like you can always cherry pick results, like, I don't know, um, to, to make your runtime look faster in certain benchmarks. But overall, what really matters is that you do not like, the first important step for, for good Node compatibility is to make sure that if somebody runs your code or runs their Node code in Deno or your other runtime or whatever, it performs at least the same. And then anything on top of that, great, cherry on top. Perfect. But make sure the baseline is at least the same. And I think, yeah, we have very few APIs where we behave, where we, where, where like there's a significant performance degradation in Deno compared to Node. Um, and like we're actively working on these things. Like Deno is not a, a, a project that's done, right? Like we have, I think at this point, like 15 or 16 or 17 engineers working on Deno, spanning across all of our different projects. And like, we have a whole team that's dedicated to performance, um, and a whole team that's dedicated to Node compatibility.
So like these things get addressed and, and we make patch releases every week and a minor release every four weeks. So yeah, it's, it's not a standstill. It's, uh, constantly improving. What should go into the standard library? [00:42:27] Jeremy: Uh, something that kind of makes Deno stand out is its standard library. There's a lot more in there than there is in, in the Node one. [00:42:38] Luca: Mm-hmm. [00:42:39] Jeremy: Uh, I wonder if you could speak to how you make decisions on what should go into it. [00:42:46] Luca: Yeah, so early on it was easier. Early on, the, the decision making process was essentially, is this something that a top 100 or top 1000 NPM library implements? And if it is, let's include it. And the decision making is still sort of based on that. But right now we've already implemented most of the low hanging fruit. So things that we implement now are, have, have discussion around them, whether we should implement them. And we have a process where, well we have a whole team of engineers on our side and we also have community members that, that will review PRs and, and, and make comments, open issues and, and review those issues, to sort of discuss the pros and cons of adding any certain new API. And sometimes it's also that somebody opens an issue that's like, I want, for example, I want an API to, to concatenate two Uint8Arrays together, which is something you can really easily do in Node with Buffer.concat, like the scary Buffer thing. And there's no standards way of doing that right now. So we have to have a little utility function that does that. But in parallel, we're thinking about, okay, how do we propose an addition to the web standards now that makes it easy to concatenate Uint8Arrays in the web standards, right? Yeah, there's a lot to it. Um, but it's, it's really, um, it's all open, like all of our, all of our discussions for, for, additions to the standard library and things like that.
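The "little utility function" for joining `Uint8Array`s can be sketched in a few lines using only standard typed-array methods. `concatBytes` is a hypothetical name chosen for this illustration; it is not necessarily what the Deno standard library calls its helper.

```typescript
// Concatenates any number of Uint8Arrays into one new Uint8Array,
// using only web-standard APIs (no Node Buffer involved).
function concatBytes(...chunks: Uint8Array[]): Uint8Array {
  const totalLength = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const result = new Uint8Array(totalLength);
  let offset = 0;
  for (const chunk of chunks) {
    result.set(chunk, offset); // copy this chunk at the current write position
    offset += chunk.length;
  }
  return result;
}

const joined = concatBytes(new Uint8Array([1, 2]), new Uint8Array([3]));
console.log(joined);
```

Because it depends only on `Uint8Array`, the same function runs unchanged in browsers, Deno, and Node.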
It's all, all, uh, public on GitHub and the GitHub issues and GitHub discussions and GitHub PRs. Um, so yeah, that's, that's where we do that. [00:44:18] Jeremy: Yeah, cuz to give an example, I was a little surprised to see that there is support for markdown front matter built into the standard library. But when you describe it as we look at the top hundred, thousand packages, are people looking at markdown? Are they looking at front matter? I, I'm sure there's a fair amount that are so that, that makes sense. [00:44:41] Luca: Yeah, like it sometimes, like that one specifically was driven by, like, our team was just building a lot of like little blog pages and things like that. And every time it was either you roll your own front matter parser or you look for one, which has like a subtle bug here and the other one has a subtle bug there and really not satisfactory with any of them. So, we, we roll that into the standard library. We add good test coverage for it, add good documentation for it, and then it's like just a resource that people can rely on. Um, and you don't, you then don't have to make the choice of like, do I use this library to do my front matter parsing or the other library? No, you just use the one that's in the standard library. It's, it's also part of this like user experience thing, right? Like it's just a much nicer user experience, not having to make a choice, about stuff like that. Like completely inconsequential stuff. Like which library do we use to do front matter parsing? (laughs) [00:45:32] Jeremy: Yeah. I mean, I think when, when that stuff is not there, then I think the temptation is to go, okay, let me see what Node modules there are that will let me parse the front matter. Right. And then it, it sounds like probably ideally you want people to lean more on what's either in the standard library or what's native to the Deno ecosystem. Yeah. [00:46:00] Luca: Yeah.
Like the, the, one of the big benefits is that the Deno standard library is implemented on top of web standards, right? Like it's, it's implemented on top of these standard APIs. So for example, there's Node front matter libraries which do not run in the browser because the browser does not have the Buffer global. Maybe it's a nice library to do front matter parsing with, but you choose it and then three days later you decide that actually this code also needs to run in the browser, and then you need to go switch your front matter library. Um, so, so those are also kind of reasons why we may include something in the standard library, like maybe there's even a really good module already to do something. Um, but if there's certain reliance on specific Node features that, um, we would like that library to also be compatible with, with, with web standards, we'll, uh, we might include it in the standard library, like for example, the YAML parser, um, or the YAML parser in the standard library is, is a fork of, uh, of the Node YAML module. And it's, it's essentially that, but cleaned up and, and made to use more standard APIs rather than, um, Node built-ins. [00:47:00] Jeremy: Yeah, it kind of reminds me a little bit of when you're writing a front end application, sometimes you'll use Node packages to do certain things and they won't work unless you have a compatibility shim where the browser can make use of certain Node APIs. And if you use the APIs that are built into the browser already, then you won't, you won't need to deal with that sort of thing. [00:47:26] Luca: Yeah. Also like less bundle size, right? Like if you don't have to shim that, that's less, less code you have to ship to the client. WebAssembly use cases [00:47:33] Jeremy: Another thing I've seen with Deno is it supports running WebAssembly. [00:47:40] Luca: Mm-hmm. [00:47:40] Jeremy: So you can export functions and call them from TypeScript.
I was curious if you've seen practical uses of this in production within the context of Deno. [00:47:53] Luca: Yeah. There's actually a bunch of, of really practical use cases, so probably the most executed bit of WebAssembly inside of Deno right now is actually esbuild. Like, esbuild has a WebAssembly build. Like, esbuild is something that's written in Go. You have the choice of either running it, um, natively in machine code as, as like an ELF process on, on Linux or on, on Windows or whatever. Or you can use the WebAssembly build and then it runs in WebAssembly. And the WebAssembly build is maybe 50% slower than the, uh, native build, but that is still significantly faster than Rollup or, or, or, or I don't know, whatever else people use nowadays to do JavaScript bundling, I don't know. I, I just use esbuild always. Um, so, um, for example, the Deno website is running on Deno Deploy. And Deno Deploy does not allow you to run subprocesses because it's, it's like this edge runtime, which, uh, has certain security permissions that it's, that are not granted, one of them being subprocesses. So it needs to execute esbuild. And the way it executes esbuild is by running it inside of WebAssembly. Um, because WebAssembly is secure, WebAssembly is, is something which is part of the JavaScript sandbox. It's inside the JavaScript sandbox. It doesn't poke any holes out. Um, so it's, it's able to run within, within like a very strict security context. Um, and then other examples are, I don't know, you want to have an HTML sanitizer, which is actually built on the real HTML parser in a browser. We, we have an HTML sanitizer called, or, uh, ammonia, I don't remember. There's, there's an HTML sanitizer library on deno.land/x, which is built on the HTML parser from Firefox.
Uh, which like ensures essentially that your HTML, like if you do HTML sanitization, you need to make sure your HTML parser is correct, because if it's not, you might like, your browser might parse some HTML one way and your sanitizer parses it another way and then it doesn't sanitize everything correctly. Um, so there's this like, the Firefox HTML parser compiled to WebAssembly. Um, you can use that to do HTML sanitization, or the Deno documentation generation tool, for example. Uh, deno doc, there's a WebAssembly build for it that allows you to programmatically, like, generate documentation for, for your TypeScript modules. Um, yeah, and, and also like, you know, deno fmt is available as a WebAssembly module for programmatic access and a bunch of other internal Deno programs as well. Like, or, uh, like components, not programs. [00:50:20] Jeremy: What are some of the current limitations of WebAssembly in Deno? For, for example, from WebAssembly, can I make HTTP requests? Can I read files? That sort of thing. [00:50:34] Luca: Mm-hmm. Yeah. So WebAssembly, like when you spawn WebAssembly, um, they're called instances, WebAssembly instances. It runs inside of the same VM, like the same V8 isolate is what they're called, but it does not have, it, it's like a completely fresh sandbox, sort of, in the sense that, I told you the difference between a runtime and like an engine: an engine essentially implements no IO calls, right? And a runtime does, like a runtime pokes holes into the, the, the engine. WebAssembly by default works the same way, in that there are no holes poked into its sandbox. So you have to explicitly poke some holes. Uh, if you want to do HTTP calls, for example, when, when you create a WebAssembly instance, it gives you, or you can give it something called imports, uh, which are essentially JavaScript function bindings, which you can call from within the WebAssembly. And you can use those function bindings to do anything you can from JavaScript.
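A minimal sketch of those imports, using the standard `WebAssembly.Module` and `WebAssembly.Instance` constructors: the hand-assembled module below imports one function and exports one. The names `env`, `get`, and `run` are assumptions made up for this illustration; a real module would normally be produced by a compiler toolchain rather than written as bytes.

```typescript
// Hand-encoded module, roughly equivalent to this text format:
//   (module
//     (import "env" "get" (func $get (result i32)))
//     (func (export "run") (result i32) call $get i32.const 1 i32.add))
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic "\0asm" + version 1
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f, // type section: () -> i32
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x03, 0x67, 0x65, 0x74, 0x00, 0x00, // import env.get
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run" = function 1
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x10, 0x00, 0x41, 0x01, 0x6a, 0x0b, // code: call 0; i32.const 1; i32.add
]);

// The only "hole" poked into the sandbox is this one explicitly passed function.
const imports = { env: { get: (): number => 41 } };

const wasmModule = new WebAssembly.Module(wasmBytes);
const instance = new WebAssembly.Instance(wasmModule, imports);
const run = instance.exports.run as () => number;
const result = run(); // calls back into JavaScript via env.get, then adds 1
console.log(result);
```

Anything the module should be able to reach, such as an HTTP call or a file read, would have to be handed in through the same imports object.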
You just have to pass them through explicitly. And, yeah. Depending on how you write your WebAssembly, like if you write it in Rust, for example, the tooling is very nice and you can just call some JavaScript code from your Rust, and then the build system will automatically make sure that the right function bindings are passed through with the right names. And like, you don't have to deal with anything. And if you're writing Go, it's slightly more complicated. And if you're writing like raw WebAssembly, like, like the WebAssembly text format and compiling that to a binary, then like you have to do everything yourself. Right? It's, it's sort of the difference between writing C and writing JavaScript. Like, yeah. What level of abstraction do you want? It's definitely possible though. And as for limitations, the same limitations as, as existing browsers apply. Like the WebAssembly support in Deno is equivalent to the WebAssembly support in Chrome. So you can do, uh, many things like multi-threading and, and stuff like that already. But especially around shared mutable memory, um, and having access to that memory from JavaScript, that's something which is a real difficulty with WebAssembly right now. Yeah, growing WebAssembly memory is also rather difficult right now. There's, there's a, there's a couple inherent limitations right now with WebAssembly itself. Um, but those, those will be worked out over time. And, and Deno is like very up to date with the version of, of the standard it, it implements, um, through V8. Like we're, we're, we're up to date with Chrome Beta essentially all the time. So, um, yeah. Any, anything you see in, in, in Chrome Beta is gonna be in Deno already. Deno Deploy [00:52:58] Jeremy: So you talked a little bit about this before, the Deno team, they have their own hosting platform called Deno Deploy. So I wonder if you could explain what that is.
[00:53:12] Luca: Yeah, so Deno has this really nice, this really nice concept of permissions which allow you to, sorry, I'm gonna start somewhere slightly, slightly unrelated. Maybe it sounds like it's unrelated, but you'll see in a second. It's not unrelated. Um, Deno has this really nice permission system which allows you to sandbox Deno programs to only allow them to do certain operations. For example, in Deno, by default, if you try to open a file, it'll error out and say you don't have read permissions to read this file. And then what you do is you specify --allow-read. Um, you can either specify --allow-read, and then it'll grant read access to the entire file system. Or you can explicitly specify files or folders or any number of things. Same goes for write permissions, same goes for network permissions. Um, same goes for running subprocesses, all these kind of things. And by limiting your permissions just a little bit, like, for example, by just disabling subprocesses and foreign function interface, but allowing everything else, allowing reads and allowing network access and all that kind of stuff, we can run Deno programs in a way that is significantly more cost effective to you as the end user than, and, and like we can cold start them much faster than, like you may be able to with a, with a more conventional container based, uh, system. So what Deno Deploy is, is a way to run JavaScript or Deno code on our data centers all across the world with very little latency. Like you can write some JavaScript code which executes, which serves HTTP requests, deploy that to our platform, and then we'll make sure to spin that code up all across the world and have your users be able to access it through some URL or, or, or some, um, custom domain or something like that. And this is some, this is very similar to Cloudflare Workers, for example.
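The kind of code being described, a small program that serves HTTP requests, can be sketched as a single function over the web-standard `Request` and `Response` types. The `/hello` route and the response text are made up for this illustration, and the sketch does not show any Deno Deploy specific configuration.

```typescript
// A request handler written only against web-standard Request/Response,
// so the same function can run in any runtime that provides those globals.
function handler(request: Request): Response {
  const url = new URL(request.url);
  if (url.pathname === "/hello") {
    return new Response("Hello from the edge", {
      status: 200,
      headers: { "content-type": "text/plain" },
    });
  }
  return new Response("Not found", { status: 404 });
}

// In Deno, a handler of this shape can be passed to Deno.serve(handler);
// that call is left out here so the sketch stays runtime-neutral.
const ok = handler(new Request("https://example.com/hello"));
const missing = handler(new Request("https://example.com/other"));
console.log(ok.status, missing.status);
```

Writing the handler against standard types rather than a platform-specific API is the portability argument made earlier in the conversation.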
Um, and it's like, Netlify Edge Functions is built on top of Deno Deploy. Like Netlify Edge Functions is implemented on top of Deno Deploy, um, through our subhosting product. Yeah, essentially Deno Deploy is, is, um, yeah, a cloud hosting service for JavaScript, um, which allows you to execute arbitrary JavaScript. And there, there's a couple, like different directions we're going there. One is like more end user focused, where like you link your GitHub repository and, like, we'll, we'll have a nice experience like you do with Netlify and Vercel, where like your commits automatically get deployed and you get preview deployments and all that kind of thing. For your backend code though, rather than for your front end websites. Although you could also write front-end websites in Deno, obviously. And the other direction is more like business focused. Like you're writing a SaaS application and you want to allow the user to customize, the check, like you're writing a SaaS application that provides users with the ability to write their own online store. Um, and you want to give them some ability to customize the checkout experience in some way. So you give them a little like text editor that they can type some JavaScript into. And then when, when your SaaS application needs to hit this code path, it sends a request to us with the code, we'll execute that code for you in a secure way. In a secure sandbox. You can like tell us, this code only has access to like my API server and no other networks, to like prevent data exfiltration, for example. And then you do, you can have all this like super customizable code in, inside of your, your SaaS application without having to deal with any of the operational complexities of scaling arbitrary code execution, or even just doing arbitrary code execution, right? Like it's, this is a very difficult problem and give it to someone else and we deal with it and you just get the benefits.
Yeah, that's Deno Deploy, and it's built by the same team that builds the Deno CLI. So, um, all of your favorite Deno CLI or Deno APIs are available in there. It's just as web standard as Deno. Like, you have fetch available, you have Blob available, you have Web Crypto available, that kind of thing. Yeah. Running code in V8 isolates [00:56:58] Jeremy: So when someone ships you their code and you run it, you mentioned that the cold start time is very low. Um, how is the code being run? Are people getting their own process? It sounds like it's not, uh, using containers. I wonder if you could explain a little bit about how that works. [00:57:20] Luca: Yeah, I can give a high level overview of how it works. So, the way it works is that we essentially have a pool of Deno processes ready. Well, it's not quite Deno processes, it's not the same Deno CLI that you download. It's a modified version of the Deno CLI based on the same infrastructure, that we have spun up across all of our different regions across the world, uh, across all of our different data centers. And then when we get a request, the first time we get a request for that, we call them deployments, that code, right, we'll take one of these idle Deno processes and assign that code to run in that process, and then that process can go serve the requests. And these processes, they're isolated. It's essentially a V8 isolate. Um, and it's a much, much slimmer version of the Deno CLI, essentially, uh, where the only thing it can do is JavaScript execution. Like, it can't even execute TypeScript, for example. TypeScript we pre-process up front to make the cold start faster. And then what we do is, if you don't get a request for some amount of time, we'll, uh, spin down that isolate and, uh, we'll spin up a new idle one in its place. And then, um, if you get another request, I don't know, an hour later for that same deployment, we'll assign it to a new isolate. And yeah, that's a cold start, right? Uh, if you have an isolate, or a deployment rather, which receives a bunch of traffic, like let's say you receive a hundred requests per second, we can send a bunch of that traffic to the same isolate. Um, and we'll make sure that if that one isolate isn't able to handle that load, we'll spin it out over multiple isolates and we'll sort of load balance for you. Um, and we'll make sure to always send to the point of presence that's closest to the user making the request, so they get very minimal latency. And we've got these layers of load balancing in place. I'm glossing over a bunch of security related things here about how these processes are actually isolated and how we monitor to ensure that you don't break out of these processes. And for example, on Deno Deploy it looks like you have a file system, because you can read files from the file system. But in reality, Deno Deploy does not have a file system. The file system is a global virtual file system, which is, uh, yeah, implemented completely differently than it is in the Deno CLI. But as an end user you don't have to care about that, because the only thing you care about is that it has the exact same API as the Deno CLI, and you can run your code locally and if it works there, it's also gonna work in Deploy. Yeah, so that's kind of the high level of Deno Deploy. If any of this sounds interesting to anyone, by the way, uh, we're very actively hiring on Deno Deploy. I happen to be the tech lead for the Deno Deploy product, so I'm always looking for engineers to join our ranks and build cool distributed systems. Deno.com/jobs. 
[01:00:15] Jeremy: For people who aren't familiar with isolates, are these each run in their own processes, or do you have a single process that has a whole bunch of isolates inside it? [01:00:28] Luca: In the general case, you can say that we run, uh, one isolate per process. But there's many asterisks on that. Um, because it's very complicated. I'll just say it's very complicated. Uh, in the general case though, it's one isolate per process. Yeah. Configuring permissions [01:00:45] Jeremy: And then you touched a little bit on the permissions system. Like you gave the example of somebody could have a website where they let their users give them code to execute. How does it look in terms of specifying what permissions people have? Like, is that a configuration file? Are those flags you pass in? What does that look like? [01:01:08] Luca: Yeah. So that product is called subhosting. It's, um, slightly different from our end user platform. Um, it's essentially a service where you email us, we'll onboard you, and then what you can do is send HTTP requests to a certain endpoint with an authentication token and a reference to some code to execute. And then, um, when we receive that HTTP request, we'll fetch the code, spin up an isolate, execute the code, serve the request, return you the response, um, and then we'll pipe logs to you and stuff like that. And part of that is also, when we pull the code to spin up the isolate, that code doesn't just include the code that we're executing, but also includes things like permissions and various other things. We call this isolate configuration. Um, this is all public. We have public docs for this at Deno.com/subhosting, I think. Yes, Deno.com/subhosting. 
[01:02:08] Jeremy: And is that built on top of something that's a part of the public Deno project, the open source part? Or is this specific to this subhosting product? [01:02:19] Luca: Um, so the underlying engine or underlying runtime that executes the code here, like all of the code execution is performed by code which is public. Yeah, it uses the Deno CLI, just strips out a bunch of stuff it doesn't need. The orchestration code is not public. The orchestration code is proprietary. And yeah, if you have use cases where you would like to run this orchestration code on your own infrastructure, and you have interesting use cases, please email us. We would love to hear from you. [01:02:51] Jeremy: Separate from the orchestration, if it's more of an example of, let's say I deploy a Deno application, and in the case that someone was able to get some, like, malicious code or URLs into my application, could I tell Deno I only want this application to be able to call out to these URLs, just as an example? [01:03:18] Luca: Yes. So it's slightly more complicated, because you can't actually tell it that it can only call out to specific URLs, but you can tell it to call out only to specific domains or IP addresses, which is sort of the same thing, but, uh, just a slightly different layer of abstraction. Yeah, you can do that. The --allow-net flag allows you to specify a set of domains to allow requests to those domains. Yes. [01:03:41] Jeremy: I see. So on the user facing open source part, there are configuration flags where you could say, I want this application to be able to access these domains, or I don't want it to be able to use IO or whatever. [01:03:56] Luca: Yeah, exactly. [01:03:57] Jeremy: There are flags. [01:03:59] Luca: Yeah. And on subhosting, this is done via the isolate configuration, which is like a JSON blob. 
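To make the domain-versus-URL distinction above concrete, here is a toy TypeScript sketch. It is not Deno's implementation: `isHostAllowed` and the host names are hypothetical, and the function only models the idea that a network allow-list matches on host (and optionally port), never on the path. On the CLI, the corresponding real flag is `--allow-net`, which accepts a comma-separated list of hosts, for example `deno run --allow-net=api.example.com main.ts`.

```typescript
// Toy model of a host-based allow-list (hypothetical helper, not Deno's code).
// Entries are "hostname" or "hostname:port"; URL paths are deliberately ignored.
function isHostAllowed(rawUrl: string, allowList: string[]): boolean {
  const url = new URL(rawUrl);
  // url.hostname never contains a port; url.host includes a non-default port.
  return allowList.some(
    (entry) => entry === url.hostname || entry === url.host,
  );
}

// Hypothetical allow-list: one domain and one IP address with a port.
const allowList = ["api.example.com", "127.0.0.1:8080"];

// Any path on an allowed host passes; a different host is rejected.
console.log(isHostAllowed("https://api.example.com/v1/orders", allowList));
console.log(isHostAllowed("https://api.example.com/anything/else", allowList));
console.log(isHostAllowed("https://evil.example.net/exfiltrate", allowList));
```

Because the check never looks at the path, restricting an application to one endpoint on a host would need to happen in application code, which is the "slightly different layer of abstraction" mentioned above.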
And, yeah, like there, there's, it's, but ultimately it's all, it's all sort of, the same concept, just slightly different interfaces because like, like the, the subhosting one needs to be programmatic interface instead of, uh, something you type as an end user. Right? Why deploy your application on the edge? [01:04:20] Jeremy: One of the things you mentioned about Deno Deploy is it's, centered around deploying your application code to a Bunch of different locations. And you also mentioned the, the cold start times very low. could you kind of give the case for wanting your application code at a Bunch of different sites? [01:04:38] Luca: Mm-hmm. . Yeah. So the, the, the, the main benefit of this is that when your user makes request for your application, um, you don't have to round trip back to, um, wherever centrally hosted your application would otherwise be. Like, if you are, a, a startup, even if you're just in the US for example, it's nice to have, points of presence, not just on one of the US coasts, but on both of the US coasts because that means that your round trip time is not gonna be a hundred milliseconds, but it's gonna be 20 milliseconds. this sort of relies on. the, like, this doesn't, there's obviously always the problem here that if your database lives in only one of the two coasts, you still need to do the round trip. And there's solutions to this, which is one caching, uh, that's the, the, the obvious sort of boring solution. Um, and then there's the solution of using databases which are built exactly for this. For example, CockroachDB is a database which is Postgres compatible, but it's really built for, um, global distribution and built for being able to shard data across regions and have different, um, primary regions for different, uh, shards of your, of your, of your tables. 
which means, for example, you could have your users on the East coast, their data could live on a database in the East coast, and your users on the West coast, their data could live on a database on the West coast. And your admin panel, which needs to show all of them, has an aggregate view over both coasts, right? Like, this is something which CockroachDB can do, and it can be a really great thing here. And we acknowledge that this is not something which is very easy to do right now, and Deno tries to make everything very easy. So you can imagine that this is something we're working on, and we're working on database solutions. And actually I should more generally say persistence solutions that allow you to persist data in a way that makes sense for an edge system like this. Um, where the data is persisted close to the users that need it. Consistency in local development vs deployment [01:06:44] Luca: Um, and data is cached around the world, and you still have semantics which are consistent with the semantics that you have when you're locally developing your application. Like, you don't want, for example, in your local application development, there to be strong consistency, but then in production you have eventual consistency where suddenly, I don't know, all of your code breaks because your US West region didn't pick up the changes from US East, because it's eventually consistent, right? I mean, this is a problem that we see with a lot of the existing solutions here, like specifically Cloudflare KV, for example. Cloudflare KV is, um, a system with single primary, um, write regions, where there's just a bunch of caching going on. And this leads to eventual consistency, which can be very confusing for end user developers. Um, especially because if you're using this locally, the local emulator does not emulate the eventual consistency, right? 
so this, this, this can become very confusing very quickly. And so a, anything that we build in, in this persistence field, for example, we take very, we very seriously, um, Weigh these trade offs and make sure that if there's something that's eventually consistent, it's very clear and it works the same way, the same eventually consistent way in the cli. [01:08:03] Jeremy: So for someone, let's say they haven't made that jump yet to use a cockroach. They, they just have their. their database instance in AWS East or whatever. does having the code at the edge where it all ends up needing to go to east, is that better than having the code be located next to the database? [01:08:27] Luca: Yeah. Yeah. It, it, it totally does. Um, there's, there's def there's different, um, there, there's trade-offs here, right? Obviously, like if you have a, a, if you have an admin panel, for example, or a, a like user dashboard, which is very, very reliant on data from your database, and for every single request needs to fetch fresh data from the database, then maybe the trade off isn't worth it. But most applications are not like that. Most applications are, for example, you have a landing page and that landing page needs to do AB tests. and those AB tests are based on some heuristic that you can fetch from the database every five seconds. That's fine. Like, it doesn't need to be perfect, right? So you, you have caching in place, which, um, like by doing this caching locally to the user, um, and, and still being able to programmatically control this, like based on, I don't know, the user's user agent or, the IP address of the user or the region of the user, or. the past browsing history of that user through as, as measured by their cookies or whatever else, right? 
being able to do these highly user customized actions very close to the user means that, like, this is a much better user experience in terms of latency than if you have to do the round trip, especially if you're a startup or a, um, service which is globally distributed and serves not just users in the US or the EU but all across the world. Caching options [01:09:52] Jeremy: And when you talk about caching in the context of Deno Deploy, is there a cache native to the system or are you expecting someone to have, uh, a Redis or a memcached, that sort of thing? [01:10:07] Luca: Yeah. So Deno Deploy actually has, there's a web Cache API, um, which is also the Cache API that's used by service workers and others. And Cloudflare also implements this Cache API. Um, and this is something that's implemented in the Deno CLI, and it's gonna be coming to Deploy this quarter. That's the native way to do caching. And otherwise you can also use Redis, you can use services like Upstash, or, uh, even primitive in-memory caches where it's just an LRU that's in memory, like a JavaScript data structure, right? Or even just a JavaScript Map or JavaScript object with a time on it. And every time you read from it and the time is above some certain threshold, you delete the cache and go fetch it again, right? Like, there's many things that you could consider a cache that are not Redis or the web Cache API. So there's ways to do that. And there's also a bunch of modules in the standard library, or not in the standard library, sorry, in the third party module registry and also on NPM that you can use to implement different cache behaviors. 
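The "JavaScript Map with a time on it" idea described above can be sketched in a few lines of TypeScript. This is a minimal illustration, not code from Deno or Deno Deploy: `TtlCache`, `loadVariant`, and the five-second threshold are all hypothetical, and the clock is injectable only so the expiry logic can be shown without waiting.

```typescript
// Minimal in-memory cache with a time-to-live (illustrative; names are hypothetical).
class TtlCache<V> {
  private entries = new Map<string, { value: V; storedAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // `now` defaults to the real clock; a fake clock can be passed for demos and tests.
  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or calls `load` when the entry is missing or too old.
  get(key: string, load: () => V): V {
    const hit = this.entries.get(key);
    if (hit !== undefined && this.now() - hit.storedAt <= this.ttlMs) {
      return hit.value;
    }
    const value = load(); // stale or missing: fetch again and overwrite
    this.entries.set(key, { value, storedAt: this.now() });
    return value;
  }
}

// Demo with a fake clock: refresh a (hypothetical) A/B test variant every 5 seconds.
let fakeTime = 0;
let loadCount = 0;
const cache = new TtlCache<string>(5000, () => fakeTime);
const loadVariant = () => `variant-${++loadCount}`;

const first = cache.get("ab-test", loadVariant); // miss: loads
fakeTime = 3000;
const second = cache.get("ab-test", loadVariant); // within TTL: served from memory
fakeTime = 9000;
const third = cache.get("ab-test", loadVariant); // past TTL: loads again
console.log(first, second, third, loadCount);
```

For asynchronous loaders, `V` can be a `Promise`, so concurrent readers share one in-flight fetch. Since the cache lives in process memory, its contents disappear whenever the process is restarted, so callers should treat every read as a possible miss.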
[01:11:15] Jeremy: And when you give the example of a in memory cache, when you're running in Deno deploy, you're running in these isolates, which presumably can be shut down at any time. So what kind of guarantees do users have that whatever they put into memory will still be there? [01:11:34] Luca: none like the, it's, it's a cache, right? The cache can be evicted at any time. Your isolate can be restarted at any time. It can be shut down. You can be moved to a different region. The data center could go for, go down for maintenance. Like this is something your application has to be built in, in a way that it is tolerant to, to restarts essentially. but because it's a cache, that's fine. Because if the cache expires or, or, or the cache is cleared through some external means, the worst thing that happens is that you have a cold request again, right? And, if you're serving like a hundred requests a second, I can essentially guarantee to you that not every single request will invoke a cold start. Like, I can guarantee to you that probably less than 0.1% of requests will, will cause a cold start. this is not like SLA anywhere. Um, because it's like totally up to, to however the, the system decides to scale you. but yeah, like it's, it, it would be very wasteful for us, for example, to spin up a new isolate for every request. So we don't, we reuse isolates wherever possible. yeah. It's like it's in our best interest to not cold start you, um, because it's expensive for us to do all the CPU work to, to cold start an isolate, right? Working with CDNs [01:12:47] Jeremy: and typically with applications, people will put a, a CDN in front and they'll use things like cache control headers to be able to serve straight from the CDN Is that a supported use case with Deno Deploy or are there anything that, anything that people should be aware of when they're doing that sort of thing? [01:13:09] Luca: Yeah, so you can do that. 
Um, like you could put a cache in front of Deploy but in most cases it's really not necessary. Um, because the main reasons people use CDNs is, it is essentially to like do this global distribution problem, right? Like you, you want to be able to cache close to users, but if your end application is already executing close to users, the cost of a, of a, of serving something, like serving a request from a JavaScript cache is like marginal. It's so low. there's, there's like no nearly no CPU time involved here. it's, it's network bandwidth. That's the, that's the limiting factor and that's the limiting factor for all CDNs. Uh, so, so whether you're serving on Deploy or you have a, a separate CDN that you put in front of it, hmm. not really that big a difference. Like you can do it. but I don't know. Deno.com doesn't, or, or, and Deno.land, like they don't have a CDN in front of them. They're running bare on, on Deno Deploy and, yeah, it's fine. [01:14:06] Jeremy: So for, even for things like images, for example, something that. Somebody might store in object storage and put a CDN in in front. [01:14:17] Luca: Mm-hmm. [01:14:18] Jeremy: are you suggesting that people could put it on Deno deployed directly or just kind of curious what your thoughts are there? [01:14:26] Luca: Yeah. Uh, like if you have a blog and your profile image is, is part of your blog, right? And you can put that in your static file folder and serve that directly from your Deno Deploy application, like that's totally cool. Uh, you should do that because that's obvious and that's the obvious way to do things. if you're specifically building like a, image serving CDN , go reach out to us because we'd love to work with you. But also, um, like there's probably different constraints that you have. Um, like you probably very, very, very much care about network bandwidth costs, um, because that is like your one number one primary cost factor. so yeah, it's just what's the trade off? 
What trade-offs are you willing to make? Like, does some other provider give you a lower network bandwidth cost? I would argue that if you're building an image CDN, then even if you have to write your application code in Haskell or in whatever, it's probably worth it if you can get, like, a cent cheaper per gigabyte transfer fees, just because 100% of your costs, um, is network bandwidth. So it's really a trade-off based on what you're trying to build. Workloads currently not handled by Deno Deploy (Coming soon) [01:15:36] Jeremy: And if I understand correctly, Deno Deploy is centered around applications that take HTTP requests. So it could be a website, it could be an API, that sort of thing. And sometimes when people build applications, they have other things surrounding them. They'll need scheduled jobs. They may need some form of message queue, things like that. Things that don't necessarily fit into what Deno Deploy currently hosts. And so I wonder, for things like that, what you recommend people do while working with Deno Deploy. [01:16:16] Luca: Yeah. Great question. Unfortunately I can't tell you too much about that without, like, spoiling everything (laughs), but what I'm gonna say is you should keep your eyes peeled on our blog over the next two to three months here. I consider message queues a persistence feature, and we are currently working on persistence features. So yeah, that's all I'm gonna say. But, uh, you can expect Deno Deploy to do things other than, um, just HTTP requests in the not-so-far future, and like cron jobs and stuff like that also, uh, at some point. Yeah. Who's using Deno? [01:16:54] Jeremy: All right. We'll look out for that. I guess as we wrap up, maybe you could give some examples of who's using Deno and what types of projects do you think are ideal for Deno? [01:17:11] Luca: Yeah. 
Yeah. Uh, Deno or Deno Deploy? Like, Deno as in all of Deno, or Deno Deploy specifically? [01:17:17] Jeremy: I mean, I guess either (laughs) [01:17:19] Luca: Okay. Okay. Yeah, let's do it. So, one really cool use case, for example, for Deno is Slack. Uh, Slack has this app platform that they're building, um, which allows you to execute arbitrary JavaScript from inside of Slack, in response to, like, slash commands and actions. I dunno if you've ever seen those little buttons you can have in messages. If you press one of those buttons, that can execute some Deno code. And Slack has built this entire platform around that, and it makes use of Deno's security features and built-in tooling and all that kind of thing. Um, and that's really cool. And Netlify has built Edge Functions, which is a really awesome primitive they have for being able to customize outgoing requests, or even come up with completely new requests on the spot, um, as part of their CDN layer. Uh, also built on top of Deno. And GitHub has built this platform called Flat, which allows you to, um, on cron schedules, pull data into git repositories and post-process that and do things with that. And it's integrated with GitHub Actions, all that kind of thing. It's kind of cool. Supabase also has an Edge Functions product that's built on top of Deno. I'm just thinking about others. Those are the obvious ones that are on the homepage. I know, for example, there's an image CDN actually that serves images with Deno, like 400 million of them a day, kind of related to what we were talking about earlier. Actually, I don't know if it's still 400 million. I think it's more. Um, the last data I got from them was maybe eight months ago, so probably more at this point. Um, yeah. 
A bunch of cool things like that. Um, we have a really active Discord channel and there's always people showcasing what kind of stuff they built in there. We have a showcase channel. If you're really interested in what cool things people are building with Deno, that's a great place to look. I think actually we maybe also have a showcase. Do we have Deno.land/showcase? I don't remember. Showcase. Oh yeah, we do, Deno.com/showcase, which is a page of a bunch of, yeah, projects built with Deno or products using Deno or, um, other things like that. [01:19:35] Jeremy: Cool. If people wanna learn more about Deno or see what you're up to, where should they head? [01:19:42] Luca: Yeah. Uh, if you wanna learn more about the Deno CLI, head to Deno.land. If you wanna learn more about Deno Deploy, head to Deno.com/deploy. Um, if you want to chat to me, uh, you can hit me up on my website, lcas.dev. If you wanna chat about Deno, you can go to discord.gg/deno. Yeah, and if you're interested in any of this and thought that maybe you have something to contribute here, you can either become an open source contributor on our open source project, or if this is really something you wanna work on and you like distributed systems or systems engineering or fast performance, head to deno.com/jobs and send in your resume. We're very actively hiring and we'd be super excited to work with you. [01:20:20] Jeremy: All right, Luca. Well thank you so much for coming on Software Engineering Radio. [01:20:24] Luca: Thank you so much for having me.

Jan 10, 2023 • 1h 38min

Megan Cutrofello on Leaguepedia

Leaguepedia is a MediaWiki instance that covers tournaments, teams, and players in the League of Legends esports community. It's relied on by fans, analysts, and broadcasters from around the world. Megan "River" Cutrofello joined Leaguepedia in 2014 as a community manager and by the end of her tenure in 2022 was the lead for Fandom's esports wikis. She built up a community of contributing editors in addition to her role as the primary MediaWiki developer. She writes on her blog and is a frequent speaker at the Enterprise MediaWiki Conference Topics covered: When to use MediaWiki Visual vs code editor MediaWiki's rough syntax Templates and markup Limiting user input to simplify pages Choosing not to transliterate long player names in certain languages Handling mobile clients Building aliases for search results Creating a single source of truth Roster changes and caching Cargo (Query data in MediaWiki templates using SQL) Hiding implementation details from editors Optimizing for the editor, not a clean codebase Training your users to use workarounds MediaWiki only supports es5 The wiki aesthetic Who is working on the wiki + onboarding Who is using the wiki The future of Leaguepedia How Megan got into wiki development Issues as opportunities to onboard Related Links River Writes - Megan's Blog Leaguepedia - League of Legends esports wiki MediaWiki VisualEditor VueJS in MediaWiki Open issue to support ES6 in MediaWiki Whitespace programming language Lua MediaWiki extensions CharInsert - Add code snippets into the MediaWiki editor Semantic MediaWiki (SMW) - Store and query data inside Wiki pages Cargo - Replaced SMW at Leaguepedia Conference Talks Usage of Cargo with Lua on LoL Gamepedia Mediawiker SublimeText plugin Cargo/Lua Best Practices, and When Not To Use Them MediaWiki Lua Tutorial Editing your wiki with Python is easier than you think Other podcast appearances Between the Brackets Transcript You can help edit this transcript on GitHub. 
[00:00:00] Jeremy: Today I'm talking to Megan Cutrofello. She managed the Leaguepedia eSports wiki for eight years, and in 2017 she got an award for being the unsung hero of the year for eSports. So Megan, thanks for joining me today. [00:00:17] Megan: Thanks for having me. [00:00:19] Jeremy: A lot of the people I talk to are into web development, so they work with web frameworks and things like that. And I guess when you think about it, wikis are web development, but they're kind of their own world, I suppose. for someone who's going to build some kind of a site, like when does it make sense for them to use a wiki versus, uh, a content management system or just like a more traditional web framework? [00:00:55] Megan: I think it makes the most sense to use a wiki if you're going to have a lot of contributors and you don't want all of your contributors to have access to your server. also if your contributors aren't necessarily as tech savvy as you are, um, it can make sense to use a wiki. if you have experience with MediaWiki, I guess it makes sense to use a Wiki. Anytime I'm building something, my instinct is always, oh, I wanna make a Wiki (laughs) . Um, so even if it's not necessarily the most appropriate tool for the job, I always. My, my first thought is, hmm, let's see, I'm, I'm making a blog. Should I make my blog in in MediaWiki? Um, so, so I always, I always wanna do that. but I think it's always, when you're collaborating is pretty much, you always wanna do MediaWiki [00:01:47] Jeremy: And I, I think that's maybe an important point when you say people are collaborating. When I think about Wikis, I think of Wikipedia, uh, and the fact that I can click the edit button and I can see the markup right there, make a change and, and click save. And I didn't even have to log in or anything. And it seems like that workflow is built into a wiki, but maybe not so much into your typical CMS or WordPress or something like that. [00:02:18] Megan: Yeah. 
Having a public ability to solicit contributions from anyone. so for Leaguepedia, we actually didn't have open contributions from the public. You did have to create an account, but it's still that open anyone can make an account and all you have to do is like, go through that one step of create an account. Admittedly, sometimes people are like, I don't wanna make an account that's so much work. And we're like, just make the account. Come on. It's not that hard. but, uh, you still, you're a community and you want people to come and contribute ideas and you want people to come and be a part of that community to, document your open source project or, record the history of eSports or write down all of the easter eggs that you find in a video game or in a TV show, or in your favorite fantasy novels. Um, and it's really about community and working together to create something where the whole is bigger than the sum of its parts. [00:03:20] Jeremy: And in a lot of cases when people are contributing, I've noticed that on Wikipedia when you edit, there's an option for a, a visual editor, and then there's one for looking at the raw markup. in, in your experience, are people who are doing the edits, are they typically using the visual editor or are they mostly actually editing the, the markup? [00:03:48] Megan: So we actually disabled the Visual editor on Leaguepedia, because the visual editor is not fantastic at knowing things about templates. Um, so a template is when you have one page that gets its content pulled into the larger page, and there's a special syntax for that, and the visual editor doesn't know a lot about that. Um, so that's the first reason. And then the second reason is that, there's this, uh, one extension that we use that allows you to make a clickable, piece of text. It's called (https://www.mediawiki.org/wiki/Extension:CharInsert) CharInserts, uh, for character inserts. 
so I made a lot of these things that are sort of along the same philosophy as Visual Editor, where it's to help people not have the burden of knowing every exact piece of source that has to be inserted into the page. So you click the thing that says, like, um, insert a pick and ban prefill, and then a little piece of JavaScript fires and it inserts a whole bunch of wikitext, and then you just enter the champions in the correct places. Champions are the characters that you play in, uh, League of Legends. And so then the text is prefilled for you and you only have to fill in this outline. So Visual Editor would conflict with CharInserts, and I much preferred the CharInserts approach, where you have this compromise in between never interacting with source and having to have all of the source memorized. So between the fact that Visual Editor is not a perfect tool and has these bugs in it, and also the fact that I preferred CharInserts, we didn't use Visual Editor at all. I know that some wikis do like to use Visual Editor quite a bit, and especially if you're not working with these templates where you have all of these prefills, it can be a lot more preferred to use Visual Editor. Visual Editor is an experience much more similar to editing something like Microsoft Word. It doesn't feel like you're editing code. And editing code is, I mean, it's scary. Like, when I said MediaWiki is for when you have editors who aren't as tech savvy as the person who set up the wiki, for people who don't have that experience, I mean, when you just say you have to edit a wiki, someone who's never done that before, they can be very intimidated by it. And you're trying to build a sense of community. You don't want to scare away your potential editors. You want everyone to be included there. So you wanna do everything possible to make everyone feel safe to contribute their ideas to the wiki. 
And if you make them have to memorize syntax, like even something that to me feels as simple as like two open brackets and then the name of a page, and then two closed brackets means linking the page. Like, I mean, I'm used to memorizing a lot of syntax because like, I'm a programmer, but someone who's never written code before, I mean, they're not used to memorizing things like that. So they wanna be able to click a button that says insert link, and then type the name of the page in the middle of the things that pop up there. Um, so Visual Editor is, it's a lot safer to use. So a lot of wikis do prefer that. And if it, if it didn't have the bugs with the type of editing that my Wiki required, and if we weren't using CharInserts so much, we definitely would've gone for it. But, um, it wasn't conducive to the wiki that I built, so we didn't use it at all. [00:07:42] Jeremy: And the, the compromise you're referring to, is it where the editor sees the raw markup, but then they can, there's like little buttons on the side they can click and they'll know, okay, if I click this one, then it's going to give me the text for creating a list or something like that. [00:08:03] Megan: Yeah, it's a little bit more high level than creating a list because I would never even insert the raw syntax for creating a list. It would be a template that's going to insert a list at the very end. But basically that, yeah. [00:08:18] Jeremy: And I, I know for myself, even though I do software development, if I click edit on a wiki and there's all the different curly brace tags, there's the square tags, and I think if you spend some time with it, you can kind of get a sense of what it means. But for the average person who doesn't work with software in their day to day, do, do you find that, is that a big barrier for them where they, they click edit and there's all this stuff that they don't really understand? Is that where some people just, they go, oh, I don't, I don't know what to do?
[00:08:59] Megan: I think the biggest barrier is actually clicking edit in the first place. So that was a big barrier to me actually. I didn't wanna click edit in the first place, and I guess my reasons were maybe a little bit different where for me it was like, I know that if I click edit, this is going to be a huge rabbit hole and I'm going to learn way too much about wikis and this is going to consume my entire life and look where I ended up. So I guess I was pretty right about that. I don't know if other people feel the same way or if they just like, don't wanna get involved at all. But I think once people click edit, they're able to figure it out pretty well. I think there's, there's two barriers or maybe three barriers. The first one is clicking edit in the first place. The second one is if they learn to code templates at all. MediaWiki syntax is literally the worst I have encountered other than programming languages that are literally parodies. So like the white space language is worse (laughs https://en.wikipedia.org/wiki/Whitespace_(programming_language)), but like it's two curly braces for a template and it's three curly braces for a variable. And like, are you actually kidding me? One of my blog posts is like a plea to editors to write a comment saying the name of the template that they're ending because MediaWiki like doesn't provide any syntax for what you're ending. And there's no, like, there's no indentation. So you can't visually see what you're ending. And there's no. So when I said the white sp white space language, that was maybe appropriate because MediaWiki prints all of the white space because it's really just like, PHP functions that are put into the text that you're literally putting onto the page. So any white space that you put gets printed. So the only way to put white space into your code is if you comment it out. So anytime you wanna put a new line, you have to comment out your new line.
And if you wanna indent your code, you have to comment out the indents. So it's just, I, I'm, I'm not exaggerating here. It's, it's just the worst. Occasionally you can put a little bit of white space, because there's like some divisions in parser functions that get handled when it gets sent to the parser. And, but I mean, for the most part it's just, it's just terrible. So if I'm like writing an if statement, I'll write if, and then I'll write a commented out endif at the end. So once an editor starts to write templates, like with parser functions and stuff, that's another big barrier because, and that's not because like people don't know how to code, it's just because the MediaWiki language, and I use language very loosely, it's like this collection of PHP functions poured into this just disaster. It's just, it's not good! (laughs) And the, the next barrier is when people start to jump to Lua, which is just, I mean, it's just Lua where you can write Lua modules and then, Lua is fine. It's great, it has white space and you can make new lines and it's absolutely fine and you can write an entire code base and as long as you're writing Lua, it's, it's absolutely fantastic and there's nothing wrong with it anymore (laughs). So as much as I just insulted the MediaWiki language, like writing Lua in MediaWiki is great (laughs). So for, for most of my time I was writing Lua. Um, and I have absolutely no complaints about that except that Lua is one-indexed, but actually the one indexing of Lua is fine because MediaWiki itself is one-indexed. So people complain about Lua being one-indexed, and I'm like, what are you talking about? If it's, if another language were used, then you'd have all of this offsetting when you go to your scripting language because you'd have like the first argument from your template in MediaWiki going into your scripting language, and then you'd have to offset it to zero and everyone would be like vastly confused about what's going on.
So you should be thankful that they picked a language that's one-indexed because it saves you all of this headache. So anyway, sorry for that tangent, but it's very good that we picked a one-indexed language. [00:13:17] Jeremy: When you were talking about the, the if statement and having to put in comments to have white space, is it, cuz like when I think about an if statement in most languages, the, the if statement isn't itself rendering anything, it's like deciding if you're going to do something inside of the if. So like what, what would that white space do if you didn't comment it out in the context of the if? [00:13:44] Megan: So actually you would be able to put some white space inside of an if statement, but you would not be able to put any white space after an if statement. And there, most likely inside of the if statement, you're printing variables or putting other parser functions. And the other parser functions also end in like two curly braces. And, depending on what you're printing, you're likely ending with a series of like five or eight, or, I don't know, some very large set of curly braces. And so what I like to do is I would like to be able to see all of the things that I'm ending with, and I wanna know like how far the nesting goes, right. So I wanna write like an end if, and so I have to comment that out because there's no like end if statement. So I comment out an end if there. It's more that you can't indent the statements inside of the if, because anything that you would be printing inside of your code would get printed. So if I like write text inside of the code, then that indentation would get printed into the page. And then if I put any white space after the if statement, then that would also get printed. So technically you can have a little bit of white space before the curly braces, but that's only because it's right before the curly braces and PHP will strip the contents right inside of the parser function.
So basically if PHP is stripping something, then you're allowed to have white space there. But if PHP isn't stripping anything, then all of the white space is going to be printed and it's like so inconsistent that for the most part it's not safe to put white space anywhere because you don't, you have to like keep track of am I in a location where PHP is going to be stripping something right now or not? and I, I wanna know what statement or what variable or what template I'm closing at any location. So I always want to, write out what I'm closing everywhere. And then I have to comment that because there was no foresight to put like an end if clause in this white space, sensitive language. [00:16:22] Jeremy: Yeah, I, I think I see what you mean. So you have, if you're gonna start an, if you have the, if inside these curly braces, but then, inside the, if you typically are going to render some text to the page, and so intuitively you would indent it so that it's indented in from the if statement. But then if you do that, then it's gonna be shifted to the right on, on the Wiki. Did I get that right? [00:16:53] Megan: Yeah. So you have the flexibility to put white space immediately because PHP will strip immediately, but then you don't have flexibility to put any white space after that, if that makes sense. [00:17:11] Jeremy: So, so when you say immediately, is that on the following line or is that [00:17:15] Megan: yeah, so any white space before the first clause, you have flexibility. So like if you were to put an if statement, so it's like if, and then there's a colon, all of the next white space will get stripped. Um, so then you can put some text, but then, if you wanted to like put some text and then another if statement nested within the first if statement. It's not like Lua where you could like assign a variable and then put a comment and then put some more white space and then put another statement. 
And it's white space insensitive because you're just writing code and you haven't returned anything yet. It, it's more like Jinja (View templating language) than Python for, for an analogy. So everything is getting printed because you're in like a, this templating language, not actually a programming language. Um, so you have to work as if you're in a templating language about, you know, 70% of the time, unless you're in this like very specific location where PHP is stripping your white space because you're at the edge of an argument that's being sent there. So it's like incredibly inconsistent. And every now and then you get to like, pretend that you're in an actual language and you have some white space that you can indent or whatever. It's just incredibly inconsistent, which is like what you absolutely want out of a programming language (laughs) [00:18:56] Jeremy: Yeah, it's like you're, you're writing templates, but like, it seems like because of the fact that it's using PHP, there's weird exceptions to the behavior. Yeah. [00:18:59] Megan: Exactly. Yeah. [00:19:01] Jeremy: And then you also mentioned these, these templates. So, if I understand correctly, this is kind of like how a lot of web frameworks will have partials, I guess, where you'll, you'll be able to have a webpage, but it's made up of different, I don't know if you would call them components, but you're able to build a full page that's made up of a bunch of different pieces. So you could have a [00:19:31] Megan: Yeah. Yeah, that's a good analogy. [00:19:33] Jeremy: Where it's like, here's my table of contents, or here's my info box, or things like that. And those are all things that you would create a MediaWiki template for, and then somehow the, the data gets passed into those templates and the template decides how to, to render it out. [00:19:55] Megan: Yeah.
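To make the commented-out white space trick from this part of the conversation concrete, here is a minimal Python sketch. It is only a toy model, assuming that HTML comments are removed from wikitext before output and that everything else, including new lines and indentation, is printed as-is. The function name `strip_comments` and the sample strings are made up for illustration; this is not MediaWiki's actual parser.

```python
import re

# Assumption: comments of the form <!-- ... --> are dropped before rendering,
# while all other characters (including white space) reach the page.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)

def strip_comments(wikitext: str) -> str:
    """Remove HTML comments and leave every other character untouched."""
    return COMMENT.sub("", wikitext)

# A bare new line plus indentation survives into the output.
bare = "Line one\n    Line two"

# The same layout, but the new line and indent live inside a comment,
# so they vanish when the comment is stripped.
hidden = "Line one<!--\n    -->Line two"

# A hand-written closing label, since there is no real "end if" syntax.
labeled = "{{#if: x | yes }}<!-- end if -->"

print(repr(strip_comments(bare)))
print(repr(strip_comments(hidden)))
print(repr(strip_comments(labeled)))
```

In this model the comment is the only place where layout can be added for the person reading the source without changing what the reader of the page sees.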
[00:19:56] Jeremy: And for these, these templates, I, I noticed on some of the Leaguepedia pages, I noticed there's some html in some of them. I was curious if that's typical to write them with HTML or if there are different ways native to Media Wiki for, for, creating these templates. [00:20:23] Megan: Um, it depends on what you're doing. MediaWiki has a special syntax for tables specifically. I would say that it's not necessarily recommended to use the special syntax because occasionally you can get things to not work out fantastically if people slightly break things. But it's easier to use it. So if you know that everything's going to work out perfectly, you can use it. and it's a simple shortcut. if you go to the help page about tables on Wikipedia, everything is explained, and not all HTML works, um, for security reasons. So there's like a list of allowed, things that you can use, allowed tags, so you can't put like forms and stuff natively, but there's the widgets extension that you can use and widgets just automatically renders all html that you put inside of a widget. Uh, and then the security layer there is that you have to have a special permission to edit a widget. so, you only give trusted people that permission and then they can put the whatever html they want there. So, we have a few forms on Leaguepedia that are there because I edited, uh, whichever widgets, and then put the widgets into a Lua module and then put the Lua module into a template and then put the template onto the page. I was gonna say, it's not that complicated. It's not as complicated as it sounds, but I guess it really is as complicated as it sounds (laughs) . Um, so, uh, I, I won't say that. I don't know how standard it is on other wikis to use that much html, I guess Leaguepedia is pretty unique in how complicated it is. There aren't that many wikis that do as many things as we did there. but tables are pretty common. 
I would say like putting divs places to style them, uh, is also pretty common. But beyond that, usually there's not too many HTML elements just because you typically wanna be mobile friendly and it's relatively hard to stay mobile friendly within the bounds of MediaWiki if you're like putting too many elements everywhere. And then also allowing users to put whatever content inside of them that they want. The reason that we were able to get away with it is because despite the fact that we had so many editors, our content was actually pretty limited. Like if there's a bracket, it's only short team names going into it. So, and short team names were like at most five or six characters long, so we don't have to worry about like overflow of team names. Although we designed the brackets to support overflow of team names, and the team names would wrap around and the bracket would not break. And a lot of CSS magic went into making that work that we worked really hard on and then did not end up using (laughs) [00:23:39] Jeremy: Oh no. [00:23:41] Megan: Only short team names go into brackets. But, that's okay. Uh, and then for example, like in, uh, schedules and stuff, a lot of fields like only contain numbers or only contain timestamps. There's like a lot of tables again where like there's only two digit numbers with one decimal point and stuff like that. So a lot of the stuff that I was designing, I knew the content was extremely constrained, and if it wasn't then I said, well, too bad. This is how I'm telling you to put the content. Um, and for technical reasons, that's the content that's gonna go here and I don't care. So there's like, a lot of understanding that if I said for technical reasons, this is how we have to do it, then for technical reasons, that was how we had to do it. And I was very lucky that all of the people that I worked with like had a very big appreciation with like, for technical reasons, like argument over. This is what's happening.
And I know that with like different people on staff, like they would not be willing to compromise that way. Um, so I always felt like extremely lucky that like if I couldn't figure out a way to redesign or recode something in order to be more flexible, then like that would just be respected. And that was like how we designed something. But in general, like it's, if you are not working with something as rigid as, I mean, and like the history of eSports sounds like a very fluid thing, but when you think about it, like it's mostly names of teams, names of players and statistics. There's not that much like variable stuff going on with it. It's very easy to put in relational databases. It's very easy to put in fixed width tables. It's very easy to put in like charts that look the same on every single page. I'm not saying it was always easy to like write everything that I wrote, and it's not, it wasn't always easy to like, deal with designs and stuff, but like relative to other topics that you can pick, it was much easier to put constraints on what was going to go where because everything was very similar across regions, across, although actually one thing. Okay, so this will be like the, the exception that proves the rule. Uh, we would transliterate players' names when we showed them in team rosters. So, uh, for example, when we were showing the hangul, the Korean player's names, we would show an English translation also. Um, and we would do this for every single alphabet. But Hungarian players' names are really, really, really long. And so the transliteration doesn't fit in the table when we show the translation to the Roman alphabet. And so we couldn't do this, so we actually had to make a Cargo table of alphabets that are allowed to be transliterated into the Roman alphabet, uh, when we have players' names in that alphabet.
So we had, like, hangul was allowed and Arabic was allowed, and I can't remember the exact list, but we had like three alphabet, three or four alphabets were allowed and the rest of the alphabets were disallowed to be transliterated into, uh, the Roman alphabet. And so again, we made up a rule that was like a hard rule across the entire Wiki where we forced the set of alphabets that were transliterated so that these tables could be the same size roughly across every single team page because these Hungarian player names are too long (laughs). So I guess even this exception ended up being part of the rule of everything had to be standardized because these tables were just way too wide and they were running into the info box. They couldn't fit on the side. So it's really hard when you have like arbitrary user entered content to fit it into the HTML that you design. And if you don't have people who all agree to the same standards, I mean, even when we did have people who agreed to all of the same standards, it was really, really, really hard. And we ended up having things like a table of which alphabets to transliterate. Like that's not the kind of thing that you think you're going to end up having when you say, let's catalog the history of League of Legends eSports. [00:28:40] Jeremy: And, and so when, let's say you had a language that you couldn't transliterate, what would go into the table? [00:28:49] Megan: Uh, just the native alphabet. [00:28:51] Jeremy: Oh I see. Okay. [00:28:53] Megan: Yeah. And then if they went to the player page, then you would be able to see it transliterated. But it wouldn't show up on the team page. [00:29:00] Jeremy: I see.
And then to help people visualize what some of these things you're talking about look like when you're talking about a, a bracket, it's, is it kind of like a tree structure where you're showing which teams are facing which teams and okay, [00:29:19] Megan: We had a very cool CSS grid structure that used like before and after pseudo elements to generate the lines, uh, between the teams and then the teams themselves were the elements of the grid. Um, and it's very cool. Uh, I didn't design it. Um, I have a friend who I very, very fortunately have a friend who's amazing at CSS because I am like mediocre at CSS and she did all of our CSS for us. And she also like did most of our designs too. Uh, so the Wiki would not be like anything like what it is without her. [00:30:00] Jeremy: And when you're talking about making sure the designs fit on desktop and, and mobile, um, I think when you were talking earlier, you're talking about how you have these, these templates to build these tables and the, these, these brackets. Um, so I guess in which part of the wiki is it ensuring that it looks different or that it fits when you're working with these different screen sizes? [00:30:32] Megan: Usually it's a pure CSS solution. Every now and then we hide an element on mobile altogether, and some of that is actually MediaWiki core, for example, in, uh, nav boxes don't show up on mobile. And that's actually on Wikipedia too. Uh, well, I guess, yeah. I mean, being MediaWiki core. So if you've ever noticed the nav boxes that are at the bottom of pages on Wikipedia, just don't show up on like en.m.wikipedia.org. And that way you're not like loading, you're not loading, but display noneing elements on mobile. But for the most part it's pure CSS solutions. Um, so we use a lot of, uh, display flex to make stuff, uh, appropriate for mobile. Um, some media rules. Sometimes we display none stuff for mobile.
Uh, we try to avoid that because obviously then mobile users aren't getting like the full content. Occasionally we have like overflow rules, so you're getting scroll bars on mobile and then every now and then we sort of just say, too bad if you're on mobile, you're gonna have not the greatest solution or not the greatest, uh, experience. That's typically for large data tables. So the general belief at Fandom was like, if you can't make it a good experience on mobile, don't put it on the Wiki. And I just think that's like the worst philosophy because like then no one gets a good experience. And you're just putting less content on the Wiki so no one gets to enjoy it, and no one gets to like use the content that could exist. So my philosophy has always been like the, the core overview pages should be as good as possible for both PC and mobile. And if you have to optimize for one, then you slightly optimize for mobile because the majority of traffic is mobile. But attempt not to optimize for either one and just make it a good experience on both. But then the pages behind that, I say behind because we like have tabs views, so they're like sort of literally behind because it looks like folders sort of, or it looks like the tabs in a folder and you can, like, I, I don't know, it, it looks like it's behind (laughs), the, the more detailed views where it's just really hard to design for mobile and it's really easy to design for PC and it just feels very unlikely that users on mobile are going to be looking at these pages in depth. And it's the sort of thing a PC user is much more likely to be looking at, and you're going to have like multiple windows open and you're gonna be tabbing between them and you're gonna be doing all of your research at PC. You absolutely optimize this for PC users. Like, what the hell is this? These are like stats pages. It's pages and pages and pages of stats. It's totally fine to optimize this for PC users.
And if the option is like, optimized for PC users or don't create it at all, what are you thinking To not create it at all, like make it a good experience for someone? So I don't, I don't understand that philosophy at all. [00:34:06] Jeremy: Did you, um, have any statistics in terms of knowing on these types of pages, these pages that are information dense or have really big tables? Could you tell that? Oh, most of the people coming here are on computers or, or larger screens. [00:34:26] Megan: I didn't have stats for individual pages. Um, mobile I accidentally lost Google Analytics access at some point, and honestly I wasn't interested enough to go through the process of trying to get it back. when I had it, it didn't really affect what I put time into, because it was, it was just so much what I expected it to be. That it, it didn't really affect much. What I actually spent the most time on was looking, so you can, uh, you get URLs for search results. And so I would look through our search results, and I would look at the URL of the failed search results and, so there would be like 45 results for this particular failed search. And then I would turn that into a redirect for what I thought the target was supposed to be. So I would make sure that people's failed searches would actually resolve to the correct thing. So if they're like typo something, then I make the typo actually resolve. So we had a lot of redirects of like common typos, or if they're using the wrong name for a tournament, then I make the wrong name for the tournament resolve. So the analytics were actually really helpful for that. But beyond that, I, I didn't really find it that useful. [00:35:48] Jeremy: And then when you're talking about people searching, are these people using a search box on the Wiki itself And not finding what they were looking for? [00:36:00] Megan: Yeah. 
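The failed-search workflow Megan describes (collect the URLs of searches that found nothing, count the repeats, then turn the common ones into redirects) could be sketched roughly like this in Python. This is only an illustration under assumptions: the `search` query parameter, the `wiki.example` URLs, the threshold, and both function names are hypothetical, and the target page for each typo is still chosen by a person, as in her description.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

def failed_search_terms(urls, min_count=2):
    """Count search terms in failed-search URLs; keep those seen at least min_count times."""
    counts = Counter()
    for url in urls:
        query = parse_qs(urlparse(url).query)
        # Assumption: the search text is carried in a "search" query parameter.
        for term in query.get("search", []):
            counts[term.strip().lower()] += 1
    return {term: n for term, n in counts.items() if n >= min_count}

def propose_redirects(frequent_terms, chosen_targets):
    """Pair each frequent failed term with the target page an editor picked for it."""
    return {term: chosen_targets[term] for term in frequent_terms if term in chosen_targets}

# Hypothetical analytics export of failed-search URLs.
urls = [
    "https://wiki.example/index.php?search=New+York+Ciyt",
    "https://wiki.example/index.php?search=new+york+ciyt",
    "https://wiki.example/index.php?search=Wrold+Championship",
]

frequent = failed_search_terms(urls, min_count=2)
redirects = propose_redirects(frequent, {"new york ciyt": "New York City"})
print(redirects)
```

Keeping the typo-to-target mapping as a hand-maintained dictionary mirrors the manual judgment call in the conversation: the script only surfaces candidates, it does not guess what the searcher meant.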
So like the internal search, so like if you search Wikipedia for like New York City, but you spell it C I Y T, then you're not going to get a result. But it might say, did you mean New York C I T Y? If like 45 people did that in one month, then that would show up for me. And then I don't want them to be getting, like, that's a bad experience. Sure. They're eventually getting there, but I mean, I don't want them to have to spend that extra time. So I'm gonna make an automatic redirect from C I Y T to C I T Y. [00:36:39] Jeremy: And, and maybe we should have talked about this a little earlier, but the, all the information on Leaguepedia is, it's about all of the different matches and players, um, who play League of Legends. So when you edit a, a page on Wikipedia, all of that information, or a lot of it I think is, is hand entered by, by people and on Leaguepedia, which has all this information about like what, how teams did in a tournament or, intricate stats about how a game went. That seems like a lot of information for someone to be hand entering. So I was wondering how much of that information is somebody actually manually editing those things and how much is, is done automatically or programmatically. [00:37:39] Megan: So it's mostly hand entered. We do have a little bit of it that's automated, via a couple scripts, but for the most part it's hand entered. But after being hand entered into a couple of data pages, it gets propagated a lot of times based on a bunch of Lua modules and the Cargo extension. So when I originally joined the Wiki back in 2014, it was hand entered. Not just once, but probably, I don't know, seven times for tournament results and probably 10 or 12 times for roster changes. It was, it was a lot. And starting in 2017, I started rewriting all of the code so that it was entered exactly one time for everything. Tournament results get entered one time into a data page and roster changes get entered one time into a data page.
And, for roster changes, that was very difficult because, for a roster change that needs to update the team history on a player page, which goes, from a join to a leave and it needs to update the, the like roster, change portal for the off season, which goes from a leave to a join because it's showing like the deltas over the off season. And it needs to update the current team in the, player's info box, which means that the current team has to be calculated from all of the deltas that have ever occurred in that player's history and it needs to update. Current rosters in the team pages, which means that the team page needs to know all of the current players who are currently on the team, which again, needs to know all of the deltas from all of history because all that you're entering is the roster changes. You're not entering anyone's current team. So nowhere on the wiki does it ever store a current team anymore. It only stores the roster changes. So that was a lot of code to write and deciding even what was going to be entered was a lot because, all I knew was that I was going to single source of truth that somehow and I needed to decide what was I going to single source of truth. So I decided, um, that I was going to be this Delta and then deciding what to do with that, uh, how to store it in a relational database. It was, it was a big project. and I didn't have a background as a developer either. so this was like, I don't know, this was like my third big project ever. So, that was, that was pretty intense. but it was, it was a lot of fun. so it is hand entered but I feel like that's underselling it a little bit. [00:40:52] Jeremy: Yeah, cuz I was initially, I was a little confused when you mentioned how somebody might need to enter the same information multiple times. But, if I understood correctly, it would be if somebody's changing which team they're on, they would have to update, for example, the player's page and say like, oh, this player is on this team now. 
And then you would have to go to their old team and remove them from the roster there. Go to the new team, add them to the roster there, And you can see where it would kind [00:41:22] Megan: Yeah. And then there's the roster, there's the roster nav box, and there's like the old team, you have to say, like the next team. Cuz in the previous players list, like we show former team members from the old team and you have to say like the next team. Uh, so if they had like already left their old team, you'd have to say like, new team. Yeah, there's a, there's a lot of, a lot of places. [00:41:50] Jeremy: And so now what it sounds like is, I'm not sure this is exactly how it works, but if you go to any location that would need that information, which team is this player on? When you go to that page, for example, if you were to go to, uh, a teams page, then it would make a SQL query to figure out I guess who most recently had a, I forget what you called it, but like a join row maybe, or like a, they, they had the action of joining this team, and now, now there's a row in the database that says they did this. [00:42:30] Megan: it actually looks at the ten-- so I have an in in between table called tenures. And so it looks at the tenures table instead of querying all the way through the joins and leaves table and doing like the whole list of deltas. yeah. So, and it's also cached so you, it doesn't do the SQL query every time that you load the page. So the only time that the SQL queries actually happen is if you do a save on the page. And then otherwise the entire generated HTML of the page is actually cached on the server. So you're, you're not doing that many database queries every time you load the page, so don't worry about that. but there, there can actually be something like a hundred SQL queries sometimes, when you're, saving a page. So it would be absolute murder if you were doing that every time you went to the page. But yeah, it works. Something like that. 
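As a rough illustration of the single-source-of-truth idea discussed here, the Python sketch below stores only join and leave deltas and derives everything else (tenures, current rosters, former members) from them. It is a simplified model with made-up names and data; the real wiki does this with Lua modules and Cargo tables, and its actual schema is not shown in this conversation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RosterChange:
    date: str    # ISO date, so plain string sorting matches time order
    player: str
    team: str
    action: str  # "join" or "leave"

@dataclass
class Tenure:
    player: str
    team: str
    start: str
    end: Optional[str] = None  # None means the player is still on the team

def build_tenures(changes):
    """Fold join/leave deltas into one tenure per stint, whatever order they were entered in."""
    open_tenures = {}
    tenures = []
    for change in sorted(changes, key=lambda c: c.date):
        key = (change.player, change.team)
        if change.action == "join":
            tenure = Tenure(change.player, change.team, change.date)
            open_tenures[key] = tenure
            tenures.append(tenure)
        elif change.action == "leave" and key in open_tenures:
            open_tenures.pop(key).end = change.date
    return tenures

def current_roster(tenures, team):
    """Players whose tenure on the team has no end date."""
    return sorted(t.player for t in tenures if t.team == team and t.end is None)

def former_members(tenures, team):
    """Players with at least one finished tenure on the team."""
    return sorted({t.player for t in tenures if t.team == team and t.end is not None})

# Hypothetical data, entered out of date order on purpose.
changes = [
    RosterChange("2021-11-20", "Ana", "Blue", "join"),
    RosterChange("2020-01-10", "Ana", "Red", "join"),
    RosterChange("2021-11-15", "Ana", "Red", "leave"),
    RosterChange("2020-01-10", "Bo", "Red", "join"),
]

tenures = build_tenures(changes)
print(current_roster(tenures, "Red"))
print(current_roster(tenures, "Blue"))
print(former_members(tenures, "Red"))
```

Nothing in this model stores a "current team" field. The player page, the team page, and the roster change portal would each be a different read over the same deltas, which is why one edit is enough, and an intermediate tenures structure saves every reader from replaying the full history.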
[00:43:22] Jeremy: Okay, so this, this tenures table is, that's kind of like what's the current state of all these players and where they are, and then. [00:43:33] Megan: Um, the, the tenures table, caches sort of, or I guess the tenure table captures is a better word than caches um, every, join to leave historically from every team. Um, and then I save that for two reasons. The first one is so that I don't have to recompute it, uh, when I'm doing the team's table, because I have to know both the current members and the former members. And then the second reason is also that we have a public api and so people can query that. if they're building tools, like a lot of people use the public api, uh, for various things. And, one person built like, sort of like a six degrees of Kevin Bacon except for League of Legends, uh, using our tenures tables. So, part of the reason that that exists is so that uh, people can use it for whatever projects that they're doing. Cause the join, the join leave table is like pretty unfriendly and I didn't wanna have to really document that for anyone to use. So I made tenures so that that was the table I could document for people to use. [00:44:39] Jeremy: Yeah. That, that's interesting in that, yeah, when you provide an api, then there's so many different things people can do that even if your wiki didn't really need it, they can build their own apps or their own pages built on all this information you've aggregated. [00:44:58] Megan: Yeah. It's nice because then when someone says like, oh, can you build this as a feature request? I can say no, but you can (laughs) [00:45:05] Jeremy: Well you've, you've done the, the hard part for them (laughs) [00:45:09] Megan: Yeah. exactly. [00:45:11] Jeremy: So that's cool. Yeah. that's, that's interesting too about the, the caching because yeah, I guess when you think about a wiki, most of the people who are visiting it are just visiting it to see what's on there. 
So the, provided that they're not logged in and they don't need anything specific to them. Yeah, you should be able to cache the whole response. It sounds like. [00:45:41] Megan: Yeah. yeah. Caching was actually a nightmare with this in this particular thing. the, the team roster changes, because, so cargo, which I mentioned a couple times is the database extension that we used. Um, and it's basically a SQL wrapper that like, doesn't port 80% of the features that SQL has. so you can create tables and you can query, but you can't make, uh, like sub-select queries. So your queries have to be like very simple. which is good for like most users of MediaWiki because like the average MediaWiki user doesn't have that much coding experience, but if you do have coding experience, then you're like, what, what, what am I doing? I can't, I can't do anything. Um, but it's a very powerful tool, still compared to most of what you could do with Media Wiki without this, basically you're adding a database layer to your software stack, which I mean, I, I, that's what you're doing, (laughs) Um, so you get a huge amount of power from adding cargo to a wiki. Um, in exchange it's, it's very performance. It's like, it's, it, it's resource heavy. uh, it hurts your performance a lot. and if you don't need it, then you shouldn't use it. But frequently you need it when you're doing, difficult or not necessarily difficult, but like intensive things. Um, anytime that you need to pull data from one page to another, you wanna use something like that. Um, So cargo, uh, one of the things that it doesn't do is it doesn't allow you to, uh, set a primary key easily. so you have to like, just like pretend that one row in the table is your primary key, basically. it internally automatically sets one, but it won't be static or it won't be the same every time that you rebuild the table because it rebuilds the table in a random order and it just uses an auto increment primary key. 
So you set a row in the table to pretend to be your ran, to pretend to be your primary key. But editors don't know what, your editors don't understand anything about primary keys. And you wanna hide this from them completely. Like, you cannot tell an editor, protect this random number, please don't change this. So you have to hide it completely. So if you're making your own auto increment, like an editor cannot know that that exists. Like this is back to when we were talking about like visual editor. This is like, one of the things about making the wiki safe for people is like not exposing them to the internals of like, anything scary like that. So for example, if an editor accidentally reorders two rows and your roster change data like that has to not matter. Because that can't break the entire wiki. They, you can't make an editor like freak out because they just reordered two rows in, in the page. And you can't put like a scary notice somewhere saying, under no circumstances reorder two rows here. Like, that's gonna scare people away. And you wanna be very welcoming and say like, it's impossible to break this page no matter how hard you tried. Don't worry. Anything you do, we can just fix it. Don't worry. But the thing is that everything's going to be cached. And so in particular, um, when I said I made that tenures table, one thing I did not wanna do was resave every single row from the join leave table. So you had to join back to, sorry, I'm going to use, join in two different connotations. you had to join back to the join leave table in order to get like all of the auxiliary data, like all of the extra columns, like, I don't know, like role, date, team name and stuff. Because otherwise the tenures table would've had like 50 columns or something. 
So I needed to store the fake primary key in the tenures table, but the tenures table is cached on the player page and the join leave table is on the data page, which means that I need to purge the cache on the player page anytime that someone edits the data on the data page. Which means that, so there's like some JavaScript that does that, but if someone like changes the order of the lines, then that primary key is going to change because I have an auto increment going on. And so I had to like very, very carefully pick a primary key here so that it was literally impossible for any kind of order change to affect what the primary key was, so that the cache on the player page wasn't going to be changed by anything that the editor did unless they were going to then update the cache on that player page after making that change. If that makes sense. So after an editor makes a change on the news page, they're going to press a button to update the cache on the player page, but they're only going to update the player page for the one line that they changed on the news page. These, uh, primary keys had to be like super invariant for accidental row moves, or also later on, like entire moves of separating a bunch of these data pages into like separate subpages because the pages were getting too big and it was like timing out the server because there were too many stores to the database on a single page every time you save the page. And anyway, it took me like five iterations of making the primary key like more and more specific to the single line because my auto increment was like originally including every single line I was auto incrementing and then I auto incremented only when that single player was involved. And then I auto incremented only when that player and the team was involved. And then I reset the auto increment for that date. So it just got like more and more convoluted what my primary key was. It was, it was a mess.
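To make the key scoping concrete, here is a minimal sketch that compares a page-wide counter with a counter scoped to player, team, and date. The row fields and the function names `page_wide_keys` and `scoped_keys` are assumptions made for illustration, not the wiki's real implementation; in particular, leaving the page name out of the scoped key is a guess based on the remark about rows later moving to subpages.

```python
from collections import Counter


def page_wide_keys(page, rows):
    """Naive scheme: page name plus the row's position on the page."""
    return {row["label"]: f"{page}_{i}" for i, row in enumerate(rows, start=1)}


def scoped_keys(rows):
    """Counter restarts for every (player, team, date) combination."""
    counters = Counter()
    keys = {}
    for row in rows:
        for player in row["players"]:
            scope = (player, row["team"], row["date"])
            counters[scope] += 1
            keys[(row["label"], player)] = (
                f"{player}_{row['team']}_{row['date']}_{counters[scope]}"
            )
    return keys


# Two hypothetical news lines on the same data page.
a = {"label": "a", "players": ["Ana"], "team": "Red", "date": "2021-01-01"}
b = {"label": "b", "players": ["Bo"], "team": "Red", "date": "2021-01-01"}

original = [a, b]
reordered = [b, a]  # an editor tidies the page without changing any content

page_wide_changed = page_wide_keys("News", original) != page_wide_keys("News", reordered)
scoped_changed = scoped_keys(original) != scoped_keys(reordered)
```

In this sketch `page_wide_changed` is true and `scoped_changed` is false: a key only shifts if two lines that share the same player, team, and date swap places, which is the rare case described in the conversation.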
Anyway, this is just like another thing when you're working with volunteers who don't know what's going on and they're editing the page and they can contribute content, you have to code for the editor and not code for like minimizing complexity, The editor's experience matters more than the cleanliness of your code base, and you just end up with these like absolute messes that make no sense whatsoever because the editor's experience matters and you always have to code to the editor. And Media Wiki is all about community, and the editor just becomes part of the software and part of the consideration of your code base, and it's very, very different from any other kind of development because they're like, the UX is just built so deeply into how you're developing. [00:53:33] Jeremy: if I am following correctly, when I, when I think of using SQL when you were first talking about cargo and you were talking about how you make your own tables, and I'm envisioning the, the columns and the rows and, it's very common for the primary key to either be auto incrementing or some kind of GUID But then if I understood correctly, I think what you were saying is that anytime an editor makes changes to the data, it regenerates the whole table. Is that did I get that right? [00:54:11] Megan: It regenerates all of the rows on that page. [00:54:14] Jeremy: and when you talk about this, these data pages, there's some kind of media wiki or cargo specific markup where people are filling in what is going to go into the rows. And the actual primary key that's in MySQL is not exposed anywhere when they're editing the data. [00:54:42] Megan: That's right [00:54:44] Jeremy: And so when you're talking about trying to come up with a primary key, um, I'm trying to, I guess I'm trying to picture [00:54:57] Megan: So usually I do page name underscore an auto increment. 
But then if people can rearrange the rows which they do because they wanna get the rows chronological, but some people just put it at the top of the page and then other people are like, oh my God, it's not chronological. And then they fix it and then other people are like, oh my God, you messed up the time zone. And then they rearrange it again. Then, I mean, normally I wouldn't care because I don't really care like what the primary key is. I just care that it exists. But then because I have it cached on these player pages, I really, really do care what the primary key is. And because I need the primary key to actually agree with what it is on the data page, because I'm actually joining these things together. and people aren't going to be updating the cache on the player page if they don't think that they edited the row because rearranging isn't actually editing and people aren't going to realize that. And again, this is burden of knowledge. People can't, I can't make them know that because they have to feel safe to make any edits. It's bad enough that they have to know that they have to click this button to update the cache after making an edit in the first place. so, the auto increment isn't enough, so it has to be like an auto increment, but only within the set of rows that incorporate that one player. And then rearranging is pretty safe because they'd have to rearrange two pieces of news, including the same player. And that's really unlikely to happen. It's really unlikely that someone's going to flip the order of two pieces of news that involve the same player without realizing that they're actually are editing that single player except maybe they are. So then I include the team in that also. So they'd have to rearrange two pieces of news, including the same player and the same team. And that's like unlikely to happen in the first place. And then like, maybe a mistake happens like once a year. 
And at the end of the day, the thing that really saves us is that we're a wiki. We're not an official source. And so if we have a mistake once a year, like no one cares really. So we're not going for like five nines or anything. We're going for like, you know, two (laughs). Um, so [00:57:28] Jeremy: so [00:57:28] Megan: We were having like mistakes constantly until I added player and team and date to the set of things that I was auto incrementing against. And once I got all of those, it was pretty stable. [00:57:42] Jeremy: And for the caching part, so when you're making a cargo query or a SQL query on one page and it needs to join on or get data from another page, it goes to this cache that you have instead of going directly to the actual table in the database. And the only way to get the right data is for the editor to click this button on the website that tells it to update the cache. Did I get that right? [00:58:23] Megan: Not quite. So it, well, or yes, you did sort of, it goes to the actual table. The issue here is that the table was last updated the last time that a page was saved. And the last time the data got saved was the last time that the page that contains the parser function that generates those rows got saved. So, let me say that again. So, some of the data is being saved from the data page where the users manually enter it, and that's fine because the only time that gets updated is when the users manually enter it and then the page gets saved. But then these tenures tables are stored by my Lua code on the player pages, and those aren't going to get updated unless the player page gets blank edited or null edited, or a save action happens from the player page. And so the way to make an edit happen from the player page is either to manually go there and click edit, and then click save, which is called a blank edit because you didn't do anything but you pressed save, or to use my JavaScript gadget, which is clicking a button from the data page that just basically does that for you using the API. And then that's going to update the database table with the information, hey, the primary key changed, because that's where the cargo parser function that writes to the database is physically located in the wiki: one of them is on the data page and one of them is on the player page. So you get this disconnect in the cache where it's on two different pages and so you have to press a save action in both of them before the table is consistent again.
[01:00:31] Jeremy: Okay. So this is really all about the tenure table, which the editor will never modify directly. You need your code on the data page and the player's page to run, to update the tenure table? [01:00:55] Megan: Yeah, exactly. [01:00:57] Jeremy: Yeah, it's totally hidden that this exists to the editor, but it's something that, that you as the person who put this all together, um, have to always be aware of, yeah. [01:01:11] Megan: Right. So there was just so many things like this, where you just had to press this one button. I call it refresh overview because originally it was on a tournament page and you had to press the refresh overview button to purge the cache on the overview page of the tournament after editing the data, and you would refresh overview to deal with this cache lag. And everyone knew you have to refresh overview, otherwise none of your data entry is gonna like, be worth anything because the cache is just gonna lag. But every editor learned, like if there's a refresh overview button, make sure you press the refresh overview button, otherwise nothing's gonna happen. Um, and there is just like tons of these littered across the Wiki.
and like to most people, it just like, looks like a simple little button, but like so many things happen when you press this button. so it is, it is very important. [01:02:10] Jeremy: Are there, no ways inside of media wiki to if somebody edits one page, for example, to force it to go and, do, I forget what you called it, like a blank save or blank edit on another page? [01:02:27] Megan: So that wouldn't even really work because, we had 11,000 player pages. And you don't know which one the user just edited. so it, it's unclear to MediaWiki what just happened when the user just edited some part of the data page. and like the whole point here is that I can't even blank edit every single player page that the data page links to because the data page probably links to, I don't know, 200 different player pages. So I wanna link, I wanna blank it like the five that this one news line links to. so I do that, through like HTML attributes, in the JavaScript, [01:03:14] Jeremy: Oh, so that's why you're using JavaScript so that you can tell what the person edited because there isn't really a way to know natively in, in MediaWiki. what just changed? [01:03:30] Megan: there's like a diff so I could, like, MediaWiki knows the characters that got changed, but it doesn't really know like semantically what happened. So it doesn't know, like, oh, a link to this just got edited and especially because, I mean it's like templates that got edited, not really like the final HTML or anything. So Media Wiki has no idea what's going on. so yeah, so the JavaScript, uh, looks at the HTML attributes and then runs a couple API queries, and then the blank edits happen and then a couple purges after that so that the cache gets purged after the blank edit. [01:04:08] Jeremy: Yeah. So it, it seems like on these Wiki pages, you have the html, you have the CSS you have the ability to describe these data pages, which I, I guess in the end, end up being rows in in SQL. 
And then finally you have JavaScript. So it kind of seems like you can do almost everything in the context of a a Wiki page. You have so many, so many of these tools at your, at your disposal. [01:04:45] Megan: Yeah. Except write es6 code. [01:04:48] Jeremy: Oh, still, still only es5. [01:04:52] Megan: Yeah, [01:04:52] Jeremy: Oh no. do, do you know if that's something that they are considering changing or [01:05:01] Megan: There's a Phabricator ticket open. [01:05:05] Jeremy: How, um, how, how many years? [01:05:06] Megan: It has a lot of comments, oh a lot of years. I think it's since like 2014 or something [01:05:14] Jeremy: Oh yeah. I, I guess the, the one maybe, well now now the browsers all, all support es6, but I, I guess one of the things, it sounds like media wiki, maybe side stepped is the whole, front end ecosystem in, in terms of node packages and build tools and things like that. is, is that right? It's basically you can write JavaScript and there, yeah, [01:05:47] Megan: You can even write jQuery. [01:05:49] Jeremy: Oh, okay. That's built in as well. [01:05:52] Megan: Yeah .So I have to admit, like my, my front end knowledge is like a decade out of date or something because it's like what MediaWiki can do and there's like this entire ecosystem out there that I just like, don't have access to. And so I like barely know about. So I have this like side project that uses React that I've like, kind of sort of been working on. And so like I know this tiny little bit of react and I'm like, why? Why doesn't MediaWiki do this? Um, they are adding Vue support. So in theory I'll get to learn vue so that'll be fun. [01:06:38] Jeremy: So I'm, I'm curious, just from the limited experience you've had, outside of, MediaWiki, are, are there like specific things, uh, in your experience working with React where you're, you really wish you had in inside of Media Wiki? 
[01:06:55] Megan: Well, really the big thing is like es6, like I really wish we could use arrow functions , like that would be fantastic. Being able to build components would be really nice. Yeah, we can't do that. [01:07:09] Jeremy: I, I suppose you, you've touched a little bit on performance before, but I, I guess that's one thing about Wikis is that, putting what's happening in the back end, aside the, the front end experience of Wikis, they, they feel pretty consistent since they're generally mostly server rendered. And the actual JavaScript is, is pretty light, at least from, from Wikis I've seen. [01:07:40] Megan: Yeah. I mean you can add as much JavaScript as you want, so I guess it depends on what the users decide to do. But it's, it's definitely true that wikis tend to load faster than some websites that I've seen. [01:07:54] Jeremy: Yeah, I mean, I guess when you think of a wiki, it's, you're there cuz you wanna get specific information and so the goal is not to necessarily reproduce like some crazy complex app or something. It's, It's, to get you the, the, information. Yeah. [01:08:14] Megan: Yeah. No, that's actually one thing that I really like about Wikis also is that you don't have the pressure to make them look nice. I know that some people are gonna hear that and just like, totally cringe and be like, oh my God, what is she saying? ? Um, but it's actually really true. Like there's an aesthetic that Wikis and Media Wiki in particular have, and you kind of stick to that. And within that aesthetic, I mean, you make them look as nice as you can. Um, and you certainly don't wanna like, make them deliberately ugly, but there's not a pressure to like go over the top with like marketing and branding and like, you know, you, you just make them look reasonably nice. And then the focus is on the information and the focus is on making the information as easy to understand as possible. 
And a wiki that looks really nice is a wiki that's very understandable and very intuitive, and one where you. I mean, one, that the information is the joy and, you know, not, not the presentation, I guess. So it's like the presentation of the information instead of the presentation of the brand. so I, I really appreciate that about wikis. [01:09:30] Jeremy: Yeah, that's a good point about the aesthetics in the sense of like, they have a certain look and yeah, maybe it's an authoritative look, , which, uh, is interesting cuz it's, like a, a wiki that I'll, I'll commonly go to for example, is there's the, the PC gaming Wiki. And when you look at how it's styled, it feels like very dated or it doesn't look like, I guess you could say normal webpages, but it's very much in line with what you expect a wiki to look like. So it's, it's interesting how they have that, shared aesthetic, I guess. [01:10:13] Megan: Yeah. yeah. No, I really like it. The Wiki experience, [01:10:18] Jeremy: We, we kind of touched on this near the beginning, but sometimes when. I would see wikis and, and projects like Leaguepedia I would kind of wonder, you know, what's the decision between or behind it being a wiki versus something being like a custom CMS in, in the case of Leaguepedia but, you know, talking to you about how it's so, like wikis are structured so that people can contribute. and then like you were saying, you have like this consistent look that brings the data to the user. Um, I actually, it gives me a better understanding of why so many people choose wikis as, as ways to present this information. [01:11:07] Megan: Yeah, a a lot of people have asked me over the years why, why MediaWiki when it always feels like I'm jumping through so many hoops. Um, I mean, when I just described the caching thing to you, and that's just like one of, I don't know, dozens of struggles that I've had where, MediaWiki has gotten in the way of what I need to do. 
Because really Leaguepedia is an entire software layer on top of MediaWiki, and so you might ask why. Why MediaWiki? Why not just build the software layer on top of something easier? And my answer is always, it's about the community. MediaWiki lends itself so well to community and people enjoy contributing to wikis. Wikis are just kind of synonymous with community, and they always have been. And Wikipedia sort of set the example when they launched, and it's sort of always been that way. And, you know, I feel like I'm a part of a community when I say a Wiki. And if it were just a custom site that had the ability to contribute to it, you know, it just feels like it's not the same. [01:12:33] Jeremy: I think just even seeing the edit button on Wikis is such a different experience than having the expectation, well, I guess in the case of Leaguepedia, you do have to create an account, but even without creating the account, you can still click edit and you can look at the source and you can see how all this information, or a lot of it, how it got filled in. And I feel like it's kind of more similar to the earlier days of webpages where people could right click a site and click view source and then look at the HTML and the CSS, and kind of see how it was put together, versus now with a lot of sites, the, the code has been minified or there's build tools involved so that when you look at view source on websites, it just looks crazy and you're not sure what's going on. So I, I, I feel like wikis in some ways are, kind of closer to the, the spirit of, like the earlier HTML sites. Yeah. [01:13:46] Megan: And the knowledge transfers too. If you've ever edited Wikipedia, then you know that like open bracket, open bracket, closed bracket, closed bracket is how you link a page.
And that knowledge transfers to, admittedly maybe a little bit less so for Leaguepedia, since there, you need to know how all the templates work and there's not so much direct source editing. It's mostly like clicking the CharInsert prefills. But there's still a lot of cross knowledge transfer, if you've edited one wiki and then change to editing another. And then it goes the other way too. If you edit Leaguepedia, then you want to go edit the Zelda Wiki, that knowledge will transfer. [01:14:38] Jeremy: And, and talking about the community and the editors. I, I imagine on Wikipedia, most of the people editing are volunteers. Is it the same with Leaguepedia in your experience? [01:14:55] Megan: Um, yeah, so I was contracted, uh, or I was not contracted. My LLC was contracted and then I subcontracted. Um, it changed a bit over the years, um, as people left. Uh, so at first I subcontracted quite a few people. Um, and then I guess, as you can imagine, there was a lot more data entry that had to be done at the start. And less had to be done later on, as I expanded the code base so that it was more a single source of truth, and less stuff had to be duplicated. And I guess it probably became a lot more fun too, uh, when you didn't have to enter the same thing multiple times. But, uh, a bunch of people, uh, moved on over the years. And so by the end I was only subcontracting three people. Um, and everyone else was volunteer. [01:15:55] Jeremy: And and the people that you were subcontracting, that was for only data entry, or was that also for the actual code? [01:16:05] Megan: No, that wasn't for data entry at all. Um, and actually that was for all of my wikis, uh, because I was managing like all of the eSports wikis. One of them was for Call of Duty and Halo, uh, to manage those wikis. One of them was for, uh, just the Call of Duty Wiki. And then one of them was for Leaguepedia to do staff onboarding. [01:16:28] Jeremy: Oh, okay.
So this is, um, this is to help people contribute to all of these wikis. That's, that's what these, these, uh, subcontractors were focusing on. [01:16:41] Megan: Yeah, [01:16:44] Jeremy: I guess that, that makes sense when we've been talking about the complexity, uh, what's behind Leaguepedia, but there's a lot that the editors, it sounds like, have to learn as well to be able to know basically where to go and how to fill everything out and Yeah. [01:17:08] Megan: So basically, for the major leagues in League of Legends, um, we required some onboarding before you could cover them because we wanted results entered within like, about one to four minutes of the games ending. Um, so that was like for North America, Korea, China, Europe. And for some regions, like the really minor ones, like second tier leagues in, like for example the national leagues in Europe, second tier or something, we kind of didn't really care if it was entered immediately. And so anyone who wanted to enter could just enter, uh, information. So we did want the experience to be easy enough that people could figure it out on their own, and we didn't really, uh, require onboarding for that. There was like a gradation of how much onboarding we required. But typically we tried to give training as much as we could. Um, it, it was sort of dependent on how fast people expected the results and how available someone was to provide training. So like for Latin America, there was like a lot of people who were available to provide trainings. So even like the more minor leagues, people got training there, for example. But yeah, it was, it was very collaborative. And a lot of people, a lot of people got involved, so, yeah.
[01:18:50] Jeremy: And in terms of having this expectation of having the results in, in just a few minutes and things like that, are these people volunteers where they would volunteer for a slot and then there was just this expectation? Or how did that work? [01:19:09] Megan: Yeah. So, um, a lot of people volunteered with us as resume experience to try and get jobs in eSports. Um, and some people just volunteered with us because they wanted to give back to the community because, we're like a really valuable resource for the community. And I mean, without volunteer contribution we wouldn't have existed. So it was like understood that we needed people's help in order to continue existing. So some people volunteered for that reason. Some people just found it fun to help out. So there's like a range of reasons to contribute. [01:19:46] Jeremy: And, and you were talking about how there's some people who they, they really need this data in, in that short time span. You know, who, who are we talking about here? Are these like commentators? Are these journalists? I'm just curious who's, who's, looking for this in such a short time span [01:20:06] Megan: Well, fans would look for the data immediately. Sometimes if we entered a wrong result, someone would like come into our discord and be like, Hey, the result of this is wrong, you know, within seconds of the wrong result going up. So we knew that people were like looking at the Wiki, like immediately. But everyone used the data: commentators at Riot, journalists, fans, yeah. Like everyone is using it. [01:20:33] Jeremy: And since it's so important to, like you're mentioning Riot or the tournament organizers, things like that. What kind of relationship do you have with them? Do they provide any kind of support or is it mostly just, it's something they just use [01:20:54] Megan: I, so there is, um, I definitely talk to people at Riot pretty regularly, and
we got like resources from them, so, they'd give us player photos to put up, and like answers to questions and stuff. but for the most part it was just something that they'd use. [01:21:15] Jeremy: and, and so like now that unfortunately your, your contract wasn't renewed with Leaguepedia like where do you, I guess see the, the future of Leaguepedia but, but also all these other eSports wikis going, is this something that's gonna be more just community driven or, I'm, I guess I'm trying to understand, you know, how this, the gap gets filled. [01:21:47] Megan: Yeah, I'm, I'm not sure. Um, they're doing an update to Media Wiki 1.39 next year. we'll see if stuff majorly breaks during that. probably no one's gonna be able to fix it if it does. Who knows? (laughs) um, yeah, I don't know. There's another site that hosts, uh, eSports wikis called Liquipedia um, so it's possible that they'll adopt some of the smaller wikis. Um, I think it's pretty unlikely that they'll want to take Leaguepedia, um, just because it's too complicated of a wiki. but yeah, I, I, I don't know. [01:22:31] Jeremy: it kind of feels like one of these things where I guess whoever is in charge of making these decisions may not fully understand the implications or, or what it takes to, to run such a, a large wiki. yeah, I guess it'll be interesting to, to see if it ends up being like you said, one, one big mess. [01:22:58] Megan: Yeah. I got them through the 1.37 upgrade by submitting like three or four patches to cargo, during that time and discovering that the patches needed to be made prior to the upgrade happening. So, you know, I don't think that they're going to update cargo during the 1.39 upgrade and it's cargo changes that have the biggest disruption. So they're probably safe from that. and, and I don't think 1.39 has any big parser changes. I think that's later, but yeah, there'll probably still be like a bunch of CSS changes and who knows if anyone's going to fix the follow up from that. 
So, yeah, we'll see. [01:23:46] Jeremy: Yeah, that's, um, that's kind of interesting to know too that, these upgrades to MediaWiki and, and to extensions like cargo, that they change so significantly that they require pull requests. Is that, like, is that pretty common in terms of when you do an upgrade of a MediaWiki that there there are these individual things you need to do and it's not just update package. [01:24:18] Megan: well the cargo change was the first time that we had upgraded in like two and a half years or something. so that one in particular, I think it was expected that that one wasn't going to go so smoothly. generally updates go not that badly. I say with rising intonation, (laughs) , um, if you keep up to date with stuff, it's generally pretty okay. Cargo is probably one of the less stable ones just because it's a relatively small contributor base, and so kind of crazy things happen sometimes. Um, Semantic Media Wiki is a lot more stable. Uh, but then the downside is that if you have a feature request for SMW it's harder to get pushed through. But cargo still changes a lot. The big change with cargo, like the big problematic change with cargo was a tiny bug fix that just so happened to change every empty string value to nil in Lua, You know, no big deal or anything, whatever. [01:25:42] Jeremy: That, that's, uh, that's a good one right there. [01:25:47] Megan: I mean, I I don't know how no one noticed this for like a year and a half or something man, It was a tiny bug fix. [01:26:02] Jeremy: Mm. [01:26:03] Megan: Like it was checked in as a bug fix and it really was a bug fix. I tracked down the guy who made the patch and I was like, I can't reproduce this bug. Can I just revert it? And he was like, I can't reproduce it either. [01:26:21] Jeremy: Oh, wow. (laughs) [01:26:23] Megan: And I was like, well, that's great. And I ended up just leaving it in, but then changing them back to empty string. 
Um, when the extension was first released, null database values were sent to Lua as empty string due to a bug in the first place. Because null databases, null database values should just be nil in Lua. Like, come on here. But they were sent as empty string. And so for like five years, people were writing code, assuming that you would never get a nil value for anything that you requested from the database. So you can't make a breaking change like that without putting a config value that defaults to true. [01:27:10] Jeremy: Yeah. [01:27:11] Megan: So I added a legacy, nil value, legacy Lua, nil value as empty string config value or something, and, defaulted it to true and wrote in the documentation that it was recommended that you set it to false. Or maybe I defaulted it to false. I, I don't remember what I set the default to, but I wrote in the documentation something about how you should, if possible, set this to false, but if you have a large code base, you probably need this. And then we set it platform-wide to true, and that's the story of how I saved the shit out of our 1.37 upgrade this year. [01:27:57] Jeremy: Oh yeah, that's, um, that's a rough one. Changing, changing people's data is very scary. [01:28:05] Megan: Yeah, I mean, it was totally unintended. And I don't know how no one noticed this either. I mean, I guess the answer is that not very many people do the kind of stuff that I do working with Lua and Cargo in this much depth. But a fairly significant number of Fandom wikis do, and this would've just been an absolute disaster. And the semi ironic thing is that, I, I have a wrapper that fixes the initial Cargo bug where I detect every empty string value and then cast it to nil after I get my data from Cargo. So I would've been completely unaffected by this. And my wiki was the primary testing wiki for Cargo on the 1.37 patch. So we wouldn't have caught this, it would've gone to live. [01:28:56] Jeremy: Wow. 
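Editor's note: the compatibility switch and the wrapper Megan describes can be sketched roughly as follows. This is a minimal illustration in Python rather than the actual Cargo or Lua code. The flag name `LEGACY_NIL_AS_EMPTY_STRING` and both function names are hypothetical, and Python's `None` stands in for a SQL NULL or a Lua nil.

```python
from typing import Dict, Optional

# Hypothetical flag, loosely modeled on the "legacy Lua nil value as
# empty string" setting described in the conversation. It defaults to
# True so existing template code keeps seeing "" instead of a missing value.
LEGACY_NIL_AS_EMPTY_STRING = True


def export_value(db_value: Optional[str],
                 legacy: bool = LEGACY_NIL_AS_EMPTY_STRING) -> Optional[str]:
    """Convert one database value before handing it to template code.

    With the legacy flag on, a NULL (None) becomes an empty string,
    which is the old behavior that years of code came to rely on.
    With the flag off, NULL stays None.
    """
    if db_value is None and legacy:
        return ""
    return db_value


def normalize_row(row: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Wrapper in the spirit of the one Megan mentions.

    Treats every empty string as missing, so the calling code behaves
    the same whichever way the flag is set.
    """
    return {key: (None if value == "" else value) for key, value in row.items()}
```

A wiki with a large existing code base would leave the flag on, while code that already goes through a wrapper like `normalize_row` is unaffected either way, which is why the change would not have shown up on her test wiki.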
[01:28:58] Megan: So we got extremely lucky that I found out about this ahead of time prior to us QAing and fixed this bug because it would've gone straight to live. [01:29:10] Jeremy: that's wild yeah, it's just like kind of catastrophic, right? It's like, if it happens, I feel like whoever is managing the wikis is gonna be very confused. Like, why, why is everything broken? I don't, I don't understand. [01:29:25] Megan: Right? And this is like so much broken stuff that it's like very difficult to track down what's going on. I actually had a lot of trouble figuring out what was wrong in the code base. Causing this error. And I submitted an incorrect patch at first, and then the incorrect patch got merged, and then I had to like roll back the incorrect patch. And then I got a merge conflict on the incorrect patch. And it, it was, it was bad. It took me three patches to get this right. Um, But eventually, eventually I got there. [01:30:02] Jeremy: Yeah. that's software, I guess , [01:30:06] Megan: Yeah. [01:30:07] Jeremy: the, the, the thing you were trying to avoid all these years. [01:30:10] Megan: Yeah, [01:30:13] Jeremy: you're in it now. [01:30:14] Megan: It really was, that was actually the reason that I went in, I got into the Wiki in the first place, um, and into e-sports. Uh, was that after Caltech, I wanted to like get away from STEM altogether. I was like, I've had enough of this. Caltech was too much, get me away, (laughs) . And I wanted to do like event management or tournament organization or something. And so I wanted to work in eSports. and that was like my life plan. And I wanted nothing to do with STEM and I didn't wanna work in software. I didn't wanna do math. I was a math major. I didn't wanna do math. I didn't wanna go to grad school. I wanted absolutely nothing to do with this. So that was my plan. And somehow I stumbled and ended up in software. [01:31:02] Jeremy: Well, at least you got the eSports part. 
[01:31:05] Megan: Yeah, so that, that worked out. And really for the first couple of years I was doing like community management and social media and stuff. Um, and I did stay away from software for about the first two years, so it lasted about two whole years. [01:31:24] Jeremy: What ended up pulling you in? [01:31:26] Megan: Um, actually, so when, when I signed back with Gamepedia, our developer just sort of disappeared and I was like, well, shit, I guess that's me now. (laughs) So we had someone else writing all of our templates for a couple years, so I was able to just like make a lot of feature requests. and I'm very good at making feature requests. If, if I ever have like, access to someone else who's writing code for me, I'm like, fantastic at just making a ton of like really minor feature requests, and just like taking off all of their time with like a billion tiny QA issues. [01:32:09] Jeremy: You you are the backlog, [01:32:12] Megan: Yeah, I really, um, I, there's another OSS project that I've been working on, um, which is a Discord bot and. We, our, our backlog just expands and expands and [01:32:26] Jeremy: Oh yeah. You know what, I, I think I did look at that. I, I looked at the issues and, usually when you look at a, the issues for an open source project, it's, it's all these people using it, right? That are like, uh, there's this thing I want, but then I looked and it was all, it was all you. So I guess that's okay cuz you're, you're in the position to do something about it. [01:32:47] Megan: The, the part that you don't know is that I'm like constantly begging other people to open tickets too. [01:32:53] Jeremy: Really? [01:32:55] Megan: Yeah. Like constantly. I'm like, guys, it can't just be me opening tickets all the time. [01:33:04] Jeremy: Yeah. Yeah. If it was, if it was someone else's project, I would be like, oh, this is, uh, . I don't know about this. But when it's your own, you know, okay. 
It's, it's basically like, um, it's like a roadmap I guess. [01:33:20] Megan: Yeah. Some of them have been open for, for quite a long time, but actually a couple months ago we closed one that had been open since, I think like April 2020. [01:33:31] Jeremy: Oh, nice. [01:33:32] Megan: That was quite an event. [01:33:34] Jeremy: Yeah, it's open source, so you can do whatever you want, right. (laughs) [01:33:41] Megan: We even have a couple good first issues that are actually good first issues. [01:33:46] Jeremy: Yeah. Not, not getting any takers? [01:33:49] Megan: No, we sometimes do. Yeah. I actually, we, so some of them are like semi-important features, but I like feel really bad if I ever do the good first issues myself because like someone else could do them. And so like, if it's like a one line ticket, I would just, I feel so much guilt for doing it myself. [01:34:09] Jeremy: Oh, I see what you mean. [01:34:10] Megan: I'm like, Yeah. So I just like, I can't do them. But then I'm like, oh, but this is really important. But then I'm like, oh, but we might get someone else who, and I just, I never know if I should just take the plunge and do it myself, so. [01:34:22] Jeremy: Yeah. No, that's, that's a good point. It's, it's like, like these opportunities, right. For people to, and it could, it could make a big difference for them. And then for you, it's like, I could do this in 10 minutes or whatever. Uh, I, I guess it all depends on how annoyed you are by the thing not being there, [01:34:43] Megan: Right. I know because my entire background is like community and getting new people to onboard and like the potential new contributor is worth like 10 times, like, the one PR that I can make. So I should just like absolutely leave it open for the next year. [01:35:02] Jeremy: Yeah. Yeah, no, that's a, that's a good way of, of looking at it. 
I mean, I I think when you talk about open source or, or even wikis, that that sort of community aspect is, is so, so important, right? Because if it's just, if it's just one person, then I mean, it kind of, it lives or dies with the one person, right? It, it's, it's so different when you actually get a bunch of people involved. And I think that's something like a lot of, a lot of projects struggle with [01:35:38] Megan: Yeah. That's actually, as much as I'm like bitter about the fact that I was let go from my own project, I think the thing that I should, in a sense be the most proud of is that I grew my project to a place where that was able to happen in a sense. Like, I built this and I built it to a place where it was sustainable. Although, we'll see how sustainable it was, (laughs) . but like I'm not needed for the day to day. and that means that like I successfully built a community. [01:36:18] Jeremy: Yeah, no, you should be really proud about that because it's, it's not only like the, the code, right? Like over the years it sounds like you gradually made it easier and easier to contribute, but then also being able to get all these volunteers together and build a community on the discord and, and elsewhere. Yeah, no, I think that's, I think that's really great to be able to, to do, do something like that. [01:36:50] Megan: Thanks. [01:36:53] Jeremy: I think that's, that's a good place to, to wrap up, but is there anything else you wanted to, to mention or do you want to tell people where to check out, uh, what you're up to? [01:37:05] Megan: Yeah, I, I have a blog that's a little bit inactive for the past couple months, because I recently had surgery, but I, I've been saying for like five weeks that I will start, posting there again. So hopefully that happens soon. Uh, but it's river.me, and so you can check that out. [01:37:27] Jeremy: Cool. Well, yeah, Megan, I just wanna say thanks for, for taking the time. This was, this was really interesting. 
the world of wikis is like this, it's like a really big part of the internet that, um, I use wikis, but I, I've never really understood kind of what's going on in, in terms of the actual technology and the community. so so thank you for, for sharing that. [01:37:53] Megan: Yeah. Thanks so much for having me.

Jan 2, 2023 • 1h 51min

Victor Adossi on Yak Shaving

Victor is a software consultant in Tokyo who describes himself as a yak shaver. He writes on his blog at vadosware and curates Awesome F/OSS, a mailing list of open source products. He's also a contributor to the Open Core Ventures blog. Before our conversation Victor wrote a structured summary of how he works on projects. I recommend checking that out in addition to the episode. Topics covered: Most people should use Dokku or CapRover But he uses Kubernetes anyways Hosting a Database in Kubernetes Learning technology You don't really know a thing until something goes wrong History of Frontend Development Context from lower layers of the stack and historical projects Good project pages have comparisons to other products Choosing technologies Language choice affects maintainability Knowing an ecosystem Victor's preferred stack Technology bake offs Posting findings means you get free corrections Why people use medium instead of personal sites Victor VADOSWARE - Blog How Victor works on Projects - Companion post for this episode Awesome FOSS - Curated list of OSS projects NimbusWS - Hosted OSS built on top of budget cloud providers Unvalidated Ideas - Startup ideas for side project inspiration PodcastSaver - Podcast index that allows you to choose Postgres or MeiliSearch and compare performance and results of each Victor's preferred stack Docker - Containers Kubernetes - Container provisioning (Though at the beginning of the episode he suggests Dokku for single server or CapRover for multiple) TypeScript - JavaScript with syntax for types. Victor's default choice. Rust - Language he uses if doing embedded work, performance is critical, or more correctness is desired Haskell - Language he uses if correctness and type system is the most important for the project Postgresql - General purpose database that's good enough for most use cases including full text search. KeyDB - Redis compatible database for caching. Acquired by Snap and then made open source. 
Victor uses it over Redis because it is multi threaded and supports flash storage without a Redis Enterprise license. Pulumi - Provision infrastructure with the languages you're already using instead of a specialized one or YAML Svelte and SvelteKit - Preferred frontend stack. Previously used Nuxt. Search engines Postgres Full Text Search vs the rest Optimizing Postgres Text Search with Trigrams OpenSearch - Amazon's fork of Elasticsearch typesense meilisearch sonic Quickwit JavaScript build tools Babel SWC Webpack esbuild parcel Vite Turbopack JavaScript frameworks React Vue Svelte Ember Frameworks built on top of frameworks Next - React Nuxt - Vue SvelteKit - Svelte Astro - Multiple Historical JavaScript tools and frameworks Underscore jQuery MooTools Backbone AngularJS Knockout Aurelia GWT Bower - Frontend package manager Grunt - Task runner Gulp - Task runner Related Links Dokku - Open source single-host alternative to Heroku Cloud Native Buildpacks - Buildpacks created by Heroku and Pivotal and used by Dokku CapRover - An open source PaaS-like abstraction built on top of Docker Swarm Kelsey Hightower's tweet about being cautious about running databases on Kubernetes Settling the Myth of Transparent HugePages for Databases Kubernetes Container Storage Interface (CSI) Kubernetes Local Persistent Volumes Longhorn - Distributed block storage for Kubernetes Postgres docs Postgres TOAST Everything I've seen on optimizing Postgres on ZFS Kubernetes Workload Resources Kubernetes Network Plugins Kubernetes Ingress Traefik Kubernetes the Hard Way (Setting up a cluster in a way that optimizes for learning) How does TLS work Let's Encrypt Cert manager for Kubernetes Choose Boring Technology A Linux user's guide to Logical Volume Management Docker networking overview Kubernetes Scheduler Tauri - Build desktop applications with web technology and Rust ripgrep - CLI tool to recursively search directory for a regex pattern (Meant to be a rust replacement for grep) 
angle-grinder / ag - CLI tool to parse and process log files written in rust Object.observe ECMAScript Proposal to be Withdrawn Ruby on Rails - Ruby web framework Django - Python web framework Laravel - PHP web framework Adonis - JavaScript NestJS - JavaScript What is a NullPointerException, and how do I fix it? Mastodon Clap - CLI argument parser for Rust AWS CDK - Provision AWS infrastructure using programming languages Terraform - Provision infrastructure with terraform language URL canonicalization of duplicate pages and the use of the canonical tag - Used by dev.to to send google traffic to the original blogpost instead of dev.to Transcript You can help edit this transcript on GitHub. [00:00:00] Jeremy: This episode, I talk to Victor Adossi who describes himself as a yak shaver. Someone who likes trying a whole bunch of different technologies, seeing the different options. We talk about what he uses, the evolution of front end development, and his various projects. Talking to just different people it's always good to get where they're coming from because something that works for Google at their scale is going to be different than what you're doing with one of your smaller projects. [00:00:31] Victor: Yeah, the context. Of course in direct conflict with that statement, I definitely use Google technology despite not needing to at all right? Like, you know, 99% of people who are doing like people like to call it indiehacking or building small products could probably get by with just Dokku. If you know Dokku or like CapRover. Are two projects that'll be like, Oh, you can just push your code here, we'll build it up like a little mini Heroku PaaS thing and just go on one big server, right? Like 99% of the people could just use that. But of course I'm not doing that. So I'm a bit of a hypocrite in that sense. I know what I should be doing, but I'm not doing that. I am writing a Kubernetes cluster with like five nodes for no reason. 
Uh, yeah, I dunno, people don't normally count the controllers. [00:01:24] Jeremy: Dokku and CapRover, I think those are where it's supposed to create a heroku like experience I think it's based off of the heroku buildpacks right? At least Dokku is? [00:01:36] Victor: Yeah Buildpacks has actually been spun out into like a community thing so like pivotal and heroku, it's like buildpacks.io, they're trying to build a wider standard around it so that more people can get involved. And buildpacks are actually obviously fantastic as a technology and as a a process piece. There's not much else like them and you know, that's obvious from like Heroku's success and everything. I know Dokku uses that. I don't know that Caprover does, but I haven't, I haven't really run Caprover that much. They, they probably do. Like at this point if you're going to support building from code, it seems silly to try and build your own buildpacks. Cause that's what you will do, eventually. So you might as well use what's there. Anyway, this is like just getting to like my personal opinions at this point, but like, if you think containers are a bad idea in 2022, You're wrong, you should, you should stop. Like you should, you should stop. Think about it. I mean, obviously there's not, um, I got a really great question at an interview once, which is, where are containers a bad idea? That's probably one of the best like recent interview questions I've ever gotten cause I was like, Oh yeah, I mean, like, you can't, it can't be perfect everywhere, right? Nothing's perfect everywhere. So it's like, where is it? Uh, and of course the answer was networking, right? (unintelligible) So if you need absolute performance, but like for just about everything else. Containers are kind of it at this point. Like, time has born it out, I think. So yeah, I always just like bias at taking containers at this point. 
So I'm probably more of a CapRover person than a Dokku person, even though I have not used, I don't use CapRover. [00:03:09] Jeremy: Well, like something that I've heard with containers, and maybe it's changed recently, but, but something that was kind of holdout was when people would host a database sometimes they would oh we just don't wanna put this in a container and I wonder if like that matches with your thinking or if things have changed. [00:03:27] Victor: I am not a database administrator right like I read postgres docs and I read the, uh, the Postgres documentation, and I think I know a bit about postgres but I don't commit right like so and I also haven't, like, oh, managed X terabytes on one server that you are making sure never goes down kind of deal. But the stickiness for me, at least from when I've run, So I've done a lot of tests with like ZFS and Postgres and like, um, and also like just trying to figure out, and I run Postgres in Kubernetes of course, like on my cluster and a lot of the stuff I found around is, is like fiddly kernel things like sort of base kernel settings that you need to have set. Like, you know, stuff like should you be using transparent huge pages, like stuff like that. But once you have that settled. Containers are just processes with name spacing and resource control, right? Like, that's it. there are some other ins and outs, but for the most part, if you're fine running a process, so people ran processes, right? And they were just completely like unprotected. Then people made users for the processes and they limited the users and ran the processes, right? Then the next step is now you can run a process and then do the limiting the name spaces in cgroups dynamically. Like there, there's, there's sort of not a humongous difference, unless you're hitting something very specific. Uh, but yeah, databases have been a point of contention, but I think, Kelsey Hightower had that tweet yeah. 
That was like, um, don't run databases in Kubernetes. And I think he called it back. [00:04:56] Victor: I don't know, but I, I know that was uh, was one of those things that people were really unsure about at first, but then after people sort of like felt it out, they were like, Oh, it's actually fine. Yeah. [00:05:06] Jeremy: Yeah I vaguely remember one of the concerns having to do with persistent storage. Like there were challenges with Kubernetes and needing to keep that storage around and I don't know if that's changed yeah or if that's still a concern. [00:05:18] Victor: Uh, I'd say that definitely has changed. Uh, and it was, it was a concern, depending on where you were. Mostly people who are running AKS or EKS or you know, all those other managed Kubernetes, they're just using EBS or like whatever storage provider is like offering for storage. Most of those people don't actually have that much of a problem with, storage in general. Now, high performance storage is obviously different, right? So like, so you'll, you're gonna have to start doing manual, like local volume management and stuff like that. it was a problem, because obviously CSI (Kubernetes Container Storage Interface) didn't exist for some period of time, and like there was, it was hard to know what to do for if you were just running a Kubernetes cluster. I think a lot of people were just using local, first of all, local didn't even exist for a bit. Um, they were just using host path, right? And just like, Oh, it's on the disk somewhere. Where do we, we have to go get it right? Or we have to like, sort of manage that. So that was something most people weren't ready for, especially if you were just, if you weren't like sort of a, a, a traditional sysadmin and used to doing that stuff. And then of course local volumes came out, but I think they still had to be, um, pre-provisioned. So that's sysadmin stuff that most people, you know, maybe aren't, aren't necessarily ready for. 
Uh, and then most of the general solutions were slow. So like, I used Longhorn (https://longhorn.io) for a long time and Longhorn, Longhorn's great. And super easy to set up, but it can be slower and you can have some, like, delays in mount time. it wasn't ideal for, for most people. So yeah, I, overall it's true. Databases, Databases in Kubernetes were kind of fraught with peril for a while, but it wasn't for the reason that, it wasn't for the fundamental reason that Kubernetes was just wrong or like, it wasn't the reason most people think of, which is just like, Oh, you're gonna break your database. It's more like, running a database is hard and Kubernetes hasn't solved all the hard problems. Like, cuz that's what Kubernetes does. It basically solves a lot of problems in a very generic way. Right. So it just hadn't solved all those problems yet at this point. I think it's got decent answers on a lot of them. So I, I mean, I don't know. I I do it. Don't, don't take what I'm saying to your, you know, PM meeting or your standup meeting, uh, anyone who's listening. But it's more like if you could solve the problems with databases in the sense before. You could probably solve 'em on Kubernetes now with a good understanding of Kubernetes. Cause at the end of the day, it's all the same stuff. Just Kubernetes makes it a little easier to, uh, do it dynamically. [00:07:50] Jeremy: It sounds like you could do it before, but some of the, I guess the tools or the ways of doing persistent storage were not quite there yet, or they were difficult to use. And so that was why people at the start were like, Okay, maybe it's not a good idea, but, now maybe there's some established practices for how you should run a database in Kubernetes. And I, I suppose the other aspect too is that, like you were saying, Kubernetes is its own thing. You gotta learn Kubernetes and all its intricacies. And then running a database is also its own challenge. 
So if you stack the two of them together and, and the path was not really clear then maybe at the start it wasn't the best idea. Um, uh, if somebody was going to try it out now, was there like a specific resource you looked at or a specific path to where like okay this is is how I'm going to do it. [00:08:55] Victor: I'll just say what I normally recommend to everybody. Cause it depends on which path you wanna go right? If you wanna go down like running a database path first and figure that out, fill out that skill tree. Like go read the Postgres docs. Well, first of all, use Postgres. That's the first tip there. But like, read those documents. And obviously you don't have to understand everything. You won't understand everything. But knowing the big pieces and sort of letting your brain see the mention of like a whole bunch of things, like what is toast? Oh, you can do compression on columns. Like, you can do some, some things concurrently. Um, you know, what ALTER TABLE looks like. You get all that stuff kind of in your head. Um, and then I personally really believe in sort of learning by building and just like iterating. you won't get it right the first time. It's just like, it's not gonna happen. You're get, you can, you can get better the first time, right? By being really prepared and like, and leave yourself lots of outs, but you kind of have to like, get it out there. Do do your best to make sure that you can't fail, uh, catastrophically, right? So this is like, goes back to that decision to like use ZFS as the bottom of this I'm just like, All right, well, I, I'm not a file systems expert, but if I. I could delegate some of that, you know, some of that, I can get some of that knowledge from someone else. Um, and I can make it easier for me to not fail catastrophically. For the database side, actually read documentation on Postgres or the whatever database you're going to use, make sure you at least understand that. 
Then start running it like locally or whatever. Again, Docker use, use Docker locally. It's, it's, it's fine. and then, you know, sort of graduate to running sort of more progressively, more complicated versions. what I would say for the Kubernetes side is actually similar. the Kubernetes docs are really good. they're very large. but they're good. So you can actually go through and know all the, like, workload, workload resources, know, like what a config map is, what a secret is, right? Like what etcd is doing in this whole situation. you know, what a kublet is versus an API server, right? Like the, the general stuff, like if you go through all that, you should have like a whole bunch of ideas at least floating around in your head. And then once you try and start setting up a server, they will all start to pop up again, right? And they'll all start to like, you, like, Oh, okay, I need a CNI (Container Networking) plugin because something needs to make the services available, right? Or something needs to power the ingress, right? Like, if I wanna be able to get traffic, I need an ingress object. But what listens, what does that, what makes that ingress object do anything? Oh, it's an ingress controller. nginx, you know, almost everyone's heard of nginx, so they're like, okay. Um, nginx, has an ingress control. Actually there's, there used to be two, I assume there's still two, but there's like one that's maintained by Kubernetes, one that's maintained by nginx, the company or whatever. I use traefik, it's fantastic. but yeah, so I think those things kind of fall out and that is almost always my first way to explain it and to start building. And tinkering iteratively. So like, read the documentation, get a good first grasp of it, and then start building yourself because you'll, you'll get way more questions that way. Like, you'll ask way more questions, you won't be able to make progress. 
Uh, and then of course you can, you know, hop into slacks or like start looking around and, and searching on the internet. oh, one of the things that really helped me out early learning Kubernetes was, Kelsey Hightower's, um, learn Kubernetes the hard way. I'm also a big believer in doing things the hard way, at least knowing what you're choosing to not know, right? distributing file system, Deltas, right? Or like changes to a file system over the network is not a new problem. Other people have solved it. There's a lot of complexity there. but if you at least know the sort of surface level of what the thing does and what it's supposed to do and how it's supposed to do it, you can make a decision on, Oh, how deep am I going to go? Right? To prevent yourself from like, making a mistake or going too deep in the rabbit hole. If you have an idea of the sort of ecosystem and especially like, Oh, here, like the basics of how I can use this thing, that's generally very good. And doing things the hard way is a great way to get a, a feel for that, right? Cause if you take some chunk and like, you know, the first level of doing things the hard way, uh, or, you know, Kelsey Hightower's guide is like, get a machine, right? Like, so, like, if you somehow were like, Oh, I wanna run a Kubernetes cluster. but, you know, I don't want use necessarily EKS and you wanna learn it the hard way. You have to go get a machine, right? If you, if you're not familiar, if you run on Heroku the whole time, like you didn't manage your own machines, you gotta go like, figure out EC2, right? Or, I personally use, hetzner I love hetzner, so you have to go figure out hetzner, digital ocean, whatever. Right. And then the next thing's like, you know, the guide's changed a lot, and I haven't, I haven't looked at it in like, in years, actually a while since I, since I've sort of been, I guess living it, but it's, it's like generate certificates, right? 
So if you've never dealt with SSL and like, sort of like, or I should say TLS, uh, and generating certificates and how that whole dance works, right? Which is fascinating because it's like, oh, right, nothing's secure on the internet, except that we distribute root certificates on computers that are deployed in every OS, right? Like, that's a sort of fundamental understanding you may not go deep enough to realize, but if you are fascinated by it, trying to do it manually would lead you down that path. You'd be like, Oh, what, like what is this thing? What is a CSR? Like, why, who is signing my request? Right? And it's like, why do we trust those people? Right? And it's like, you know, that kind of thing comes out and I feel like you can only get there from trying to do it, you know, answering the questions you can. Right. And again, it takes some judgment to know when you should not go down a rabbit hole. Uh, and then iterating. Of course there are people who are excellent at explaining. You can find some resources that are shortcuts. But, uh, I think particularly my bread and butter has been just to try and do it the hard way. Avoid pitfalls or like rabbit holes when you can. But know that the rabbit hole is there, and then keep going. And sometimes if something's just too hard, you're not gonna get it the first time. Like maybe you'll have to wait like another three months, you'll try again and you'll know more sort of ambiently about everything else. You get a little further that time. That's how I feel about that. Anyway. [00:15:06] Jeremy: That makes sense to me. I think sometimes when people take on a project, they try to learn too many things at the same time. I, I think the example of Kubernetes and Postgres is a pretty good example, where if you're not familiar with how do I install Postgres on bare metal or a VM, trying to make sense of that while you're trying to get into Kubernetes is probably gonna be pretty difficult. 
So, so splitting them up and learning them individually, that makes a lot of sense to me. And the whole deciding how deep you wanna go. That's interesting too, because I think that's very specific to the person right because sometimes you wanna go a little deeper because otherwise you don't understand how the two things connect together. But other times it's just like with the example with certificates, some people they may go like, I just put in let's encrypt it gives me my cert I don't care right then, and then, and some people they wanna know like okay how does the whole certificate infrastructure work which I think is interesting, depending on who you are, maybe you go ahh maybe it doesn't really matter right. [00:16:23] Victor: Yeah, and, you know, shout out to Let's Encrypt . It's, it's amazing, right? think Singlehandedly the most, most of the deployment of HTTPS that happens these days, right? so many so many of like internet providers and uh, sort of service providers will use it right? Under the covers. Like, Hey, we've got you free SSL through Let's Encrypt, right? Like, kind of like under the, under the covers. which is awesome. And they, and they do it. So if you're listening to this, donate to them. I've done it. So now that, now the pressure is on whoever's listening, but yeah, and, and I, I wanna say I am that person as well, right? Like, I use, Cert Manager on my cluster, right? So I'm just like, I don't wanna think about it, but I, you know, but I, I feel like I thought about it one time. I have a decent grasp. If something changes, then I guess I have to dive back in. I think it, you've heard the, um, innovation tokens idea, right? I can't remember the site. It's like, um, do, like do boring tech or something.com (https://boringtechnology.club/) . Like it shows up on sort of hacker news from time to time, essentially. 
But it's like, you know, you have a certain amount of tokens and sort of, uh, we'll call them tokens, but tolerance for complexity or tolerance for new, new ideas or new ways of doing things, new processes. Uh, and you spend those as you build any project, right? you can be devastatingly effective by just sticking to the stack, you know, and not introducing anything new, even if it's bad, right? and there's nothing wrong with LAMP stack, I don't wanna annoy anybody, but like if you, if you're running LAMP or if you run on a hostgator, right? Like, if you run on so, you know, some, some service that's really old but really works for you isn't, you know, too terribly insecure or like, has the features you need, don't learn Kubernetes then, right? Especially if you wanna go fast. cuz you, you're spending tokens, right? You're spending, essentially brain power, right? On learning whatever other thing. So, but yeah, like going back to that, databases versus databases on Kubernetes thing, you should probably know one of those before you, like, if you're gonna do that, do that thing. You either know Kubernetes and you like, at least feel comfortable, you know, knowing Kubernetes extremely difficult obviously, but you feel comfortable and you feel like you can debug. Little bit of a tangent, but maybe that's even a better, sort of watermark if you know how to debug a thing. If, if it's gone wrong, maybe one or five or 10 or 20 times and you've gotten out. Not without documentation, of course, cuz well, if you did, you're superhuman. But, um, but you've been able to sort of feel your way out, right? Like, Oh, this has gone wrong and you have enough of a model of the system in your head to be like, these are the three places that maybe have something wrong with them. Uh, and then like, oh, and then of course it's just like, you know, a mad dash to kind of like, find, find the thing that's wrong. 
You should have confidence about probably one of those things before you try and do both when it's like, you know, complex things like databases and distributed systems management, uh, and orchestration.
[00:19:18] Jeremy: That's so true, in terms of you are comfortable enough being able to debug a problem, because I think when you are learning about something, a lot of times you start with some kind of guide or some kind of tutorial and you follow the steps. And if it all works, then great. Right? But I think it's such a large leap from that to something went wrong and I have to figure it out. Right. Whether it's something's not right in my Dockerfile or my Postgres instance, uh, the queries are timing out. So many things that could go wrong. That is the moment where you're forced to figure out, okay, what do I really know about this thing?
[00:20:10] Victor: Exactly. Yeah. Like the rubber's hitting the road. It's, uh, you know, the car's about to crash or has already crashed. Like if I open the bonnet, do I know what's happening, right? Or am I just looking at (unintelligible). And I feel sort of a little sorry or sad for devs that start today because there's so much complexity that's been built up. And a lot of it has a point, but you need to kind of have seen the before to understand the point, right? So I like to use front end as an example, right? Like the front end ecosystem is crazy, and it has been crazy for a very long time, but the steps are actually usually logical, right? Like, so you start with, you know, HTML, CSS and JavaScript, just plain, right? And you can actually go in lots of directions. Like HTML has its own thing. CSS has its own sort of evolution sort of thing. But if we look at JavaScript, you're like, you're just writing JavaScript on every page, right?
And like, just like putting in script tags and putting in whatever, and it's, you get spaghetti, you get spaghetti, you start like writing, copying the same function on multiple pages, right? You just, it, it's not good. So then people, people make jquery, right? And now, now you've got like a, a bundled set of like good, good defaults that you can, you can go for, right? And then like, you know, libraries like underscore come out for like, sort of like not dom related stuff that you do want, you do want everywhere. and then people go from there and they go to like backbone or whatever. it's because Jquery sort of also becomes spaghetti at some point and it becomes hard to manage and people are like, Okay, we need to sort of like encapsulate this stuff somehow, right? And like the new tools or whatever is around at the same timeframe. And you, you, you like backbone views for example. and you have people who are kind of like, ah, but that's not really good. It's getting kind of slow. Uh, and then you have, MVC stuff comes out, right? Like Angular comes out and it's like, okay, we're, we're gonna do this thing called dirty checking, and it's gonna be, it's gonna be faster and it's gonna be like, it's gonna be less sort of spaghetti and it's like a little bit more structured. And now you have sort of like the rails paradigm, but on the front end, and it takes people to get a while to get adjusted to that, but then that gets too heavy, right? And then dirty checking is realized to be a mistake. And then, you get stuff like MVVM, right? So you get knockout, like knockout js and you got like Durandal, and like some, some other like sort of front end technologies that come up to address that problem. Uh, and then after that, like, you know, it just keeps going, right? Like, and if you come in at the very end, you're just like, What is happening? Right? 
Like if it, if it, if someone doesn't sort of boil down the complexity and reduce it a little bit, you, you're just like, why, why do we do this like this? Right? and sometimes there's no good reason. Sometimes the complexity is just like, is unnecessary, but having the steps helps you explain it, uh, or helps you understand how you got there. and, and so I feel like that is something younger people or, or newer devs don't necessarily get a chance to see. Cause it just, it would take, it would take very long right? And if you're like a new dev, let's say you jumped into like a coding bootcamp. I mean, I've got opinions on coding boot camps, but you know, it's just like, let's say you jumped into one and you, you came out, you, you made it. It's just, there's too much to know. sure, you could probably do like HTML in one month. Well, okay, let's say like two weeks or whatever, right? If you were, if you're literally brand new, two weeks of like concerted effort almost, you know, class level, you know, work days right on, on html, you're probably decently comfortable with it. Very comfortable. CSS, a little harder because this is where things get hard. Cause if you, if you give two weeks for, for HTML, CSS is harder than HTML kind of, right? Because the interactions are way more varied. Right? Like, and, and maybe it's one of those things where you just, like, you, you get somewhat comfortable and then just like know that in the future you're gonna see something you don't understand and have to figure it out. Uh, but then JavaScript, like, how many months do you give JavaScript? Because if you go through that first like, sort of progression that I, I I, I, I mentioned everyone would have a perfect sort of, not perfect but good understanding of the pieces, right? Like, why did we start transpiling at all? Right? Like, uh, or why did you know, why did we adopt libraries? Like why did Bower exist? 
No one talks about Bower anymore, obviously, but like, Bower was like a way to distribute front end only packages, right? Um, what is it? Um, Uh, yes, there's grunt. There's like the whole build system thing, right? Once, once we decide we're gonna, we're gonna do stuff to files before we, before we push. So there's grunt, there's, uh, gulp, which is like grunt, but like, Oh, we're gonna do it all in memory. We're gonna pipe, we're gonna use this pipes thing to make sure everything goes fast. then there's like, of course that leads like the insanity that's webpack. And then there's like parcel, which did better. There's vite there's like, there's all this, there's this progression, but how many months would it take to know that progression? It, it's too long. So they end up just like, Hey, you're gonna learn react. Which is the right thing because it's like, that's what people hire for, right? But then you're gonna be in react and be like, What's webpack, right? And it's like, but you can't go down. You can't, you don't have the time. You, you can't sort of approach that problem from the other direction where you, which would give you better understanding cause you just don't have the time. I think it's hard for newer devs to overcome this. Um, but I think there are some, there's some hope on the horizon cuz some things are simpler, right? Like some projects do reduce complexity, like, by watching another project sort of innovate so like react. Wasn't the first component, first framework, right? Like technically, I, I think, I think you, you might have to give that to like, to maybe backbone because like they had views and like marionette also went with that. Like maybe, I don't know, someone, someone I'm sure will get in like, send me an angry email, uh, cuz I forgot you Moo tools or like, you know, Ember Ember. They've also, they've also been around, I used to be a huge Ember fan, still, still kind of am, but I don't use it. 
But if you have these tools, right? Like people aren't gonna know how to use them. And Vue was able to realize that React had some inefficiencies, right? So React innovates the sort of component, so reintroduces the component-based, component-first, uh, front end development model. Vue sees that and it's like, wait a second, if we just export this data object, and of course that's not the only innovation of Vue, but if we just export this data object, you don't have to do this fine grained tracking yourself anymore, right? You don't have to tell React or tell the system which things change when other things change, right? Like, you don't have to set up this watching and stuff, right? Um, and that's one of the reasons, like, I remember picking up Vue and being like, Oh, I'm done. I'm done with React now. Because it just doesn't make sense to use React, because Vue essentially, you know, you could just say they learned from them, or they realized a better way to do things that is simpler and much easier to write. Uh, and you know, functionally similar, right? Um, similar enough that it's just like, oh, they boiled down some of that complexity and we're a step forward and, you know, in other ways, I think. Uh, so that's awesome. Every once in a while you get like a compression in the complexity and then it starts to ramp up again and you get maybe another compression. So like joining the projects that do a compression, or starting to adopt those, can be really awesome. So there's some hope, right? Cause sometimes there is a compression in that complexity and you might be lucky enough to use that instead of the thing that's really complex after years of building on it.
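To make the "export this data object" idea concrete, here is a minimal sketch in TypeScript, assuming nothing about Vue's actual internals. The names `reactive` and `Listener` are hypothetical and chosen only for illustration. It wraps a plain object in a standard `Proxy` so that assigning to a property notifies a listener, which is roughly the kind of bookkeeping Victor says you no longer have to do by hand.

```typescript
// Illustrative only: `reactive` and `Listener` are made-up names, not Vue's API.
type Listener = (key: string, value: unknown) => void;

// Wrap a plain data object so that property writes notify a listener.
function reactive<T extends object>(data: T, onChange: Listener): T {
  return new Proxy(data, {
    set(target, key, value) {
      // Only notify when the stored value actually changes.
      const changed = Reflect.get(target, key) !== value;
      const ok = Reflect.set(target, key, value);
      if (ok && changed && typeof key === "string") {
        onChange(key, value);
      }
      return ok;
    },
  });
}

// Usage: the caller just assigns to fields; change tracking happens in the proxy.
const log: string[] = [];
const state = reactive({ count: 0, label: "clicks" }, (key, value) => {
  log.push(`${key}=${String(value)}`);
});

state.count = 1;       // notifies the listener
state.count = 1;       // same value, no notification
state.label = "taps";  // notifies the listener
```

A real framework would use the notification to re-render whatever depends on the changed field; this sketch only records it in `log`.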
[00:27:53] Jeremy: I think you're talking about newer developers having a tough time making sense of the current frameworks but the example you gave of somebody starting from HTML and JavaScript going to jquery backbone through the whole chain, that that's just by nature of you've put in a lot of time right you've done a lot of work working with each of these technologies you see the progression as if someone is starting new just by nature of you being new you won't have been able to spend that time [00:28:28] Victor: Do you think it could work? again, the, the, the time aspect is like really hard to get like how can you just avoid spending time um to to learn things that's like a general problem I think that problem is called education in the general sense. But like, does it make sense for a, let's say a bootcamp or, or any, you know, school right? To attempt to guide people through the previous solutions that didn't work, right? Like in math, you don't start with calculus, right? It just wouldn't, it doesn't make sense, right? But we try and start with calculus in software, right? We're just like, okay, here's the complexity. You've got all of it. Don't worry. Just look at this little bit. If, you know, if the compiler ever spits out a weird error uh oh, like, you're, you're, you're in for trouble cuz you, you just didn't get the. get the basics. And I think that's maybe some of what is missing. And the thing is, it is like the constraints are hard, right? No one has infinite time, right? Or like, you know, even like, just tons of time to devote to learning, learning just front end, right? That's not even all of computing, That's not even the algorithm stuff that some companies love to throw at you, right? Uh, or the computer sciencey stuff. I wonder if it makes more sense to spend some time taking people through the progression, right? 
Because discovering that we should do things via components, let's say, or at least encapsulate our functionality to components and compose that way, is something not everyone knew, right? Or, you know, we didn't know widely. And so it feels like it might make sense to touch on that sort of realization and sort of guide the student through, you know, maybe it's like make five projects in a week and you just get progressively more complex. But then again, that's also hard cause effort, right? It's just like, it's a hard problem. But I think right now, uh, people come in at the end and sort of see a bunch of complexity and just don't know why it's there, right? Like, this applies in general, but it applies very well to the Kubernetes problem as well. Like if you've never managed nginx on more than one machine, or if you've never tried to format your file system on the machine you just rented because it just, you know, comes with nothing, right? Or like, maybe some stuff was installed, but, you know, if you had to like install LVM (Logical Volume Manager) yourself, if you've never done any of that, Kubernetes would be harder to understand. It's just gonna be hard to understand. Overlay networks are hard for everyone to understand, uh, except for network people who like really know networking stuff. I think it would be better. But unfortunately, it takes a lot of time for people to take a sort of more iterative approach to learning. I try and write blog posts in this way sometimes, but it's really hard. And so like, I'll often have like an idea, like, so I call these, or I think of these as like onion style posts, right? Where you either build up an onion sort of from the inside and kind of go out and add more and more layers or whatever. Or you can go from the outside and sort of take off layers.
Like, oh, uh, Kubernetes has a scheduler. Why do they need a scheduler? Like, and like, you know, kind of like, go, go down. but I think that might be one of the best ways to learn, but it just takes time. Or geniuses and geniuses who are good at two things, right? Good at the actual technology and good at teaching. Cuz teaching is a skill and it's very hard. and, you know, shout out to teachers cuz that's, it's, it's very difficult, extremely frustrating. it's hard to find determinism in, in like methods and solutions. And there's research of course, but it's like, yeah, that's, that's a lot harder than the computer being like, Nope, that doesn't work. Right? Like, if you can't, if you can't, like if you, if the function call doesn't work, it doesn't work. Right. If the person learned suboptimally, you won't know Right. Until like 10 years down the road when, when they can't answer some question or like, you know, when they, they don't understand. It's a missing fundamental piece anyway. [00:32:24] Jeremy: I think with the example of front end, maybe you don't have time to walk through the whole history of every single library and framework that came but I think at the very least, if you show someone, or you teach someone how to work with css, and you have them, like you were talking about components before you have them build a site where there's a lot of stuff that gets reused, right? Maybe you have five pages and they all have the same nav bar. [00:33:02] Victor: Yeah, you kind of like make them do it. [00:33:04] Jeremy: Yeah. You make 'em do it and they make all the HTML files, they copy and paste it, and probably your students are thinking like, ah, this, this kind of sucks [00:33:16] Victor: Yeah [00:33:18] Jeremy: And yeah, so then you, you come to that realization, and then after you've done that, then you can bring in, okay, this is why we have components. And similarly you brought up, manual dom manipulation with jQuery and things like that. 
I, I'm sure you could come up with an example of you don't even necessarily need to use jQuery. I think people can probably skip that step and just use the the, the API that comes with the browser. But you can have them go in like, Oh, you gotta find this element by the id and you gotta change this based on this, and let them experience the. I don't know if I would call it pain, but let them experience like how it was. Right. And, and give them a complex enough task where they feel like something is wrong right. Or, or like, there, should be something better. And then you can go to you could go straight to vue or react. I'm not sure if we need to go like, Here's backbone, here's knockout. [00:34:22] Victor: Yeah. That's like historical. Interesting. [00:34:27] Jeremy: I, I think that would be an interesting college course or something that. Like, I remember when, I went through school, one of the classes was programming languages. So we would learn things like, Fortran and stuff like that. And I, I think for a more frontend centered or modern equivalent you could go through, Hey, here's the history of frontend development here's what we used to do and here's how we got to where we are today. I think that could be actually a pretty interesting class yeah [00:35:10] Victor: I'm a bit interested to know you learned fortran in your PL class. I, think when I went, I was like, lisp and then some, some other, like, higher classes taught haskell but, um, but I wasn't ready for haskell, not many people but fortran is interesting, I kinda wanna hear about that. [00:35:25] Jeremy: I think it was more in terms of just getting you exposed to historically this is how things were. Right. And it wasn't so much of like, You can take strategies you used in Fortran into programming as a whole. 
I think it was just more of like a survey of like, Hey, here's, you know, here's Fortran and, like you were saying, here's Lisp and all these different languages, and like at least you get to see them and go like, yeah, this is kind of a pain.
[00:35:54] Victor: Yeah.
[00:35:55] Jeremy: And like, I understand why people don't choose to use this anymore, but I couldn't take away like a broad like, Oh, I really wish we had this feature from, I think we were using Fortran 77 or something like that. I think there's Fortran 77, Fortran 90, and then there's, um, I think,
[00:36:16] Victor: Like old Fortran, deprecated.
[00:36:18] Jeremy: Yeah, yeah, yeah. So I think, uh, I actually don't know if they're continuing to, um, you know, add new things or maintain it or it's just static. But it's more, uh, interesting in terms of, like we were talking front end, where as somebody who's learning frontend development who is new, you get to see how Backbone worked or how Knockout worked, how Grunt and Gulp worked. It's like the kind of thing where it's like, Oh, okay, like, this is interesting, but let us not use this again. Right?
[00:36:53] Victor: Yeah. Yeah. Right. But I also don't need this, and I will never again.
[00:36:58] Jeremy: Yeah, yeah. It's, um, but you do definitely see the parallels, right? Like you were saying where you had your Bower and now you have NPM, and you had Grunt and Gulp and now you have many choices.
[00:37:14] Victor: Yeah.
[00:37:15] Jeremy: Yeah. I think having the history context, you know, it's interesting and it can be helpful, but if somebody came to me and said, hey, I want to learn how to build websites, I want to get into front end development, I would not be like, Okay, first you gotta start with MooTools or GWT.
I don't think I would do that, but I think at an academic level or just in terms of seeing how things became the way they are, sure, for sure it's interesting.
[00:37:59] Victor: Yeah. And I think another thing, I don't remember who asked or why I had to think of this lately, um, but it was, knowing the differentiators between other technologies is also extremely helpful, right? So, what's the difference between esbuild and SWC, right? Again, we're leaning heavy front end, but you know, just like these, uh, sorry, for context, of course, not everyone is a front end developer, but these are two different, uh, build tools, right? For JavaScript, right? Essentially you can think of 'em as transpilers, but, you know, I think they also bundle. Like, uh, generally I'm not exactly sure if esbuild will bundle as well. Um, but it's like one is written in Go, the other one's written in Rust, right? And, um, in addition, there's Vite, which is like, Vite does bundle and Vite does a lot of things. Like, there's a lot of innovation in Vite that has to do with like, making local development as fast as possible and also, like, making sure as many things as possible are strippable, right? Or tree shakeable, sorry, is the better term. Um, but yeah, knowing the, um, the differences between projects is often enough to sort of make it less confusing for me. Um, as far as like, Oh, which one of these things should I use? You know, outside of just going with what people are recommending. Cause generally there is, some people with wisdom sometimes lead the crowd, right? So sometimes it's okay to be, you know, a crowd member as long as you're listening to someone worth listening to. Um, and so yeah, I think that's another thing that is like the mark of a good project, or it's not exclusive, right?
It's not, the condition's not necessarily sufficient, but it's like a good projects have the why use this versus x right section in the Readme, right? They're like, Hey, we know you could use Y but here's why you should use us instead. Or we know you could use X, but here's what we do better than X. That might, you might care about, right? That's, um, a, a really strong indicator of a project. That's good cuz that means the person who's writing the project is like, they've done this, the survey. And like, this is kind of like, um, how good research happens, right? It's like most of research is reading what's happening, right? To knowing, knowing the boundary you're about to push, right? Or try and sort of like push one, make one step forward in, um, so that's something that I think the, the rigor isn't in necessarily software development everywhere, right? Which is good and bad. but someone who's sort of done that sort of rigor or, and like, and, and has, and or I should say, has been rigorous about knowing the boundary, and then they can explain that to you. They can be like, Oh, here's where the boundary was. These people were doing this, these people were doing this, these people were doing this, but I wanna do this. So you just learned now whether it's right for you and sort of the other points in the space, which is awesome. Yeah. Going to your point, I feel like that's, that's also important, it's probably not a good idea to try and get everyone to go through historical artifacts, but if just a, a quick explainer and sort of, uh, note on the differentiation, Could help for sure. Yeah. I feel like we've skewed too much frontend. No, no more frontend discussion this point. [00:41:20] Jeremy: It's just like, I, I think there's so many more choices where the, the mental thought that has to go into, Okay, what do I use next I feel is bigger on frontend. 
I guess it depends on the project you're working on but if you're going to work on anything front end if you haven't done it before or you don't have a lot of experience there's so many build tools so many frameworks, so many libraries that yeah, but we [00:41:51] Victor: Iterate yeah, in every direction, like the, it's good and bad, but frontend just goes in every direction at the same time Like, there's so many people who are so enthusiastic and so committed and and it's so approachable that like everyone just goes in every direction at the same time and like a lot of people make progress and then unfortunately you have try and pick which, which branch makes sense. [00:42:20] Jeremy: We've been kind of talking about, some of your experiences with a few things and I wonder if you could explain the the context you're thinking of in terms of the types of projects you typically work on like what are they what's the scale of them that sort of thing. [00:42:32] Victor: So I guess I've, I've gone through a lot of phases, right? In sort of what I use in in my tooling and what I thought was cool. I wrote enterprise java like everybody else. Like, like it really doesn't talk about it, but like, it's like almost at some point it was like, you're either a rail shop or a Java shop, for so many people. And I wrote enterprise Java for a, a long time, and I was lucky enough to have friends who were really into, other kinds of computing and other kinds of programming. a lot of my projects were wrapped around, were, were ideas that I was expressing via some new technology, let's say. Right? So, I wrote a lot of haskell for, for, for a while, right? But what did I end up building with that was actually a job board that honestly didn't go very far because I was spending much more time sort of doing, haskell things, right? And so I learned a lot about sort of what I think is like the pinnacle of sort of like type development in, in the non-research world, right? 
Like, like right on the edge of research and actual usability. But a lot of my ideas, sort of getting back to the, the ideas question are just things I want to build for myself. Um, or things I think could be commercially viable or like do, like, be, be well used, uh, and, and sort of, and profitable things, things that I think should be built. Or like if, if I see some, some projects as like, Oh, I wish they were doing this in this way, Right? Like, I, I often consider like, Oh, I want, I think I could build something that would be separate and maybe do like, inspired from other projects, I should say, Right? Um, and sort of making me understand a sort of a different, a different ecosystem. but a lot of times I have to say like, the stuff I build is mostly to scratch an itch I have. Um, and or something I think would be profitable or utilizing technology that I've seen that I don't think anyone's done in the same way. Right? So like learning Kubernetes for example, or like investing the time to learn Kubernetes opened up an entire world of sort of like infrastructure ideas, right? Because like the leverage you get is so high, right? So you're just like, Oh, I could run an aws, right? Like now that I, now that I know this cuz it's like, it's actually not bad, it's kind of usable. Like, couldn't I do that? Right? That kind of thing. Right? Or um, I feel like a lot of the times I'll learn a technology and it'll, it'll make me feel like certain things are possible that they, that weren't before. Uh, like Rust is another one of those, right? Like, cuz like Rust will go from like embedded all the way to WASM, which is like a crazy vertical stack. Right? It's, that's a lot, That's a wide range of computing that you can, you can touch, right? And, and there's, it's, it's hard to learn, right? The, the, the, the, uh, the, the ramp to learning it is quite steep, but, it opens up a lot of things you can write, right? It, it opens up a lot of areas you can go into, right? 
Like, if you ever had an idea for like a desktop app, right? You could actually write it in Rust. There's like, there's, there's ways, there's like is and there's like, um, Tauri is one of my personal favorites, which uses web technology, but it's either I'm inspired by some technology and I'm just like, Oh, what can I use this on? And like, what would this really be good at doing? or it's, you know, it's one of those other things, like either I think it's gonna be, Oh, this would be cool to build and it would be profitable. Uh, or like, I'm scratching my own itch. Yeah. I think, I think those are basically the three sources. [00:46:10] Jeremy: It's, it's interesting about Rust where it seems so trendy, I guess, in lots of people wanna do something with rust, but then in a lot of they also are not sure does it make sense to write in rust? Um, I, I think the, the embedded stuff, of course, that makes a lot of sense. And, uh, you, you've seen a sort of surge in command line apps, stuff ripgrep and ag, stuff like that, and places like that. It's, I think the benefits are pretty clear in terms of you've got the performance and you have the strong typing and whatnot and I think where there's sort of the inbetween section that's kind of unclear to me at least would I build a web application in rust I'm not sure that sort of thing [00:47:12] Victor: Yeah. I would, I characterize it as kind of like, it's a tool toolkit, so it really depends on the problem. And think we have many tools that there's no, almost never a real reason to pick one in particular right? Like there's, Cause it seems like just most of, a lot of the work, like, unless you're, you're really doing something interesting, right? Like, uh, something that like, oh, I need to, I need to, like, I'm gonna run, you know, billions and billions of processes. Like, yeah, maybe you want erlang at that point, right? Like, maybe, maybe you should, that should be, you know, your, your thing. 
Um, but computers are so fast these days, and most languages have, have sort of borrowed, not borrowed, but like adopted features from others that there's, it's really hard to find a, a specific use case, for one particular tool. Uh, so I often just categorize it by what I want out of the project, right? Or like, either my goals or project goals, right? Depending on, and, or like business goals, if you're, you know, doing this for a business, right? Um, so like, uh, I, I basically, if I want to go fast and I want to like, you know, reduce time to market, I use type script, right? Oh, and also I'm a, I'm a, like a type zealot. I, I'd say so. Like, I don't believe in not having types, right? Like, it's just like there's, I think it's crazy that you would like have a function but not know what the inputs could be. And they could actually be anything, right? , you're just like, and then you have to kind of just keep that in your head. I think that's silly. Now that we have good, we, we have, uh, ways to avoid the, uh, ceremony, right? You've got like hindley Milner type systems, like you have a way to avoid the, you can, you know, predict what types of things will be, and you can, you don't have to write everything everywhere. So like, it's not that. But anyway, so if I wanna go fast, the, the point is that going back to that early, like the JS ecosystem goes everywhere at the same time. Typescript is excellent because the ecosystem goes everywhere at the same time. And so you've got really good ecosystem support for just about everything you could do. Um, uh, you could write TypeScript that's very loose on the types and go even faster, but in general it's not very hard. There's not too much ceremony and just like, you know, putting some stuff that shows you what you're using and like, you know, the objects you're working with. and then generally if I wanna like, get it really right, I I'll like reach for haskell, right? 
Cause it's just like the sort of contortions, and again, this takes time, this is not fast, but, right, the contortions you can do in the type system will make it really hard to write incorrect code or code that doesn't, that isn't logical with itself. Of course interfacing with the outside world, like if you do a web request, it's gonna fail sometimes, right? Like the network might be down, right? So you have to, you basically pull that, you sort of wrap that uncertainty in your system to whatever degree you're okay with. And then, but I know it'll be correct, right? But and correctness is just not important. Most of like, Oh, I should, that's a bad quote. Uh, it's not that correctness is not important. It's like if you need to get to market, you do not necessarily need every single piece of your code to be correct, right? If someone calls some, some function with like, negative one and it's not an important, it's not tied to money or it's like, you know, whatever, then maybe it's fine. They just see an error and then like you get an error in your back and you're like, Oh, I better fix that. Right? Um, and then generally if I want to be correct and fast, I choose Rust these days. Right? Um, these days. And going back to your point, a lot of times that means that I'm going to write in TypeScript for a lot of projects. So that's what I'll do for a lot of projects is cuz I'll just be like, ah, do I need like absolute correctness or like some really, you know, fancy sort of type stuff? No. So I don't pick Haskell. Right. And it's like, do I need to be like mega fast? No, probably not. Cuz like, cuz so I don't necessarily, don't necessarily need Rust. Um, maybe it's interesting to me in terms of like a long, long term thing, right? Like if I, if I'm thinking, oh, but I want x, like for example, tight, tight, uh, integration with WASM, for example, if I'm just like, oh, I could see myself like, but that's more of like, you know, for a fun thing that I'm doing, right? 
Like, it's just like, it's, it's, you don't need it. You don't, that's premature, like, you know, that's a premature optimization thing. But if I'm just like, ah, I really want the ability to like maybe consider refactoring some of this out into like a WebAssembly thing later, then I'm like, Okay, maybe, maybe I'll, I'll pick Rust. Or like, if I, if I like, I do want, you know, really, really fast, then I'll like, then I'll go Rust. But most of the time it's just like, I want a good ecosystem so I don't have to build stuff myself most of the time. Uh, and you know, TypeScript is good enough. So my stack ends up being a lot of the time just in TypeScript, right? Yeah. [00:52:05] Jeremy: Yeah, I think you've encapsulated the reason why there's so many packages on NPM and why there's so much usage of JavaScript and TypeScript in general is that it, it, it fits the, it's good enough. Right? And in terms of, in terms of speed, like you said, most of the time you don't need the speed of Rust. Um, and so TypeScript I think is a lot more approachable. A lot of people have to use it because they do front end work anyways. And so that kinda just becomes the, I don't know if I should say the default, but I would say it's probably the most common in terms of when somebody's building a backend today. Certainly there's other languages, but JavaScript and TypeScript is everywhere. [00:52:57] Victor: Yeah. Uh, I, I, I, another thing is like, I mean, I've kind of ignored the, like, unreasonable effectiveness of like Rails. Cause there's just a, there's tons of just like Rails warriors out there, and that's great. They're, they're fantastic. I'm not a, I'm not personally a huge fan of Rails, but that's, uh, that's to my own detriment, right? In, in some, in some ways. But like, Rails and Django sort of just like, people who, like, I'm gonna learn this framework, it's gonna be excellent. It most, they have a, they have carved out a great ecosystem for themselves. Um, or like, you know, even PHP, right? 
PHP and like Laravel, or whatever. Uh, and so I'm ignoring those, like, those pockets of productivity, right? Those pockets of like intense productivity that people like, have all their needs met in that same way. Um, but as far as like general, general sort of ecosystem size and speed for me, um, like what you said, like applies to me. Like if I, if I'm just like, especially if I'm just like, Oh, I just wanna build a backend, like, I wanna build something that's like super small and just does like, you know, maybe a few, a couple, you know, endpoints or whatever and just, I just wanna throw it out there. Right? Uh, I, I will pick, yeah, TypeScript. It just like, it makes sense to me. I also think Node is a better VM or platform to build on than any of the others as well. So like, like I, by any of the others, I mean, Python, Perl, Ruby, right? Like sort of in the same class of, of tool. So I, I am kind of convinced that, um, Node is better than those as far as core abilities, right? Like threading, right, versus the just multi-processing and like, you know, other, other, other solutions and like, stuff like that. So, if you want a boring stack, if I don't wanna use any tokens, right? Any innovation tokens, I reach for TypeScript. [00:54:46] Jeremy: I think it's good that you brought up Rails and, and Django because, uh, personally I've done, I've done work with Rails, and you're right in that Rails has so many things built in, and the ways to do them are so well established that your ability to be productive and build something really fast is hard to compete with, at least in my experience with what's available in the Node ecosystem. Um, on the other hand, like I, I also see what you mean by the runtimes. Like with Node, you're, you're built on top of V8 and there's so many resources being poured into it to making it fast and making it run pretty much everywhere. 
I think you probably don't do too much work with managed services, but if you go to a managed service to run your code, like a platform as a service, they're gonna support Node. Will they support your other preferred language? Maybe, maybe not. You know that they will, they'll be able to run Node apps. So, but yeah, I don't know if it will ever happen or maybe I'm just not familiar with it, but I feel like there isn't a real Rails of JavaScript. [00:56:14] Victor: Yeah, you're, totally right. There are, there are. It's, it's weird. It's actually weird that there, like, uh, but, but, I kind of agree with you. There's projects that are trying it recently. There's like Adonis, um, there is, there are backends that also do, like, will do basic templating, like Nest, NestJS is like really excellent. It's like one of the best sort of backend projects out there. I, I, I, but like back in the day, there were projects like Sails, which was like very much trying to do exactly what Rails did, but it just didn't seem to take off and reach that critical mass, possibly because of the size of the ecosystem, right? Like, how many alternatives to Rails are there? Not many, right? And, and now, anyway, maybe let's say the rest of 'em sort of like died out over the years, but there's also like, um, hapi, uh, which is like also, you know, similarly, it was like angling themselves to be that, but they just never, they never found the traction they needed. I think, um, or at least to be as wide, widely known as Rails is for, for, for the, for the Ruby ecosystem, um, but also for people to kind of know the magic, cause like I feel like you're productive in Rails only when you imbibe the magic, right? You, you, know all the magic context and you know the incantations and they're comforting to you, right? Like you've, you've, you have the, you have the sort of like, uh, convention. You're like, if you're living and breathing the convention, everything's amazing, right? 
Like, like you can't beat that. You're just like, you're in the zone, but you need people to get in that zone. And I don't think Node has, people are just too, they're too frazzled. They're going like, there's too many options. They can't, it's hard to commit, right? Like, imagine if you'd committed to Backbone. Like you got, you can't, It's, it's over. Oh, it's not over. I mean, I don't, no, I don't wanna, you know, disparage the Backbone project. I don't use it, but, you know, maybe they're still doing stuff and you know, I'm sure people are still working on it, but you can't, you, it's hard to commit and sort of really imbibe that sort of convention or, or, or sort of like, make yourself sort of breathe that product when there's like 10 products that are kind of similar and could be useful as well. Yeah, I think that's, that's that's kind of big. It's weird that there isn't a Rails for Node.js, but, but people are working on it obviously. Like I mentioned Adonis, there's, there's more. I'm leaving a bunch of them out, but that's part of the problem. [00:58:52] Jeremy: On, on one hand, it's really cool that people are trying so many different things because hopefully maybe they can find something that like other people wouldn't have thought of if they all stuck to the same framework. But on the other hand, it's... how much time have we spent jumping between all these different frameworks when, what could we have if we had a Rails? [00:59:23] Victor: Yeah, the, the sort of wasted time is, is crazy to think about. Uh, I do think about that from time to time. And you know, and personally I waste a lot of my own time. Like, just, just recently, uh, something I've been working on for a long time. I came back to it after just sort of leaving it on the shelf for a while and I was like, You know what? I should rewrite this in Rust. I, I really should. And so I talked myself into it, and I'm like, You know what? It's gonna be so much easier to deploy. I'm just gonna have one binary. 
I'm not gonna have to deal with anything else. I'm just like, it'll be, it'll be so much better. I'll, I'll be a lot more confident in the code I write. And then sort of going through it and like finishing this a, a chunk of it and the kind of project it is, is like I'll have a lot of sort of, different services, right? That, that, that sort of do a similar thing, but a sort of different flavor of a, of a thing, if that makes sense. And I know that I can just go back to TypeScript on the second one, right? Like, I'm, I'm doing one and I'm just like, and that's what I've decided to do. Cause I'm just like, Yeah, no, this doesn't make any sense. Like, I'm spending way too much time, um, when the other thing is like, is good enough. And like, I think maybe just if you feel that, if you can, like, I don't know, if you stay, stay aware of just like, Oh, how much friction am I encountering and maybe I should switch. Like if you know Rails and you know TypeScript, you should probably use Rails, if you're bought into the magic of Rails, right? And, and of course Rails is also another thing that always has great support from platform as a service companies. Rails is always gonna be, you know, have great support, right? Because it's just one of those places where it's so nice and cozy that, you know, people who use it are just like, the people who don't want to think about the server underneath. [01:01:03] Jeremy: I think that combination is really powerful. Like you were talking earlier about working with Kubernetes and learning how all that works and how to run a database and all that. And if you think about the Heroku experience, right? You create your, your Rails app. You tell Heroku I want a database and then you push it. You don't have to worry about pods or Docker or any of that. 
They take care of all of it for you. So I think that certainly there's a value to going deeper and, and learning about how to host things yourself and things like that, but I can totally understand if you have the money, uh, especially if for a business you would say, I don't wanna do this type of ops work, I don't want to learn how to set up a cluster, I just want to push it to Heroku and be done with it. [01:02:00] Victor: Yeah. You don't, no one gives you an award for learning how to, like, wrangle LVM, right? No one gives you that. They just like, you know, you either make it to market or you don't. Uh, and it's like, uh, like I, mean, I'd love to hear about what you sort of optimize for, but I feel like all, it's all about what you want to optimize for. Like, are you optimizing for time to market? Are you optimizing for a code base that people won't be able to mess up later? Right? Like a lot of just, you know, seed stage startups or like just early startups or big companies, like, it doesn't matter. We'll rewrite anyway. Right? That like the eBay example was a great, was a great sort of indication of that, like it will get rewritten. So maybe it doesn't make sense. Maybe it's silly to, to optimize for a strong code base at the beginning. Um, [01:02:45] Jeremy: I think it, uh, at the beginning, especially if you don't have an established audience, like you're not getting any money, then pick something that the team knows and that, you know, um, or at least the majority does, because that, I think, makes the biggest difference in speed. Because let's, let's say you, you were giving an example of I would use Haskell if I need to be correct, and I would use Rust if I need to be fast. 
But if you are picking something everybody knows and you don't have some very specific requirement, for the most part, if you're using something you already know, it's going to be built faster, it's going to be easier to read and maintain and it'll probably be more correct just because you're more familiar with that whole... [01:03:50] Victor: So I, I agreed right up until the last point. I feel like correctness is one of those, if you use a tool that lets you be too sloppy, you can't stop people from being sloppy, right? Uh, like I think, and this is actually something I was thinking of earlier today, is like, I think writing good code is either people being disciplined or better systems, and of course it doesn't matter in every case, right. And so like, so in cases where like, it, it's just not that important and, and it's better to just let it error and then someone just goes and like, fixes it, right? But if you do that too long, you get, you can get spaghetti, right? You can get either spaghetti or you can get a code base that's suffering from a lot of technical debt. Uh, and it, it won't be a problem early on, but when it is, it's a big problem, right? And can drain a lot of, a lot of time. But 99% of the time, I agree. You don't need anything other than like TypeScript or Rails or like Django, or you could, you could use Perl if you want, PHP obviously, like, you know, right? Like, you, you could get very far, very fast with those. And often it's like, not even necessary to go anywhere else. But the only little thing I'd say is just like, I find that it's, it's so hard to be correct if you're not getting any help from your compiler, right? Like, for me, at the very least, right? Like, if you're not getting any help from the language, it's so hard to like, write stuff that's correct, that doesn't ship with bugs in it. Right? 
There was, um, there's a whole period of time where everyone was getting really excited about writing tests that were like, Oh, make sure to like, write a test with negative one. Right? Like, just like, you know, like the next level test stuff was just like, Oh, but what if you like, you know, you gotta, I mean, and this is true, right? You have to think like, how could your system possibly be broken, right? Like, like thinking of how to break a system is hard. It's different from thinking of how to build a system, right? It's a different skill set. But like some of those things you should really just be protected from. I think a big, uh, moment in my career was like seeing Option. I, I'd been lucky enough to have friends that were like exploring with stuff like, um, like Haskell, super early on and like Common Lisp and sort of like, and reading Hacker News, shout out to AJ cuz like, that's his name. But like, there's a, there's a person that was like, just kind of like, sort of like exploring the frontier. And then I would like hear a little bit and be like, Ooh, that's interesting. And like, kind of like, take a look, but Option coming in, like, I think Java 8 was like, wait a second, Option should be everywhere, right? Because it's like NPEs, Null Pointer Exceptions, should almost, like, they shouldn't really be a thing, right? Like, and then you are like, Oh, wait, Option should be here, but that means it has to be there and it kind of like, it just infects everything. And normally stuff that infects everything is bad, right? You're just like, Oh no, this is bad. I better take it out. But then you're like, Wait a second. But Option is right because I don't know if that thing is not a null actually, right. Like the language doesn't give me that. So then, you know, you kind of stumble upon non-nullable types, right? As a language feature. 
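The Option idea described here can be sketched in TypeScript, the language Victor says he reaches for most. This is only an illustration and not something from the conversation: the `Option` type and the `some`, `none`, `map`, `getOrElse`, `findUser`, and `displayName` names are hypothetical, invented for this example. It shows how putting absence into the type forces every caller to deal with it, which is the "infects everything" effect.

```typescript
// Hypothetical Option type: absence is encoded in the type instead of null.
type Option<T> = { kind: "some"; value: T } | { kind: "none" };

const some = <T>(value: T): Option<T> => ({ kind: "some", value });
const none = <T>(): Option<T> => ({ kind: "none" });

// Apply a function only when a value is present.
function map<T, U>(opt: Option<T>, fn: (value: T) => U): Option<U> {
  return opt.kind === "some" ? some(fn(opt.value)) : none<U>();
}

// Leave Option-land by supplying a fallback for the missing case.
function getOrElse<T>(opt: Option<T>, fallback: T): T {
  return opt.kind === "some" ? opt.value : fallback;
}

interface User {
  id: number;
  name: string;
}

const users: User[] = [{ id: 1, name: "Ada" }];

// The return type tells every caller that the user may not exist.
function findUser(id: number): Option<User> {
  const user = users.find((u) => u.id === id);
  return user === undefined ? none<User>() : some(user);
}

// Callers cannot reach .name without handling the missing case first.
function displayName(id: number): string {
  return getOrElse(map(findUser(id), (u) => u.name), "unknown user");
}
```

With `strictNullChecks` enabled, TypeScript offers a lighter version of the same guarantee: `Array.prototype.find` is typed as returning `User | undefined`, so the compiler already refuses to read `.name` until the `undefined` case is ruled out.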
And so it's, it's really hard to quantify, but I think things like that can make a, a, a, a worthwhile difference in, base choice of language as far as correctness goes and in preventing. But I also know that like, people are just blazing by in Rails like, just like absolutely without a care in the world, and they're doing great and they, like, they have the, all the integrations and it's all, it's working out great for them. But I personally just like, I'm just like, I have to, I feel the compulsion. I'm just like, I feel the compulsion and I'm just like, I need to at least do TypeScript and then I have a little bit more protection from myself. Uh, and then I can, and then I can go on. And it's also, it's like, it's also an excuse for me to like, write less tests as well. Like a little bit like, you know, I'm just like, you know, I, I, I, there's, there's some, there's some, assurance that I don't have to like go back and actually write that negative one test like the first time, right. In practice, like technically you, you, you should, cuz like, you know, at run time it's, it is a completely different world, right? Like TypeScript is like a compile time thing. But if you, if you write your types well enough, you, you, you're, you're protected from some amount of that. And I find that that helps me. Personally. So, so that's the, that's the one place I'm just like, ah, I do like that correctness stuff. [01:08:13] Jeremy: Yeah. Yeah. I, I think like, I, I do agree in a general sense with languages that have static type checking where, you know, at compile time whether something can even run, that can make a big difference. Maybe correctness wasn't the right word, but if you work in an ecosystem, whether Rails or Django or something else, you kind of know all of the, the gotchas, I guess? 
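The "negative one" case Victor mentions can be sketched as follows. It is an illustration under assumptions, not code from the conversation: `NonNegative`, `toNonNegative`, and `applyDiscount` are hypothetical names. The branded type makes the compiler reject a plain `number`, while the check itself still has to run at run time, because types are erased when TypeScript compiles to JavaScript.

```typescript
// A "branded" number: still a number, but only obtainable through toNonNegative.
type NonNegative = number & { readonly __brand: "NonNegative" };

// The runtime check lives in one place. Types are erased after compilation,
// so this function is the only thing that actually stops a -1 at run time.
function toNonNegative(n: number): NonNegative | null {
  return Number.isFinite(n) && n >= 0 ? (n as NonNegative) : null;
}

// Downstream code can rely on the invariant without re-checking it.
function applyDiscount(priceCents: NonNegative, percent: NonNegative): number {
  return Math.round(priceCents * (1 - Math.min(percent, 100) / 100));
}

// applyDiscount(1000, 10); // compile error: a plain number is not NonNegative

const price = toNonNegative(1000);
const percent = toNonNegative(10);
const bad = toNonNegative(-1); // null: rejected at the boundary

const total =
  price !== null && percent !== null ? applyDiscount(price, percent) : null;
```

The trade-off is the one described in the conversation: the types remove the need to test every internal function with negative input, but the boundary where untyped data enters still deserves a test.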
If you're, if you're, let's say you're building a product with Haskell and you've never used Haskell before, I feel like yes, you have a lot of strong guarantees from the type system, but there are going to be things about the language or the ecosystem that you, you'll just miss just because you haven't been in it. And I think that's what I meant by correctness in that you're going to make mistakes, either logical mistakes or mistakes in structure, right? Because if you, if you think about a Rails app, one of the things that I think is really powerful is that you can go to a company day one that uses Rails and if they haven't done anything too crazy, you have a general sense of where things are to some extent. And when you're building something from scratch in a language and ecosystem you don't understand, um, there's just so many scrapes and cuts you have to get before you're proficient, right? Um, so I, so I think that is more what I was thinking of, yeah. [01:10:01] Victor: Oh yeah. I, I'd fully agree with that, yeah, I fully agree with that. You don't know what you, what you don't know, right. When you, uh, when you start, um, especially with a new ecosystem, right, because you just, everything's hard. You have to go figure out everything, you have to go try and decide between two libraries that do similar things despite, you know, like knowing how it's done in another language. But you gotta like figure out how it's done in this language, et cetera. But it's like, well, you know, at least decisions are easier elsewhere sometimes, right? Like, like in the database level or like, maybe the infrastructure level or, but yeah, I, I totally get that. It's just, most of the time you just want to go with that, uh, that faster, that faster thing, you know. Feels funny to say of course. Cuz I never do this (laughs). I never, like all my, all my projects are on, on essentially crazy stacks. 
But, but I, I try and, I try and be mindful about is how much of my toil right now is even a good idea, right? Like, depending on my goals. Again, like going back to like that, it depends on what you're optimizing for, right? If you're optimizing for learning or like getting a really good fundamental understanding of something, then yeah, sure. If you're optimizing for like getting to market? Sure, that's a different answer. If you're, if you are optimizing for, like, being able to hire developers to work alongside you, right? Like making it easy to hire teammates in the future, that's a different set of languages maybe. So yeah, I don't know. I kind of give the, the weasel answer, which is, you know, it depends, hmm, right? But, um, yeah. [01:11:32] Jeremy: Especially if you're, you're learning or you're doing personal projects for yourself, then yeah, if you, if you want to know how to use Haskell better, then yeah, go for it. Use, use Haskell, um, uh, or use Rust and so on. I think another thing I think about is the deployment, so if you are personally running a SaaS or you're running something that you deploy internally, then I think something like Rails, Django is totally fine, especially if you use a platform as a service, then there's so many resources for you. But if your goal is, to give you an example, like Mastodon, right? So we have the whole Twitter substitute thing. Mastodon is written in Rails and it has a number of dependencies, right? You have to have Sidekiq, which runs the workers, Elasticsearch for search, um, Postgres for the database and Nginx and so on. And for somebody who's running an instance for a bunch of people, totally makes sense, right? No big deal. Where I think it's maybe a little trickier is, and I don't know if this is the intent of Mastodon or ActivityPub, but some people, they wanna host their own instance, right? 
Um, rather than signing up for mastodon.social and having a whole bunch of people in one instance, they wanna be able to control their instance. They wanna host it themselves. And I think for that, with Rails, the, the resources that it requires are a little high for that kind of small usage. So, in an example like that, if I wanted something that I wanted to be able to easily give to you and you could host it, then something like a Go or a Rust I think would make a lot of sense. You can run the binary, right? And you don't have to worry about all the things that go around running a Ruby application. So I think that's something to think about as well. And, and we talked about command line apps as well, right? If you're gonna build a command line app and you want it to run on Windows, well, the person on Windows is not gonna have Python or Ruby, so again having it in Go or in Rust makes a lot of sense there. So that's another thing I would think about, who is it going to be given to and who is going to deploy it as well. [01:14:25] Victor: Yeah. That's um, that's a great point, uh, because it makes me think of the sort of explosion of sysadmins writing Go when it first came out. I, don't know if I imagined this or I think it was real, but like there were just so, uh, up until then, like most sysadmins would be, they'd like obviously like get to know their routers or their, you know, their switches and their, you know, their servers and like racking, stacking, doing all that stuff. Languages and like frameworks can unlock a certain group of people or like unblock a certain group of people and like unlock their sort of productivity. So like Ansible was one of those first things that was like really sort of easy to understand and like, Oh, you can imperatively set this machine up. But a side effect is you get a lot of sysadmins that know Python, right? So like, now a lot of like the sort of black art stuff is accessible to you. 
Like, or, sorry, I say accessible to you as in accessible to me as the non sysadmin, right? Cause I'm just like, Oh, I can run this like little script this person wrote, uh, in Python and it like, will do all this stuff, right? That I, I would've never been able to do before. And maybe I learned a little bit more about that, about that system, right? And so I, I, I saw something similar in Go where people were writing a bunch of tools that were just really easy to run, right? Really, really easy to run everywhere. Um, and that means easy to download, easy to like, you know, everything's easier and, a lot of hard things got a lot easier, right? Uh, and this is the same with Rust. Like, I, I believe the library that most people use is like clap. I've built a few things with clap and it's like, it gives you excellent, uh, I guess you'd call them affordances or like ability to make a high quality CLI program with very little effort, right? Uh, and so that means you end up writing really decent binaries, right? With like, good help texts and like reasonable like, you know, options and stuff like that. And then it's really easy to deploy to Windows, right? And like other, other platforms, uh, like you said, you don't have to try and bundle Python or, or whatever else the sort of interpreter class of languages. So yeah, I think that I'd agree that like just languages and, and, and sort of frameworks can, can unlock easier creation of certain kinds of apps and certain sort of groups of people to share their knowledge or like to, to, to make a, a tool that's more usable by everyone else. It could be like, kind of like a multiplicative factor, right. Just like, I made this really, really intense Python script, but like now, but to use it, you'd have to like install Python on Windows, like manage your environments, whatever. Like, I don't know if you're using pyenv, maybe you are, maybe you aren't. Do you get the wheel? Like what, what do you do with that? 
No, I'll just give you an executable and you have an executable and then now you can use all the tools that like normally work with an executable or with something that like produces output and it's just faster for everybody and everybody like just, you know, gets more value. [01:17:17] Jeremy: Cool. Well, is there anything else you wanted to, to mention or, or talk about? [01:17:26] Victor: I don't know. Oh yeah, I guess, I guess I could just like say my stack, right? Um, Oh, I, I really love SvelteKit. I've been kind of all in on SvelteKit for the front end for a while now. It feels like I've used, um, I've used Nuxt, I've used, like, I've used a lot of frameworks, but I'm trying to think of, of frameworks that like, do the, um, like I think, I think a local, if not global maximum for front end development is the power of the front end component driven sort of paradigm and server side rendering, right? Because there's like, one of the big advantages of using something like Rails or like whatever else that, that just, just, that's completely server side is that the pages are fast, the pages are always fast. It's there, but they don't have interactivity. Right. We've taken a weird path to get here and it looks really wasteful and maybe it is really wasteful, but at this point we now have kind of both kind of like glued and like hacked into one thing. And I think that class of tools is like, is, is is a local maximum, if not, if not global. So, so yeah. So like, there's like Next, Nuxt, SvelteKit. There's, there's other solutions. There's Astro, like there's, there's, which is, Astro's really recent. Um, there's Ember, right? Shout out to Ember, right. People, people still pushing that forward as well, which is great. But yeah, so I, I, SvelteKit also, and this is again in like direct conflict to what we've talked about this entire time, which is like, use established things that get you there fast. But like SvelteKit isn't at 1.0 yet, but it is excellent. 
Like, I, I am more productive in it than I ever was with Nuxt. Um, and again, Nuxt has changed a lot since I've, you know, sort of made the switch and like, you know, maybe I, maybe it deserves a rethink and like re revisiting it, but I'm so productive with SvelteKit. I just, like, I don't mind. And like half the time I'll just, I'll just use SvelteKit, uh, and my database and then be done, like no middle layer. So like no API layer. I just like stuff it into the SvelteKit app, and then use Postgres on the backend and then I'm done and, and I feel like that's been really productive, you know, again, this is outside of the, the world where you use a Rails or whatever. Um, so yeah. So that's, that's been my stack for a lot of the products I've done recently. So yeah, if I, if I had to, I guess say something about like front end, like give SvelteKit a try. It's pretty good. Uh, and obviously like databases, just use Postgres. Stop using other things. Don't, don't do that. And like infrastructure stuff, I think Kubernetes is cool, but you probably don't need it. Uh, I like Pulumi. I feel like no one, like I've been recommending Pulumi for a long time over Terraform. So it's just like DSLs have limits and those limits are a bad idea to have when you, like, the rest of your time is spent with no limits, right. With like just general computing. Right. So, and Pulumi is just like, you can do your general computing and infrastructure stuff too, and it's, I feel like it's, it's always, you know, been better, but, but anyway, yeah. That's like, that's kind of my stack. [01:20:26] Jeremy: So Pulumi is, um, it's a way to provision infrastructure, but is there a language for it? [01:20:35] Victor: It integrates with the language you use. And Terraform has caught up in that respect, right? Cause you have that now. 
But how it works is still slightly different, right, because if I remember correctly they still generate a Terraform file and execute that file. It's, still a little bit different, which is like, it's, and it's AWS's CDK as well, right? So, so the world has sort of caught up to where, what Pulumi is doing. But you know, I, I think it was like, I don't know, Terraform 12 or something like that where it was just like, we've added better for loops. I'm like, okay, at this point, like this is, that's the indication of like, you now need general, like you, you, you're now the DSL, like DSLs can have for loops, but it's like if you're starting to like pluck, you know, general computing languages, we have really good general computing languages right there. You know, that was kind of my indication to be like, okay, Pulumi is the way, uh, for me. Um, again, this doesn't matter cuz like at work you're gonna, you're probably using Terraform, like, you know, just every, just like, there's, you know, everyone's using certain tools and you don't have a choice. Sometimes you have to use certain tools, but I personally have my, uh, have my pet, pet likes and stuff. [01:21:49] Jeremy: How about for caching? [01:21:53] Victor: Uh, KeyDB. I go into rabbit holes a lot. I call myself a yak shaver cause I shave a lot of yaks and it doesn't benefit anyone really except for me most of the time. But there are lots of Redis-alikes out there. And the best feature set is right now KeyDB. There's like, there's, there's one called Tendis, um, which is like, um, a little bit like more distributed. There's like SSDB, which will do it off disk, which is, I think because we have such fast disks now, it's good enough for a bunch of applications. Right. Especially if, like, if your alternative was like, you know, a much farther away sort of, you know, calls the farther away service. 
There's Pelikan out of Twitter, so they have a whole, they've got like a caching, it's like a framework kind of, right? Like they, they, they've sort of built a kernel of like really interesting caching, um, originally like sort of to serve their memcached workloads and stuff. But it's kind of grown in like, in lots of directions as well. KeyDB is, was the most compelling and still is to, to me for, from a resource usage. Multi threading, obviously, like it is multi threaded, so it is now, it's, it's way faster. Right. Um, and also like it offers flash storage, using the SSD when you can. And, and that's, those are game changers. Right. And, and of course all the, you know, usual and clusters, right? It clusters without you, you know, paying Redis Labs any money or whatever. Um, which is, which is fine. You know, people, open source projects and, and businesses have to, you know, make money. That is a thing. But yeah, KeyDB is, is my, uh, I, whenever I'm about to spin up Redis, I don't, and I spin up KeyDB. Uh, also they were bought by Snap. Hell of an acquihire. I think if, if you, cuz I think sometimes that has like a negative pejorative context to it. Like you didn't, like, oh, you didn't make a billion dollars, you just got acquihired or whatever. But hell of an acquihire. Um, and, and so all of it's like free now, like all of the, like all the, the premium features are becoming free. And I'm like, this is, this is like, I won the lottery, right? Cause um, you know, you get all the, the, the awesome stuff outta KeyDB for, for free. Um, so yeah, caching, KeyDB. I do KeyDB. [01:24:11] Jeremy: KeyDB. I haven't heard of that one. [01:24:14] Victor: Oh yeah, it's, um, yeah it's like keydb.dev. [01:24:17] Jeremy: Oh KeyDB. [01:24:18] Victor: It's awesome. They did YC. 
[01:24:23] Jeremy: Oh, it uses the Redis wire protocol. [01:24:28] Victor: Like Redis is like, is the leader, unless you're using memcached for some other reason and then like obviously like have to use memcached, whatever. But, um, but yeah, Redis is like the sort of app external cache du jour for basically everywhere and when I wanna run Redis, I run KeyDB. [01:24:51] Jeremy: And for search, do you just search in Postgres or turn to something else? [01:24:59] Victor: Oh, you've asked a dangerous question. So I recently did some, uh, some writing. So I, I, I, so recently, um, like this year, I've branched out and done a little bit more experiments in writing for companies that have an interesting, you know, developer product or sometimes where like, you know, my sort of like interest and stuff just aligned, right? So like, uh, I've worked with, um, OCV, Open Core Ventures, um, which is Sid, if you know Sid from GitLab, that's his, um, his, uh, his fund, uh, and then also Supabase, which does, um, you know, awesome stuff on Postgres. And, you know, it's fully open source, that, that company is amazing as well. And search has been a thing. So Postgres has full text search, SQLite has full text search. They both have it built in. They're very good and I think great approximations for like V1s at the very least, maybe even farther. Because a lot of the time if someone's in your product and they're searching, something's wrong usually, right? Like, like, unless you have vast gobs of data, like this means your UX is not good enough for something, right? Um, but um, that said, I almost always start with Postgres full text search. And then I have the, um, there, there are, there's a huge crop of new search engines, right? So if we consider OpenSearch to be new, as in like Amazon's fork of, of Elasticsearch, there's that, there's a project called Meilisearch. There's a project called Typesense. 
Um, there's Sonic, uh, there's like, um, Tantivy, uh, which which is like the, can be under net. There's like quickwit, which is like shifted to logging a little bit. Like that's their like, path to sort of, um, profitability. I, I think, I think they, they sort of shifted a little bit. there's a bunch more that I'm, I'm missing. And so that's what I wrote about and had a lot of fun writing about for Supabase very recently. And this was, um, this was something I just had written down, right? So I was just like, I need to do a blog post. And I, I write on my blog a lot, so I'm just like, Alright. I write up yak shaves to my blog a lot and I'm, and I was just like, I need to try and just use some of these, right? Because there's so many and they all look pretty good. And they have to have learned, like the golden standard is like, uh, solr, right? Lucene, right? Like, it's like, it's like solr and lucene and like, you know, that or whatever. And, but a lot of times you just don't need, like, you don't necessarily need every single feature of lucene. And so there are so many new projects that are look decent. Uh, and so I got a chance to, to to sort of, I was paid to do some of that experimentation, which is awesome cause I would've done it anyway. But it's nice to be paid to do it, on search stuff. and I actually have a project I like, I liked that so much that I made a project to try and get a more representative dataset. So I started a site called podcastsaver.com I use the podcast index, right? Which has a lot of sort of like podcast information. And, know, if someone doesn't know about podcasts, there's like an RSS feed, right? Which is kind of like a, you can think of an XMLy uh, format where people like podcasts are just a publish of a RSS feed and the RSS feed has links to where to download the actual files, right? So it's really open, right? Um, and so I used, um, that the structure of that to index, in multiple search engines at once, right? 
Running alongside each other, the information from the podcast index. this is was fun for me cuz it was like an extension of that other project. It was a really good way to test them against each other. Very fast, right? Like, or, or like in real time. So like right now, um, if you go to podcastsaver.com and you search a podcast, it will go to one of the search engines randomly. So right now there is Postgres FTS, plus Trigram. So, so there is, um, there's also a thing called, um, Tri Trigram searches another really good like, um, sort of basic search feature. And there's Meilisearch. So both of those are implemented right now. And there's actually a little nerds link, right? Which will show you how many, how many podcasts there are, right? So, so how many documents, essentially you can kind of assume there are. Um, and it'll show you how fast each search engine did, right? At sort of returning an answer. Now it's a little bit of a problem because I don't you need to do some manual work to figure out whether the answer was good, right? If you're really fast but give a garbage answer, that's not good. But in general, like, so you can, you can actually use the nerd tab to control, You can like switch to only Postgres, uh, and I do that with like cookies and you can, um, you can force it to go to Postgres and you can see the quality of the answers for yourself. But they're generally, it's pretty good from both. Like it's not, it's not terrible from, from both. So I'm, I'm kind of like glossing over that part right now, but you can see the performance and it's actually, it's like meilisearch does a great job, right? Um, and you know, there's obviously some complexity in running another service and there's some other caveats and stuff like that, but it's, it's pretty good. And over time, I want to add more. 
So I wanna add, you know, at the very least typesense, like people have reached out, so like, I, I made a, a comment on this, on Hacker news and like there's a long road ahead for that and like, I honestly shouldn't be working on that cuz I have other things that I'm like, you know, I, I'm really should be full time on. Um, But like, that's a thing I'm trying to, I'm trying to do sort of grow in the future a little bit more cuz it's just like, it's so fascinating to, to like, everything's so cheap. Like computer is cheap, you know, like there's awesome projects out there with like really advanced functionality that we can just run, like, for free or not, not for free, but like, you don't have to do the work to like build a search engine. There's like five out there. So all you, the only thing that's missing is like knowing which one's the best fit for you and like, you can just find that out. Yeah. [01:30:46] Jeremy: Are there any I guess early conclusions in terms of you like Meilisearch because of X or? [01:30:53] Victor: Yeah, the, the super supabase blog post, uh, was, was a little bit better in terms of, uh, takeaways. I can say that from like meilisearch is definitely faster like meilisearch was harder for me to load and like it took a, a little bit longer cuz you know, you have to do the network call. And to be fair, if you choose Postgres, it's in the database. So like, your copying is a lot easier. Like, manipulating stuff is a lot easier. Um, but right now when I look at the stats, like Meilisearch goes way faster. It's like almost always under a hundred milliseconds, right? And that's including, you know, um, that network, you know, round trip. Um, but you know, Postgres is like, I don't know, I just, I just, I think it's, I I'm just so, I'm so biased. Like it is not a good idea to ever bet against Postgres, right? Like, obviously meilisearch is be like, it doesn't make sense for Postgres to be better than purpose-built tools. 
Um, because they are fully focused, right? Like, they should be, they should be optimal. Cuz they, they, they don't have any other sort of conflicting constraints to think about. But Postgres is very good. It's just like, it's, it's so excellent and it, it keeps moving. Like it keeps getting better. It gets better and better every year, every like, every quarter. It's hard to not bet on it. So I often, so, so, so yeah, so I just, I, if you, I, I would say based on pure performance of podcastsaver.com right now, the data lends itself to saying pick Meilisearch. Unfortunately that data set is incomplete. I don't have Typesense up. I don't have all these other like search engines up. So, so it's, it's, it's limited. There was also, like in the Supabase post, you'll see there, there was support for like, um, misspellings and stuff was different among search engines. So there's also that axis as well. But if you happen to be running on Postgres, I really do suggest just, just give Postgres FTS a try, even if it was just trigram search. Like even if you just do trigram search and do like a sort of like fuzzy search bar, cause that's probably like what a V1 would look like. Anyway, try that and then go off and like, you know, and then like, if you need like crazy faceting or like, you know, you know, really advanced features, then jump off. Uh, but I, I don't know, that's not interesting cause I feel like it already kind of confirms what I think. So I think other people, other people need to, need to do this. I need other people to please replicate, uh, and uh, come up with better, better ideas than I have. [01:33:20] Jeremy: But I think that's a good start in, in terms of when you're comparing different solutions, whether it's databases or, I don't know what you call these, but what do you call an Elasticsearch? [01:33:32] Victor: Search engine. 
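To make the "trigram search and a fuzzy search bar" idea a little more concrete, here is a minimal Python sketch of trigram similarity. It only approximates what a database trigram feature such as the Postgres pg_trgm extension does; the function names (trigrams, similarity, fuzzy_search), the space-padding scheme, and the 0.3 cutoff are assumptions chosen for illustration and are not taken from the episode or from Podcast Saver.

```python
def trigrams(text: str) -> set[str]:
    """Split text into lowercase words and collect each word's 3-character windows.

    Each word is padded with two leading spaces and one trailing space
    (an assumption for this sketch) so that short words and word
    boundaries still produce trigrams.
    """
    grams: set[str] = set()
    for word in text.lower().split():
        padded = f"  {word} "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (Jaccard similarity)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def fuzzy_search(query: str, titles: list[str], threshold: float = 0.3) -> list[str]:
    """Return titles scoring at or above the (hypothetical) threshold, best match first."""
    scored = [(similarity(query, title), title) for title in titles]
    return [title for score, title in sorted(scored, reverse=True) if score >= threshold]


if __name__ == "__main__":
    # The query is misspelled on purpose: most trigrams still overlap.
    print(fuzzy_search("sofware sessions", ["Software Sessions", "Cooking Weekly"]))
```

Because a misspelled query still shares most of its trigrams with the intended title, a comparison like this tolerates typos without any dedicated search service, which is roughly why it can be enough for a V1 search bar.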
[01:33:34] Jeremy: You go to open source projects or the company websites and they'll have their charts and go we're x times faster than Y. But I, I think actually having a running instance where they're going against the same data, I think that's, that's helpful really for anyone trying to compare something to for someone having gone through the time. And I think that a lot of other things too not just search engines where you could have hey, I have my system and it's got, uh I don't know five different databases or something like that. I, I'm not sure the logistics of how you would do it, [01:34:15] Victor: Like with redis. Like just like all the Redis likes, like just all run, run 'em all at the same time. Someone needs to do that [01:34:26] Jeremy: Could be you. [01:34:27] Victor: Ahaha no! I do too much! Like the redis thing is obvious, right? Redis is easier, like comparing these redises and there's some great blog posts out there also that like kind of do it. But like a running service is like a really good way of like showing like, oh, this is like, we hit this cache, you know, x times a second with like, and it's like this, like naturally random sort of traffic. This is how it performed, this is how they performed against each other. These were like the, the resources allotted or whatever. But yeah, that stuffs, that stuffs really cool. I feel like people haven't done it or aren't doing it enough. [01:35:01] Jeremy: Yeah. I guess thing about, putting together one of these tests as well, especially when you make it live is then you have to spend the time and spend the money to maintain it right and I think, uh, if somebody's not paying you to do it's gotta be uh, Yeah. You gotta want it that bad to put it together. [01:35:22] Victor: Hey, but you know what? we can go full circle just use Kubernetes, Its easy if you just use Kubernetes man. [01:35:33] Jeremy: First you gotta learn... Where, where were we? First start with postgres, kubernetes. 
[01:35:42] Victor: Yeah. If you wanna use Kubernetes first, you start with Postgres and... It's like, what? [01:35:49] Jeremy: So, learn these ten other things first then you can start to build your project. [01:35:58] Victor: Yeah, it's silly but I know people out there have the knowledge. I just feel like it's, it's like, they just need to do some of this stuff, right? Like, it's just like, they just like need to like, have the idea or just like go, just go try it. Uh, and hopefully we, get more of like, in the future. Just like, cause, cause at some point, like there's gonna be so much choice that you're like, how are you gonna decide? How does anyone decide these days? Right? Like, you know, more people have to dedicate their time to like, trying out different things, but also sharing it. Cause I think just inside companies, you do this, you do the bakeoffs, right? Everyone does the bakeoffs to try and figure out, you know, within a week or whatever, whether, whether they should use, let's say like Budibase or Appsmith, right? Like, just like you, just like the rest of the team has no idea what those are, right? But someone does the bakeoff. Maybe start sharing bakeoffs? There it is. There's another app idea. I, I think of a lot of ideas, and this is a, there's another one, right? Just make a site where people can share their bakeoff, like just share their bakeoff results with certain products. And then that knowledge just being available to people is like, is massively, is massively valuable. And it, it kind of helps, it helps the products that are mentioned because they can figure out what to change, right? It kind of makes the market more efficient, right? In that vague, uh, capitalistic sense where it's like, oh, like then, you know, if everyone has a chance to improve, then we get a better product at the end of the day. But, um, yeah, I dunno. Hopefully more people, more people yak shave, please, more people waste your time. 
Uh, not waste, uh, use your time to, uh, to yak shave. It's, it's, it's fine. [01:37:32] Jeremy: Well I think you have something at the end of it sometimes you can yak shave and at the end it's kind of like, well, I, I played with it and oh well. Versus you having something to show for it. [01:37:50] Victor: Yeah, that's true. Yeah. I won't talk about all the other projects that went absolutely nowhere. But, uh, but yeah, I think you always feel selfish if you learn something to, and I should, I should rephrase this like I am definitely a selfish person. Like you know, like, I'm not, this is not altruism, right? It's just like, but at some point it feels like, man, someone should really know this other stuff, right? Like, if you, if you've found something that's like, interesting, like it's, it's like someone should know, cuz someone who's better at it will be like, Oh, like no, this part and this part. Like, it's like everyone kind of wins. which is, which is awesome. So, I dunno, maybe if more people have that feeling, they'll like, they'll like share some of their stuff and like maybe you do a thing and it doesn't help you, but then someone else comes along and they're like, Oh, because I read this, I know how to do this. And like, and then if they give that back too, it's, uh, it's pretty awesome. But anyway, that's all pie in the sky, [01:38:57] Jeremy: I think in general, the fact that you are running a blog and, you know, you do your posts on Hacker News and, and so on. The fact that you're sharing what you've learned, I think it's is super valuable. And I think that goes for anybody who is learning a new technology or working on a problem and you run into issues or things you get stuck on for sure yeah you should share that and the way I've heard described There's always someone on the internet just waiting to tell you why you're wrong. [01:39:35] Victor: Oh yeah. Yeah. [01:39:36] Jeremy: And provided that they're right. That can be very helpful. Right? 
[01:39:40] Victor: Yeah. Yeah. I, I actually, I love I, I personally like it because if you're a hacker in the, you know, Hacker News sense that's excellent. That's like a free compiler right? It's like a free checker right? If you just sit next to someone who is amazing at X. And you just start bouncing ideas of like, around X and like how to do whatever it is off of them, you get it compiled. They're just like, No, you can't do that cuz of X, Y, and Z. And you're like, Oh, okay, great. I've just saved myself like, you know, months of like thinking I could do it and like, now I know I can't do it. And the internet is great cuz it gives you access to like, to those people who are like, Yeah. And knowing it first, but if you realize that like, oh, they've chosen to share some wisdom with me like that, like, you know, or, or like trying to, right. Assuming you're correct, like, even if they're not correct. Um, it's, it's, it's pretty awesome. So, so I personally welcome that. Of course it doesn't feel good to be wrong, right? I don't like that. But, um, I love it when someone like take, took the time to be like, No, your, your view on this is wrong because of this. Or like, you know, like 99% of the time you don't need that. You should have just done this, right? Cause then I learn, a lot of my posts will have updates at the top. Right. So like when someone, like, you know, when I posted the, the thing about the throat mic to, like, Hacker News, people were like, This sounds terrible. I was like, I didn't think it was that bad, but, uh, but I was like, you know, maybe I, maybe I shouldn't use this, uh, all the time, but it, it, you know, it was, it was like obvious that, oh, I should have, I should have never made the post without including a sample of the audio at the top, right? So like, I like went back and like added an update for that and then, and then people like discussing about like, Oh, you should have used a bone conducting mic instead. 
Like, and like all this other stuff that I just like didn't think about. I'm like, Oh, awesome. And then like I update the post I go on with my life, so anyway, more people please do that and don't post it on Medium. Please don't do that. Stop, stop that. If you like, if you, if you write software, do not like, please put it some, put your writing about software somewhere else, unless, I don't know, You have to or something. [01:41:52] Jeremy: You've reached your article limit. [01:41:57] Victor: Yeah, yeah. Oh, also shout out to the web archive. The best way to get almost any article, right? I don't think people in the general populace know this? But like 99% of the time if you're trying to you just go to the web archive. It's common knowledge for us. Um, but, but it's not Common knowledge for everybody else and it just feels like they're making a lot of stuff available and legally, right. Cuz like, you know, there's like the, the precedent right now I think is, is is in favor of scraping, right? If you make a thing available to the internet, right? LinkedIn got ruled against a while ago, but like, if you make a thing available to the internet, uh, publicly available without signing in or whatever it is assumed public, right? So it's just like, yeah, whenever I read something I'm just like, ah, article limit. I hop right on. I hop right on archive today. But, but I just feel like it's like, it's, it's sad that developers put like, put knowledge enmasse into that particular, It's not a trap. Cause I don't, it's like I don't dislike medium, I don't have any necessarily like animosity towards medium, but it's just like we should be the most capable of, putting up something like maintaining our own websites. Right. If it's like the death of the personal website, why is it dying with developers? Like, we should be the most capable. We have no hope of the regular world putting out websites if, if it's hard for us. 
[01:43:32] Jeremy: I, I mean, I think for stuff like medium maybe sometimes it's the, the technical aspect of not wanting to set up your own site but, I think a large part of it is the social aspect. Like with Medium, you have discoverability you have the likes system, if they even call it that. Um, I think that's the same reason why people can be happy to post on twitter, right? Um, but when it comes to posting on their own blog, it's like well, I post and then nobody comes and sees it, right? Or I don't get the, I don't get the, Well, the thing is too, like, they could be seeing it but you don't get the feedback and you don't get, you don't get the dopamine hit of like, Oh, I got 40 likes on Medium or Twitter or whatever. And I think that's one of the challenges with personal sites where I totally agree with you. Wish people would do it and do more but I also understand you are on a little bit of an island unless you can get people to come and interact with you. [01:44:44] Victor: There's another idea, right? Like just, you know, can you build a self hostable, but decentralized by default, medium clone. there's that's like a personal site that you could easily host you know, like, almost like WordPress, like let's say, right? Um, but with the, with enough metrics, with like, with the engagement stuff built in, even though it's not like powering a company essentially, right? Cause like the incentives behind building in the engagement, like pumping up engagement. Make sense? If you're running a company cuz you like, you know, you're trying to get MAUs up so you can do your next round or like, you know, make more revenue. Wonder if, I don't know. Yeah, it's just like, like that is a great point cuz it's like, you don't get the positive reinforcement if you don't have the likes and the things that a company would add, right? Like, as opposed to just like, Oh, I set up nginx and like my site's up or whatever. 
Like, not that anyone does that these days, but, yeah, that's, that's that's interesting. It's just like, could you make it really like just increasing the engagement of doing it yourself or like, you know, having that. Huh. [01:45:56] Jeremy: I think sites have, have tried, I mean, it's not quite the same thing, but, dev.to, if you've seen that, like, uh, they, they have, um, I can't remember what it's called, I think it's like a canonical link or something. but basically you can post on their site and then you can put the canonical link to your own website. And so then when somebody searches on Google, the, the traffic goes to your site. It doesn't bring up dev.to. And then, people can comment and like on dev.to so I thought it was an interesting idea. I, I don't know how many people use it or take advantage but that's one approach anyways. [01:46:44] Victor: Yeah, that's actually, that's cool. I don't know enough about that space. I guess. That sounds awesome. That sounds like actually, you know, useful and like a good middle ground right in like encouraging the ecosystem but also like capturing some of that, of that value, right? In terms of like just SEO juice, I guess, if you wanna, what, what you wanna call it. But that's awesome. I don't know, I, I, I've always thought of like dev.to And, and clearly I was, you know, at least wrong in part of dev.to Is just like medium 2.0 for, but more developer focused. Um, but I will find great blog posts on there, um, you know, more often than not, and it's just like, okay, yeah, that's, that's awesome. Like, it, it, it works. Uh, and this canonical link thing sounds actually like very good for, um, for everybody involved, so. Awesome. Sounds like they're, they're good. [01:47:36] Jeremy: Yeah, if people wanna check out you're up to, what, what, you're working on, where should they head? [01:47:43] Victor: Oh God. 
Uh, well, like, I have my blog at, um, vadosware.io, so V A D O S WARE. Projects I work on, the biggest ones right now. Oh, I guess three. Um, uh, like I, we mentioned Podcast Saver, which is cool. Uh, if you need to download podcasts, do that. Um, I send out ideas. I send out ideas every week that I think are like valuable, and like things you could turn into like a startup or a SaaS and like kind of focus on like validating. Cuz like one thing I've learned the hard way is that validating ideas is more important than having them. Uh, cuz you can think something is good and it won't, won't attract anybody. Um, or you know, if you don't put it in front of people, they'll, it's not gonna take off. So I do that. I send that out at like unvalidatedideas.com. So that's, that's a, you know, that's the domain. I also started, um, trying to highlight FOSS projects cuz in yak shaving what you do is you come across a lot of awesome free and open source projects that are just like, oh, like this is a whole world and like this is like pretty polished and it's like pretty good and I just bookmark them. So I was just like, I have so many bookmarks, it doesn't make sense that I hold all of them. Um, and like I, someone else has, should see this. So I send out, and this is uh, new for me cuz I send out that newsletter every day. So it's a daily newsletter for like free and open source projects that do, you know, do whatever, like, do lots of various things. And that is at Awesome FOSS. So you can actually spell it multiple ways, but a w s m f o s s.com. So like, awesome without the vowels. Um, but also just if you spell it normally like a normal person, like awesome the word f o s s.com. Um, so that's, that's going. And then the, the thing that's actually like taking up all my time is Nimbus, um, Nimbus Web Services is what I'm calling it. Uh, it's not out yet, there's nothing to try there, but it is, it is my attempt, to host free and open source software. 
But give, 10-30% back of revenue, so not profit. Right. Cause they can be different things and like, you know, see the movie industry for like, how that can go wrong, of revenue back to open source projects that, uh, that made the software that I'm hosting. And I, I think there's more interesting things to be done there, right? Like it can, I can be more aggressive with that. Right. If it, if it works out. Cuz it's just like, you know, it scales so well, you know, see Amazon, right. but yeah, so if you're, if you're interested in that checkout, nimbusws.com. And that's it. I've, I've plugged everything. Everything plugged. [01:50:38] Jeremy: Yeah that last one sounds pretty, pretty ambitious. So good luck. [01:50:42] Victor: Thanks for taking the time.

Oct 1, 2022 • 54min

Xe Iaso on Tailscale

Xe Iaso is the Archmage of Infrastructure at Tailscale and previously worked at Heroku. This episode originally aired on Software Engineering Radio but includes some additional discussion about their blog near the end of the episode.
Topics covered:
Use cases for VPNs
Simplifying service authentication by identifying users via IP
Peer-to-peer vs centralized "Virtual Pain Networks"
Tailscale's tech stack and why they forked the Go compiler
DERP relay servers
Struggling with the iOS network extension size limit
The surprisingly small amount of infrastructure required to run a VPN
Running your company on your own product
Working at Heroku vs Tailscale
Using the Socratic style of debate in technical blog posts
Related Links
@theprincessxena
Xe's Blog
ACL samples
Go links origin story
How Tailscale works
Tailscale SSH
How Tailscale assigns IP addresses
Hey linker, can you spare a meg?
My Blog is Hilariously Overengineered to the Point People Think it's a Static Site
The Sheer Terror of PAM
Transcript
[00:00:00] Jeremy: Today I'm talking to Xe Iaso, they're the Archmage of Infrastructure at Tailscale, and they also have a great blog everyone should check out. Xe, welcome to Software Engineering Radio. [00:00:12] Xe: Thanks. It's great to be here. [00:00:14] Jeremy: I think the first thing we should start with, is what's a, a VPN, because I think some people they may have used it to remote into their workplace or something like that. But I think the, the scope of what it's good for and what it does is a lot broader than that. So maybe you could talk a little bit about that first. [00:00:31] Xe: Okay. A VPN is short for virtual private network. It's basically a fake network that's overlaid on top of existing networks. And then you can use that network to do whatever you would with a normal computer network. 
This term has been co-opted by companies that are attempting to get into the, like hide my ass style market, where, you know, you encrypt your internet information and keep it safe from hackers. But, uh, so it makes it really annoying and hard to talk about what a VPN actually is. Because Tailscale, uh, the company I work for is closer to like the actual intent of a VPN and not just, you know, like hide your internet traffic. That's already encrypted anyway with another level of encryption and just make a great access point for, uh, three letter agencies. Jeremy: But are there, use cases, past that, like when you're developing a piece of software, why would you decide to use a VPN outside of just because I want my, you know, my workers to be able to get access to this stuff. [00:01:42] Xe: So something that's come up, uh, when I've been working at Tailscale is that sometimes we'll make changes to something. And it'll be changes to like the user experience of something on the admin panel or something. So in a lot of other places I've worked in order to have other people test that, you know, you'd have to push it to the cloud. It would have to spin up a review app in Heroku or some terrifying Terraform abomination would have to put it out onto like an actual cluster or something. But with Tailscale, you know, if your app is running locally, you just give like the name of your computer and the port number. And you know, other people are able to just see it and poke it and experience it. And that basically turns the, uh, feedback cycle from, you know, like having to wait for like the state of the world to converge, to, you know, make a change, press F5, give the URL to a coworker and be like, Hey, is this Gucci? They can connect to your app as if you were both connected to the same switch. [00:02:52] Jeremy: You don't have to worry about, pushing to a cloud service or opening ports, things like that. [00:02:57] Xe: Yep. 
It will act like it's in the same room, even when they're not. It'll even work if you're both at Starbucks and the Starbucks has reasonable policies, like holy crap, don't allow devices to connect to each other directly. So you know, you're working on, like, your screenplay app at your Starbucks or something, and you have a coworker there and you're like, Hey, uh, check this out and, uh, give them the link. And then, you know, they're also seeing the screenplay editor. [00:03:27] Jeremy: In terms of security and things like that. I mean, I'm picturing it kind of like we were sitting in the same room and there's a switch and we both plugged in. Normally when you do something like that, you kind of have, full access to whatever else is on the switch. Uh, you know, provided that's not being blocked by a, a firewall. Is there like a layer of security on top of that, that a VPN service like Tailscale would provide? [00:03:53] Xe: Yes. Um, there are these things called access control lists, which are kind of like firewall rules, except you don't have to deal with like the nightmare of writing an iptables rule that also works in Windows Firewall and whatever they use in macOS. The ACL rules are applied at the tailnet level for every device in the tailnet. So if you have like developer machines, you can put people into groups as things like developers and say that developer machines can talk to production, but not people in QA. They can only talk to testing and people on SRE have, you know, permissions to go everywhere and people within their own teams can connect to each other. You can make more complicated policies like that fairly easily. [00:04:44] Jeremy: And when we think about infrastructure for, for companies, you were talking about how there could be development, infrastructure, production, infrastructure, and you kind of separate it all out. When you're working with cloud infrastructure. 
A lot of times, there's the, I always forget what it stands for, but there's like IAM. There's like policies that you can set up with the cloud provider that say these users can access this, or these machines can access this. And I wonder, from your perspective, when you would choose to use that versus use something at the network or the VPN level.

[00:05:20] Xe: The way I think about it is that things like IAM enforce permissions for like more granularly scoped things, like can create EC2 instances or can delete EC2 instances or something like that. And that's just kind of a different level of thing. Uh, Tailscale ACLs are more, you know, X is allowed to connect to Y, or with Tailscale SSH, X is allowed to connect as user Y. And that's really different than like arbitrary capability things like IAM offers. You could think about it as an IAM system, but the main permissions that it's exposing are: can X connect to Y on Zed port.

[00:06:05] Jeremy: What are some other use cases where if you weren't using a VPN, you'd have to do a lot more work or there's a lot more complexity? Kind of, what are some cases where it's like, okay, using a VPN here makes a lot of sense?

(The quick and simple guide to go links: https://www.trot.to/go-links)

[00:06:18] Xe: There is a service internal to Tailscale called go, which is a clone of Google's so-called go links, where it's basically a URL shortener that lives at http://go. And, you know, you have go/something to get to some internal admin service, or another thing to get to like, you know, the company directory in Notion or something. And this kind of thing you could do with a normal setup, you know, you could set it up and have to do OAuth challenges everywhere and, you know, have to make sure that everyone has the right DNS configuration so that it shows up in the right place. And then you have to deal with HTTPS, um, because OAuth requires HTTPS for understandable and kind of important reasons.
And it's just a mess. Like there's so many layers of stuff. Like the barrier to get, you know, like just a darn URL shortener up turns from 20 minutes into three days of effort trying to, you know, understand how these various arcane things work together. You need to have state for your OAuth implementation. You need to worry about what the hell a JWT is (sigh). It's just bad. And I really think that with something like Tailscale, everybody has an IP address. In order to get into the network, you have to sign in with your auth provider, and your auth provider tells Tailscale who you are. So transitively every IP address is tied to an owner, which means that you can enforce access permission based on the IP address and the metadata about it that you grab from the Tailscale daemon. It's just so much simpler. Like you don't have to think about, oh, how do I set up OAuth this time? What the hell is an OAuth proxy? Um, what is a Kubernetes? That sort of thing. You just think about like doing the thing and you just do it. And then everything else gets taken care of. It's like kind of the ultimate network infrastructure, because it's both omnipresent and something you don't have to think about. And I think that's really the power of Tailscale.

[00:08:39] Jeremy: Typically when you would spin up a service that you want your developers or your system admins to be able to log into, you would have to have some way of authenticating and authorizing that user. And so you were talking about bringing in OAuth and having your service understand that. But I guess what you're saying is that when you have something like Tailscale, that's kind of front-loaded, I guess. You authenticate with Tailscale, you get onto the network, you get your IP.
And then from that point on you can access all these different services that know like, hey, because you're on the network, we know you're authenticated, and those services can just maybe map that IP that's not gonna change to like users in some kind of table. Um, and not have to worry about figuring out how do I authenticate this user.

[00:09:34] Xe: I would personally more suggest that you use the, uh, whois, uh, lookup route in the Tailscale daemon's local API, but basically, yeah, you don't really have to worry too much about like the authentication layer because the authentication layer has already been done. You know, you've already done your two-factor with Gmail or whatever, and then you can just transitively push that property onto your other machines.

[00:10:01] Jeremy: So when you talk about this whois daemon, can you give an example of, I'm in the network now, I'm gonna make a service call to an application. What am I doing with this whois daemon?

[00:10:14] Xe: It's more of like an internal API call that we expose via tailscaled's, uh, Unix socket. But basically you give it an IP address and a port, and it tells you who the person is. It's kind of like the Unix ident protocol in a way, except completely not. And at a high level, you know, if you have something like a proxy for Grafana, you have that proxy for Grafana make a call to the local Tailscale daemon and be like, hey, who was this person? And the Tailscale daemon will spit back a JSON object. Like, oh, it's this person on this device, and there you can do additional logic, like maybe you shouldn't be allowed to delete things from an iOS device, you know, crazy ideas like that.
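A minimal Go sketch of the pattern described here: a proxy asks "who is behind this IP and port?" and then applies extra logic to the answer. Everything in it is an assumption made for illustration. `Identity`, `WhoIsFunc`, `fakeWhoIs`, the addresses, and the "no deletes from iOS" policy are hypothetical stand-ins, not the shape of tailscaled's actual local API or its JSON response.

```go
package main

import (
	"fmt"
	"net/http"
)

// Identity is a hypothetical, simplified stand-in for what a whois lookup
// returns: who the user is and what kind of device they are on.
type Identity struct {
	LoginName string
	DeviceOS  string
}

// WhoIsFunc is a hypothetical lookup from a remote "ip:port" address to an
// Identity. A real proxy would ask the local Tailscale daemon instead.
type WhoIsFunc func(remoteAddr string) (Identity, error)

// allowed applies the example policy from the conversation: anyone who is
// identified may make requests, but deletes from iOS devices are refused.
func allowed(id Identity, method string) bool {
	if method == http.MethodDelete && id.DeviceOS == "iOS" {
		return false
	}
	return true
}

// authorize resolves the caller's identity and then applies the policy.
func authorize(whois WhoIsFunc, remoteAddr, method string) (Identity, bool) {
	id, err := whois(remoteAddr)
	if err != nil {
		return Identity{}, false // address is not a known peer
	}
	return id, allowed(id, method)
}

// fakeWhoIs is an in-memory table used only for this illustration.
func fakeWhoIs(remoteAddr string) (Identity, error) {
	table := map[string]Identity{
		"100.64.0.1:51000": {LoginName: "alice@example.com", DeviceOS: "linux"},
		"100.64.0.2:51001": {LoginName: "bob@example.com", DeviceOS: "iOS"},
	}
	id, ok := table[remoteAddr]
	if !ok {
		return Identity{}, fmt.Errorf("no peer for %s", remoteAddr)
	}
	return id, nil
}

func main() {
	addrs := []string{"100.64.0.1:51000", "100.64.0.2:51001", "203.0.113.9:4000"}
	for _, addr := range addrs {
		id, ok := authorize(fakeWhoIs, addr, http.MethodDelete)
		fmt.Printf("%s user=%q allowed=%v\n", addr, id.LoginName, ok)
	}
}
```

The point of the shape is that the service never handles passwords or tokens itself. It only maps a connection's source address to an identity and decides what that identity may do.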
There's not really support for like arbitrary capabilities in tailscaled at the time of recording, but we've had some thoughts that it would be cool.

[00:11:17] Jeremy: Would that also include things like having roles, for example, even if it's just strings, um, that you get back so that your application would know, okay, this person is supposed to have admin access to this service based on what I got back from this service?

[00:11:35] Xe: Not currently. Uh, you can probably do it via convention or something, but with what's currently implemented in the actual, like, source code and user experience, you can't do that right now. Um, it is something that I've been trying to think about different ways to solve, but it's also a problem that's a bit big for me personally to tackle.

[00:11:59] Jeremy: There's so many, I guess, different ways of doing it that it's kind of interesting to think of a solution that's kind of built into the network. Yeah.

[00:12:10] Xe: Yeah. And when I describe that authentication thing to some people, it makes them recoil in shock because there's kind of a Stockholm syndrome type effect with security for a lot of things, where the easy way to do something and the secure way to do something are, you know, like completely opposite and directly conflicting with each other in almost every way. And over time, people have come to associate security or like corporate VPNs as annoying, complicated, and difficult. And the idea of something that isn't annoying, complicated or difficult will make people reject it, like just on principle, because you know, they've been trained that, you know, VPN equals virtual pain network, and it's hard to get that association outta people's heads because you know, a lot of VPNs are virtual pain networks. Like, I used to work for Salesforce, and Salesforce had this corporate VPN where no matter what you did, all of your traffic would go out to the internet from their data center.
I think it was in San Francisco or something. And I was in the Seattle area. So whenever I had the VPN on, my latency to Google shot up by like eight times, and being a software person, you know, I use Google the same way that others breathe, and it was just not fun. And I only had the VPN on for the bare minimum of when I needed it. And, oh God, it was so bad.

[00:13:50] Jeremy: Like some people, when they picture a VPN, they picture exactly what you're describing, where all of my traffic is gonna get routed to some central point. It's gonna go connect to the thing for me and then send the result back. So maybe you could talk a little bit about why that's maybe a wrong assumption, I guess, in the case of Tailscale, or maybe in the case of just more modern VPN solutions.

[00:14:13] Xe: Yeah. So the thing that I was describing is what I've been lovingly calling the, uh, single point of failure as a service type model of VPN, where, you know, you have like the big server somewhere, it concentrates all the connections and, you know, like does things to make the computer feel like they've teleported over there, but overall it's a single point of failure. And if that falls over, you know, like goodbye, VPN. Everybody's just totally screwed. And in contrast, Tailscale does a more peer-to-peer thing so that everyone is basically on equal footing. Everyone can send traffic directly to each other, and if it can't get directly to there, it'll use a network of, uh, relay servers, uh, lovingly called Derp, and you don't have to worry about your single point of failure in your cluster, because there's just no single point of failure. Everything will directly communicate as much as possible. And if it can't, it'll still communicate anyway.

[00:15:18] Jeremy: Let's say I start up my computer and I wanna connect to a server in a data center somewhere. At the very beginning, am I connecting to some server hosted at Tailscale? And then.
There's some kind of negotiation process where after that I connect directly, or do I just connect directly straight away?

[00:15:39] Xe: If you just turn on your laptop and log in, you know, it signs into Tailscale and gets you on the tailnet and whatnot, then it will actually start all connections via Derp just so that it can negotiate the, uh, direct connection. And in case it can't, you know, it's already connected via Derp, so it just continues the connection with Derp, and this creates a kind of seamless magic type experience where doing things over Derp is slower. Yes, it is measurably slower because you know, like you're not going directly, you're doing TCP inside of TCP. And you know, that comes with an average minefield of lasers or whatever you call it. And it does work though. It's not ideal if you wanna do things like copy large amounts of data, but if you just want to ssh into prod and see the logs for what the heck is going on and why you're getting paged at 3:00 AM, it's pretty great.

[00:16:40] Jeremy: What you were calling Derp, is it where you have servers kind of all over the world and somehow it determines which one's, I guess, is it which one's closest to your destination or which one's closest to you? I'm kind of

[00:16:54] Xe: It's really interesting. It's one of the most weird distributed systems, uh, type things that I've ever seen. It's the kind of thing that could only come outta the mind of an ex-Googler, but basically every Tailscale node has a connection to all of the Derp servers, and through a process of, you know, latency testing, it figures out which connection is the fastest and the lowest latency.
And it calls that its home Derp, but because everything is connected to every Derp, you can have two people with different home Derps getting their packets relayed to other clients from different Derps. So, you know, if you have a laptop in Ottawa and a laptop in San Francisco, the laptop in San Francisco will probably use the, uh, Derp that's closest to it. But the laptop in Ottawa will also use the Derp that's closest to it. So you get this sort of like asynchronous thing, and it actually works out a lot better in practice than you're probably imagining.

[00:17:52] Jeremy: And then these servers, what was the technical term for them? Are they like relays or what's

[00:17:58] Xe: They're relays. Uh, they only really deal with encrypted WireGuard packets, and there's no way for us at Tailscale to see the contents of Derp messages. It is literally just a forwarder. It literally just forwards things based on the key ID.

[00:18:17] Jeremy: I guess if Tailscale isn't able to decrypt the traffic, is that because the keys are only on the user's devices, like it's on their laptop and on the server they're trying to reach, or

[00:18:31] Xe: Yeah. The private keys live and die with those devices, or the devices they were minted on. And the public keys are given to the coordination server, and the coordination server spreads those around to every device in your tailnet. It does some limiting so that like, if you don't have ACL access to something, you don't get the private key, you don't get the, uh, public key for it. The public key, not the private key. And yeah. Then, you know, you just go that way and it'll just figure it out. It's pretty nice.

[00:19:03] Jeremy: When we're kind of talking about situations where it can't connect directly, that's where you would use the relay.
What are kind of the typical cases where that happens, where you aren't able to just connect directly?

[00:19:17] Xe: Hotel wifi and paranoid network security setups. Hotel wifi is the most notorious one because you know, you have like an overpriced wifi connection. And if you bring, like, I don't know, like, you're recording a bunch of footage on your iPhone. And because in 2022 the iPhone has the USB2 connection on it. And you know, you wanna copy that. You wanna use the network, but you can't. So you could just let it upload through iCloud or something, or, you know, do the bare minimum you need to get the data off with Derp. It wouldn't be ideal, but it would work. And ironically enough, that entire complexity involved with, you know, doing TCP inside of TCP to copy a video file over to your laptop might actually be faster than USB2, which is something that I did the math for a while ago. And I just started laughing.

[00:20:21] Jeremy: Yeah, that is pretty ridiculous.

[00:20:23] Xe: Welcome to the future, man (laughs).

[00:20:27] Jeremy: In terms of connecting directly, usually when you have a computer on the internet, you don't have all your ports open, you don't necessarily allow just anybody to send you traffic over UDP and so forth. Let's say I wanna send UDP data to a server on my network, but, you know, maybe it has some TCP ports open. I'm assuming once I connect into the network via the VPN, I'm able to use other protocols and ports that weren't necessarily exposed. Is that correct?

[00:21:01] Xe: Yeah, you can use UDP.
You can do basically anything you would do on a normal network except multicast, um, because multicast is weird. I mean, there's thoughts on how to handle multicast, but the main problem is that like WireGuard, which is what Tailscale is built on top of, is a so-called OSI model layer three network, where it's at like, you know, the IP address level, and multicast is a layer two or data link layer type thing. And those are different numbers, and you can't really easily put, you know, like broadcast packets into IP. Uh, IPv4 thinks otherwise, but, uh, in practice, no, people don't actually use the broadcast address.

[00:21:48] Jeremy: So for someone who's, they have a project or their company wants to get started, I mean, what does onboarding look like? What do they have to do to get all these devices talking to one another?

[00:22:02] Xe: Basically you install Tailscale, you log in with a little GUI thing, or on a Linux server you run tailscale up, and then you all log in to like a G Suite account with the same domain name. So, you know, if your domain is like example.com, then everybody logs in with their example.com G Suite account. And there is no step three, everything is allowed and everything can just connect and you can change the permissions from there. By default, the ACLs are set to a, you know, very permissive allow everyone to talk to everyone on any port. Uh, just so that people can verify that it's working, you know, you can ping to your heart's content. You can play Minecraft with others. You can, you know, host an HTTP server. You can SSH into your development box and write blog posts with Emacs, whatever you want.

[00:22:58] Jeremy: Okay, you install the software on your servers, your workstations, your laptops, and so on.
And then after that there's some kind of webpage or dashboard you would go in and say, I want these people to be able to access these things and

[00:23:14] Xe: Mm-hmm.

[00:23:15] Jeremy: these ports and so on.

[00:23:17] Xe: You, uh, can customize the access control rules with something that looks like JSON, but with trailing commas and comments allowed, and you can go from there to customize basically anything to your heart's content. You can set rules so that people on the DevOps team can access everything, but you know, maybe marketing doesn't need access to the production database. So you don't have to worry about that as much.

[00:23:45] Jeremy: There's kind of different options for VPNs. Cloudflare Access, ZeroTier, there's some kind of, I think it's Nebula from Slack or something like that. So I was kind of curious, from your perspective, what's the difference between those kinds of services and Tailscale?

[00:24:04] Xe: I'm gonna lead this out by saying that I don't totally understand the differences between a lot of them, because I've only really worked with Tailscale. I know things about the other options, but, uh, I have the most experience with Tailscale. But from what I've been able to tell, there are things that Tailscale offers that others don't, like reverse mapping of IP addresses to people, or there's this other feature that we've been working on, where you can embed Tailscale as a library inside your Go application, and then write an internal admin service that isn't exposed to the internet, but it's only exposed over Tailscale. And I haven't seen a way to do those things with those others, but again, I haven't done much research. Um, I understand that ZeroTier has some layer two capabilities, but I don't have enough time in the day to look into it.

[00:25:01] Jeremy: There's been different, I guess you would call them VPN protocols.
I mean, people have probably worked with IPsec in some situations, they may have heard of OpenVPN, WireGuard. In the case of Tailscale, I believe you chose to build it on top of WireGuard. So I wonder if you could talk a little bit about why you chose WireGuard and maybe what makes it unique.

[00:25:27] Xe: I wasn't on the team that initially wrote like the core of Tailscale itself. But from what I understand, WireGuard was chosen because, what overhead? Uh, it's literally, you just encrypt the packets, you send it to the other server, the other server decrypts them. And you know, you're done. It's also based purely on the public key, um, the key pairs involved. And from what I understand, like at the WireGuard protocol level, there's no reason why you would need an IP address at all in theory, but in practice you kind of need an IP address because you know, everything sucks. But also WireGuard is like UDP only, I think, at its like core implementation, which is a step up from like AnyConnect and OpenVPN, where they have TCP modes. So you can experience the, uh, glorious trash fire of TCP in TCP. And from what I understand with WireGuard, you don't need to set up a certificate authority or figure out how the heck to revoke certificates. Uh, you just have key pairs, and if a node needs to be removed, you delete the key pair and you're done. And I think that really matches up with a lot of the philosophy behind how Tailscale networks work a lot better. You know, you have a list of keys, and if the network changes, the list of keys changes. That's the end of the story.

Jeremy: So maybe one of the big selling points was just, what has the least amount of things, I guess, to deal with, or what's the simplest. When you're using a component that you want to put into your own product, you kind of want the least amount of things that could go wrong, I guess.

[00:27:14] Xe: Yeah. It's more like simple, but not like limiting.
Like, for example, a set of Tinkertoys is simple in that, you know, you can build things and you don't have to worry too much about the material science, but a set of Tinkertoys is also limiting because you know, like they're little wooden dowels and little circles made out of wood that you stick the dowels into, you know, you can only do so much with it. And I think that in comparison, WireGuard is simple. You know, there's just key pairs. There's just encryption. And it's simple in its like overall theory and its implementation, but it's not limiting. Like you can do pretty much anything you want with it.

Jeremy: Inherently, whenever we build something, that's what we want, but that's an interesting way of putting it. Yeah.

[00:28:05] Xe: Yeah. It can be kind of annoyingly hard to figure out how to make things as simple as they need to be, but still allow for complexity to occur. So you don't have to like set up a keyboard macro to write if error not equals nil over and over.

[00:28:21] Jeremy: I guess the next thing I'd like to talk a little bit about is, we've covered it a little bit, but at a high level, I understand that Tailscale uses WireGuard, which is the open source VPN protocol, I guess you could call it. And then there's the client software you're saying you need to install on each of the servers and workstations. But there's also a control plane. And I wonder if you could kind of talk a little bit about, I guess at a high level, what are all the different components of Tailscale?

[00:28:54] Xe: There's the agent that you install on your devices. The agent is basically the same between all the devices. It's all written in Go, and it turns out that Go can actually cross compile fairly well. So you have, you know, your implementation in Go that is basically the same code, more or less, running on Windows, macOS, FreeBSD, Android, ChromeOS, iOS, Linux. I think I just listed all the platforms. I'm not sure, but you have that.
And then there's the sort of control plane on Tailscale's side. The control plane is basically like control, uh, which is, uh, I think a Get Smart reference. And that is basically a key dropbox. So, you know, you authenticate through there. That's where the admin panel's hosted. And that's what tells the different Tailscale nodes, uh, the keys of all the other machines on the tailnet. And also on Tailscale's side there's, uh, Derp, which is a fleet of a bunch of different VPSs in various clouds all over the world, both to try to minimize cost and to, uh, have resiliency, because if both DigitalOcean and Vultr go down globally, we probably have bigger problems.

[00:30:15] Jeremy: I believe you mentioned that the clients were written in Go. Are the control plane and the relay, the Derp portion, are those also written in Go, or are they

[00:30:27] Xe: They're all written in Go, yeah, Go as much as possible. Yeah. It's kind of what happens when you have some ex-Go team members as the core people involved in Tailscale. Like, there's a Go compiler fork that has some additional patches that Go upstream either can't accept, uh, won't accept, or hasn't yet accepted. For a while, it was how we did things like trying to shave off bytes from binary size to attempt to fit it into the iOS network extension limit. Because for some reason they only allowed you to have 15 megabytes of RAM for both like your application and working RAM. And it turns out that 15 megabytes of RAM is way more than enough to do something like OpenVPN. But you know, when you have a peer-to-peer VPN engine, it doesn't really work that well. So, you know, that's a lot of interesting engineering challenge.

[00:31:28] Jeremy: That was specifically for iOS. So to run it on an iPhone.

[00:31:32] Xe: Yeah.
Um, and amazingly, after the person who did all of the optimization to the linker, trying to get the binary size down as much as possible, like replacing Unicode packages with something that's more efficient, you know, like basically all but compressing parts of the binary to try to save space, then the iOS, I think, 15 beta dropped, and we found out that they increased the network extension RAM limit to 50 megabytes, and the look of defeat on that poor person's face. I feel very bad for him.

[00:32:09] Jeremy: You got what you wanted, but you're sad about it.

[00:32:12] Xe: Yeah.

[00:32:14] Jeremy: So that's interesting too. You were using a fork of the Go compiler

[00:32:19] Xe: Basically everything that is built is built using, uh, the Tailscale fork of the Go compiler.

[00:32:27] Jeremy: Going forward, is the sort of assumption that that's what you'll do, or is it you're hoping you can get this stuff upstreamed and then eventually move off of it?

[00:32:36] Xe: I'm pretty sure that, I don't know if I can really make a forward-looking statement like that, but I've come to accept the fact that there's a fork of the Go compiler. And as a result, it allows a lot more experimentation and a bit more control over what's going on. Like, I'm not like the most happy with it, but I understand why it exists and I've made my peace with it.

[00:33:07] Jeremy: And I suppose it helps somewhat that the people who are working on it actually originally worked on the Go compiler at Google. Is that right?

[00:33:16] Xe: Oh yeah. If, uh, there weren't ex-Go team people working on that, then I would definitely feel way less comfortable about it. But I trust that the people that are working on it know what they're doing, at least enough.

[00:33:30] Jeremy: I feel like that's kind of the position we put ourselves in with software in general, right?
Is like, do we trust ourselves enough to do this thing we're doing?

[00:33:39] Xe: Yeah. And trust is a bitch.

[00:33:44] Jeremy: Um, I think one of the things that's interesting about Tailscale is that it's a product that's kind of, it's like network infrastructure, right? It's to connect you to your other devices. And that's a little different than somebody running a software as a service. And so how do you test something that's like built to support a network, and how is that different than just making a web app or something like that?

[00:34:11] Xe: Um, well, it's a lot more complicated for one, especially when you have to have multiple devices in the mix with multiple different operating systems. And I was working on some integration tests, doing stuff for a while, and it was really complicated. You have to spin up virtual machines, you know, you have to like make sure the virtual machines are attempting to download the version of the Tailscale client you wanna test, and it's quite a lot in practice.

[00:34:42] Jeremy: I mean, do you have a lab, you know, with Android phones and iPhones and laptops and all this sort of stuff, and you have some kind of automated test suite to see like, hey, if these machines are in Ottawa and my server's in San Francisco, like you're mentioning before, that I can get from my iPhone to this server in the data center over here, that kind of thing?

[00:35:06] Xe: What's the right way to phrase this without making things look bad? Um, it's a work in progress. It's really a hard problem to solve, uh, especially when the company is fully remote and, uh, like, the address that's listed on the business records is literally one of the founders' condos because you know, the company has no office. So that makes the logistics for a lot of this.
Even more fun.

[00:35:37] Jeremy: Probably any company that's in an early stage feels the same way, where it's like, everything's a work in progress and we're just gonna keep going and we're gonna get there. And as long as everything keeps running, we're good.

[00:35:50] Xe: Yeah. I don't like thinking about it in that way, because it kind of sounds like pessimistic or defeatist, but at some level it really is a work in progress because it's a hard problem, and hard problems take a lot of time to solve, especially if you want a solution that you're happy with.

[00:36:10] Jeremy: And I think it's kind of a unique case too, where it's not like if it goes down, it's like people can't do their job. Right. So it's, yeah.

[00:36:21] Xe: Actually, if Tailscale's, like, control plane goes down, I don't think people would notice until they tried to, like, reboot a laptop or connect a new device to their tailnet. Because once all the Tailscale agents have all of the information they need from the control plane, you know, they just continue on independently and don't have to care. Derp is also fairly independent of, like, the key dropbox component. And, you know, if that goes down, Derp doesn't care at all.

[00:37:00] Jeremy: Oh, okay. So if the control plane is down, as long as you had authenticated earlier in the day, you can still, I don't know if it's cached or something, but you can still continue to reach the relay servers, the Derp servers, or your

[00:37:15] Xe: other nodes. Yeah. I'm pretty sure that in most cases, the control plane could be down for several hours a day and nobody would notice unless they're trying to deal with the admin panel.

[00:37:28] Jeremy: Got it. That's a little bit of a relief, I suppose, for all of you running it.

[00:37:33] Xe: Yeah. Um, it's also kind of hard to sell people on the idea of, here is a VPN thing. You don't need to self host it, and they're like, what?
Why? And yeah, it can be fun.

[00:37:49] Jeremy: Though, I mean, I feel like anybody who has self-hosted a VPN, they probably like don't really wanna do it. I don't know. Maybe I'm wrong.

[00:38:00] Xe: Well, so a lot of the idea of wanting to self host it is, uh, I think it's more of like trying to be self-sufficient and not have to rely on other companies' failures dictating your company's downtime. And, you know, like from some level that's very understandable. And, you know, if, you know, like Tailscale were to get bought out and the new owners would, you know, like basically kill the product, they'd still have something that would work for them. I don't know if like such a defeatist attitude is like productive. But it is certainly the opinion that I have received when I have asked people why they wanna self-host. Other people don't want to deal with identity providers or the like, they wanna just use their own identity provider. And what was hilarious was there was one thing where they were like, our old VPN server died once and we got locked out of our network. So therefore we wanna self-host Tailscale in the future so that this won't happen again. And I'm like, buddy, let's just take a moment and retrace our steps here. 'Cause I don't think you mean what you think you mean.

[00:39:17] Jeremy: Yeah, yeah.

[00:39:19] Xe: In general, like, I suggest to people that, you know, even if they're like way deep into the Tailscale Kool-Aid, they still have at least one other method of getting into their servers. Ideally, two. I admit that I come from an SRE style background and I am way more paranoid than most, but I usually like having, uh, a backup just in case.

[00:39:44] Jeremy: So I suppose, on that note, let's talk a little bit about your role at Tailscale. The title of the archmage of infrastructure is one of the coolest titles I've, uh, I've seen.
So maybe you can go a little bit into what that entails at Tailscale.

[00:40:02] Xe: I started that title as a joke that kind of stuck. Uh, my initial intent was that every time someone asked, I'd have a different, you know, like mystic sounding title, but, uh, archmage of infrastructure kind of stuck. And since then, I've actually been pivoting more into developer relations stuff rather than pure software engineering. And from the feedback that I've gotten at the various conferences I've spoken at, they like that title, even though it doesn't really fit with developer relations work at all. It's like it fits because it doesn't. You know, that kind of koan kind of way.

[00:40:40] Jeremy: I guess this would go more into the infrastructure side, but what does the scale of your infrastructure look like? I mean, I think that you touched a little bit on the fact that you have relay servers all over the place and you've got this control plane, but I wonder if you could give people a little bit of perspective of what kind of undertaking this is.

[00:41:04] Xe: I am pretty sure at this point we have more developer laptops and the like than we do production servers. Um, I'm pretty sure that the scale of the production servers is in the tens, at most. Um, it turns out that computers are pretty darn efficient and, uh, you don't really need like a lot of computers to do something amazing.

[00:41:27] Jeremy: The part that I guess surprises me a little bit is the relay servers, I suppose, because I would imagine there's a lot of traffic that goes through those. Are you finding that just most of the time they just aren't needed and usually you can make a direct connection, and that's why you don't need too many of these?

[00:41:45] Xe: From what I understand, I don't know if we actually have a way to tell, like, what percentage of data is going over the relays versus not.
And I think that was an intentional decision, um, that may have been revisited. I'm operating based off of like six to 12 month old information right now. But in general, like the only state that the relay servers have is in RAM. And whenever the relay, whenever you disconnect the server, the state is dropped. [00:42:18] Jeremy: Okay. [00:42:19] Xe: and even then that state is like, you know, this key is listening. It is, uh, connected, uh, in case you wanna send packets over here, I guess. It's a bit less bandwidth than you're probably thinking. It's not like enough to max it out 24/7, but it is, you know, measurable and there are some, you know, costs associated with it. This is also why it's on DigitalOcean and Vultr and not AWS. But in general, it's a lot less than you'd think. I'm pretty sure that like, if I had to give a baseless assumption, I'd say that probably about like 85% of traffic goes directly. And the remaining is like the few cases in the hole punching engine that we haven't figured out yet. Like Palo Alto firewalls. Oh God. Those things are a nightmare. [00:43:13] Jeremy: I see. So most of the traffic actually ends up being straight peer to peer. Doesn't have to go through your infrastructure. And, and therefore it's like, you don't need too many machines, uh, to, to make this whole thing work. [00:43:28] Xe: Yeah. It turns out that computers are pretty darn fast and that copying data is something that computers are really good at doing. Um, so if you have, you know, some pretty darn fast computers, basically just sitting there and copying data back and forth all day, like, you can do a lot with shockingly little. Um, when I first started, I believe that the DERP VMs were using like sometimes as little as one core and 512 megabytes of RAM as like a primary DERP. 
And, you know, we only noticed when there were some weird connection issues for people that were only on DERP because there were enough users that the machine had run out of memory. So we just, you know, upped the, uh, virtual machine size and called it a day. But it's, it's truly remarkable how far you can get with very little. [00:44:23] Jeremy: And you mentioned the relay servers, the, the DERP servers were on services like DigitalOcean and Vultr. I'm assuming because of the, the bandwidth cost. For the control plane, is, is that on AWS or some other big cloud provider? [00:44:39] Xe: it's on AWS. I believe it's in eu-central-1. [00:44:44] Jeremy: You're helping people connect from device to device, and in a situation like that, what does monitoring look like in, in incidents? Like what are you looking for to determine like, Hey, something's not working. [00:44:59] Xe: there's monitoring with, you know, Prometheus, Grafana, all of that stuff. There are some external probing things. There's also some continuous functional testing for trying to connect to Tailscale and like log in as an account. And if that fails like twice in a row, then, you know, something's very wrong and, you know, raise the alarm. But in general, a lot of our monitoring is kind of hard at some level because, you know, we're Tailscale, and Tailscale can't always benefit from Tailscale to help operate Tailscale because, you know, it's Tailscale. Um, so we're still trying to figure out how to detangle the chicken and egg situation. It's really annoying. Jeremy: there's the, the term dogfooding, right? Where they're saying like, oh, we, we run, um, our own development on our own platform or our own software. But I could see when your product is network infrastructure, VPNs, where that could be a little, little dicey. [00:46:06] Xe: Yeah, it is very annoying. But I'm pretty sure we'll figure something out. 
It is just a matter of when. Another thing that's come up is we've kind of wanted to use Tailscale's SSH features, where you specify ACLs in your, you specify ACL rules to allow people to SSH to other nodes as various users. But if that becomes your main access to production, then you know, like if Tailscale is down and you're Tailscale, like how do you get in? Uh, there's been various philosophical discussions about this. It's also slightly worse if you use what's called check mode in SSH, where, uh, Tailscale SSH without check mode, you know, you just, the, the server checks against the policy rules and the ACL, and if it's okay, it lets you in. And if not, it says no. But with check mode, there's also this like eight hour quote unquote lifetime for you to have like sudo mode on GitHub, where you do an auth challenge with your auth provider. And then, you know, you're given a, uh, Hey, this person has done this thing type verification. And if that's down, and that goes through the control plane, and if the control plane is down and you're Tailscale, trying to debug the control plane, and in order to get into the control plane over Tailscale, you need to use the, uh, control plane, you know, that's like chicken and egg problem level 78, which is a mythical level of chicken and egg problem that, uh, has only been foretold in the legends of yore or something. [00:47:52] Jeremy: at that point, it sounds like somebody just needs to, to drive to the data center and plug into the switch. [00:47:59] Xe: I mean, it's not, it's not going to, it probably wouldn't be like, you know, we need to get a person with an angle grinder off of Craigslist type bad. 
Like it was with the Facebook BGP outage, but it's definitely a chicken and egg problem in its own right. It makes you do a lot of lateral thinking too, which is also kind of interesting. [00:48:20] Jeremy: When, when you say lateral thinking, I'm just kind of curious, um, if you have an example of what you mean. [00:48:27] Xe: I don't know of any example that isn't NDAed. Um, but basically, you know, Tailscale is getting to the, to the point where Tailscale is relying on Tailscale to make Tailscale function and you know, yeah. This is a classic ouroboros style problem. I've heard a, uh, a wise friend of mine say that that is an ideal problem to have, which sounds weird at face value. But if you're getting to that point, that means that you're successful enough that, you know, you're having that problem, which is in itself a good thing, paradoxically. [00:49:07] Jeremy: better to have that problem than to have nobody care about the product. Right. [00:49:12] Xe: Yeah. [00:49:13] Jeremy: kind of on that, that note, um, you mentioned you worked at, at Salesforce, uh, I believe that was working on Heroku. I wonder if you could talk a little about your experience working at, you know, Tailscale, which is kind of more of a, you know, early startup versus, uh, an established company like Salesforce. [00:49:36] Xe: So at the time I was working at Heroku, it definitely didn't feel like I was working at Salesforce for the majority of it. It felt like I was working, you know, at Heroku, like on my resume, I listed it as Heroku. When I talked about it to people, I said I worked at Heroku, and Salesforce was this, you know, mythical, Ohana thing that I didn't have to deal with unless I absolutely had to. By the end of the time I was working at Heroku, uh, the Salesforce, uh, sort of started to creep in and, you know, we moved from tracking issues in GitHub issues. 
Like we were used to, to using their, oh, what's the polite way to say this, their creation, which is, which was like the moral equivalent of JIRA implemented on top of Salesforce. You had to be behind the VPN for it. And, you know, every ticket had 20 fields and, uh, there were no templates. And in comparison, with Tailscale, you know, we just use GitHub issues, maybe some like things in Notion for doing like longer term tracking or Kanban stuff, but it's nice to not have, you know, all of the pomp and ceremony of filling out 20 fields in a ticket for like two sentences of: this thing is obviously wrong and it's causing X to happen. Please fix. [00:51:08] Jeremy: I, I like that, that phrase, the, the creation, that's a very, very diplomatic term. [00:51:14] Xe: I mean, I can think of other ways to describe it, but I'm pretty sure those ways wouldn't be allowed on the podcast. So [00:51:25] Jeremy: Um, but, but yeah, I, I know what you mean for sure where, it, it feels like there's this movement from, Hey, let's just do what we need. Like let's fill in the information that's actually relevant and don't do anything else, to a shift to, we need to fill in these 10 fields because that's the thing we do. Yeah. [00:51:48] Xe: Yeah. And in the time I've been working for Tailscale, I'm like employee ID 12. And, uh, Tailscale has gone from a company where I literally know everyone to just recently to the point where I don't know everyone anymore. And it's a really weird feeling. I've never been in a, like a small stage startup that's gotten to this size before, and I've described some of my feelings to other people who have been there and they're like, yeah, welcome to the club. So I figure a lot of it is normal. 
From what I understand, though, there's a lot of intentionality to try to prevent Tailscale from becoming, you know, like Google style complexity, organizational complexity, unless that is absolutely necessary to do something. [00:52:36] Jeremy: it's a function of size, right? Like as you have more people, more teams, then more process comes in. That's a really tricky balance to, to grow and still keep that feeling of, I'm just doing the thing, I'm doing the work rather than all this other process stuff. [00:52:57] Xe: Yeah, but I've also kind of managed to pigeonhole myself off into a corner with devrel stuff. And that's been nice. I've been working a bunch with, uh, like marketing people and, uh, helping out with support occasionally and doing a, like a godawful amount of writing. [00:53:17] Jeremy: the, the writing, for our audience's benefit, I, I think they should, they should really check out your blog because I think that the way you write your, your articles is very thoughtful in terms of the balance of the actual example code or example scripts and the descriptions and, and there's a little bit of a narrative sometimes too. So, [00:53:40] Xe: Um, I'm actually more of a prose writer just by like how I naturally write things. And a lot of the style of how I write things is, I will take elements from, uh, the Socratic style of dialogue where, you know, you have the student and the teacher. And, you know, sometimes the student will ask questions that the teacher will answer. And I found that that's a particularly useful way to help model understanding or, you know, like put side concepts off into their own little blurbs or other things like that. 
I also started doing those conversation things with, uh, furry art, specifically to dunk on a homophobe that was getting very angry at furry art being in, uh, another person's blog.And that's it, it's occasionally fun to go into the, uh, orange website of bad takes and see the comments when people complain about it. oh gosh, the bad takes are hilariously good. Sometimes.[00:54:45] Jeremy: it's good that you have like a, a positive, mindset around that. I know some people can read, uh, that sort of stuff and go, you know, just get really bummed out. [00:54:54] Xe: One of the ways I see it is that a lot of the time algorithms are based on like sheer numbers. So if you like get something that makes people argue in the comments, that number will go up and because there's more comments on it, it makes more people more likely to, to read the article and click on it.So, sometimes I have been known to sprinkle, what's the polite way to say this. I've been known to sprinkle like intentionally kind of things that will, uh, get people and make them want to argue about it in the comments. Purely to make the engagement numbers rise up, which makes more people likely to read the article.And, it's kind of a dirty practice, but you know, it makes more people read the article and more people benefit. So, you know, like it's kind of morally neutral, I guess.[00:55:52] Jeremy: usually that, that seems like, a sketchy thing. But I feel like if it's in service to, uh, like a technical blog post, I mean, why not? Right.[00:56:04] Xe: And a lot of the times I'll usually have the like, uh, kind of bad take, be in a little conversation blurb thing so that people will additionally argue about the characterization of, you know, the imaginary cartoon shark or whatever.[00:56:20] Jeremy: That's good. 
It's the, uh, it's the Xe Xe universe that they're, they're stepping into. [00:56:27] Xe: I've heard people describe it, uh, lovingly as the xeiaso.net cinematic universe. I've had some ideas on how to expand it in the future with more characters that have more different kind of diverse backgrounds. But, uh, it turns out that writing this stuff is hard. Like actually very hard because you have to get this right. You have to get the right balance of like snark, satire, uh, like enlightenment. And it's, it's surprisingly harder than you'd think. Um, but after a while, I've just sort of managed to like figure out as I'm writing where the side tangents come off and which ones I should keep and which ones I should, uh, prune and which ones can also help gain deeper understanding with a little like Socratic dialogue, to start with like an incomplete assumption, like an incomplete picture. And then, you know, a question of, wait, what about this thing? Doesn't that conflict with that? And like, well, yes, technically it does, but realistically we don't have to worry about that as much. So we can think about it just in terms of this bigger model and, uh, that's okay. Like, uh, I mentioned the OSI model earlier, you know, like the seven layer OSI model, it's, you know, genuinely overkill for basically everything, except it's a really great conceptual model for figuring out the difference between, you know, like an Ethernet cable, like the Ethernet card, the IP stack, TCP, and, you know, TLS or whatever. I have a couple talks that are gonna be up by the time this is published. Uh, one of them is my, uh, RustConf talk on my, or what was it called? I think it was called the surreal horrors of PAM or something, where I discussed my experience trying to debug a PAM module in Rust, uh, for work. And, uh, it's the kind of story where, you know, it's bad when you have a breakpoint on dlopen. [00:58:31] Jeremy: That sounds like a nightmare. [00:58:32] Xe: Oh yeah. 
Like part of the process of attempting to fix that involved going very deep. We're talking like an HTML frameset in the Internet Archive for SunOS documentation that was written around the time that PAM was used. Like it's things that are bad enough where like everything in the frameset but the contents had eroded away through bit rot and, you know, you're very lucky just to have what you do. [00:59:02] Jeremy: well, I'm, I'm glad it was you and not me. We'll get to, to hear about it and, and not have to go through the, the suffering ourselves. [00:59:11] Xe: yeah. One of the things I've been telling people is that I'm not like a brilliant programmer. Like I know a bunch of people who are definitely way smarter than me, but what I am is determined and, uh, determination is a bit stronger of a force than you'd think. [00:59:27] Jeremy: Yeah. I mean, without it, nothing gets done. Right. [00:59:30] Xe: Yeah. [00:59:31] Jeremy: as we wrap up, is there anything we missed or anything else you wanna mention? [00:59:36] Xe: if you wanna look at my blog, it's on xeiaso.net. That's X-E-I-A-S-O dot net. Um, that's where I post things. You can see, like the 280 something articles at time of recording. It's probably gonna get to 300 at some point, oh God, it's gonna get to 300 at some point. Um, and yeah, I try to post articles about weekly, uh, depending on facts and circumstances. I have a bunch of talks coming up, like one about the hilarious over-engineering I did in my blog. And maybe some more, if I get back positive responses from calls for paper submissions. [01:00:21] Jeremy: Very cool. Well, Xe, thank you so much for, for coming on Software Engineering Radio. [01:00:27] Xe: Yeah. Thank you for having me. I hope you have a good day and, uh, try out Tailscale, uh, note my bias, but I think it's great.

Sep 9, 2022 • 55min

Jonathan Shariat on Tragic Design

Jonathan Shariat is the coauthor of the book Tragic Design and co-host of the Design Review Podcast. He's currently a Sr. Interaction Designer & Accessibility Program Lead at Google. This episode originally aired on Software Engineering Radio. Topics covered: How poor design kills in medical environments; Causing harm with features meant to bring joy; Considerations during the product development cycle; Industry specific checklists and testing requirements; Creating guiding principles for a team; Why medical software often has poor UX; Designing for crisis situations; Why dark patterns can be bad in the long term. Related Links: @designuxui; Tragic Design; How Bad UX Killed Jenny; Design Review podcast; Deceptive Design. Transcript You can help edit this transcript on GitHub. [00:00:00] Jeremy: Today I'm talking to Jonathan Shariat, he's the co-author of Tragic Design, the host of the Design Review podcast, and he's currently a senior interaction designer and accessibility program lead at Google. Jonathan, welcome to Software Engineering Radio. [00:00:15] Jonathan: Hi, Jeremy, thank you so much for having me on. [00:00:18] Jeremy: the title of your book is Tragic Design. And I think that people can take a lot of different meanings from that. So I wonder if you could start by explaining what tragic design means to you. [00:00:33] Jonathan: Hmm. For me, it really started with this story that we have in the beginning of the book. It's also online. Uh, I originally wrote it as a Medium article and that's really what opened my eyes to, Hey, you know, design is this kind of invisible world all around us that we actually depend on very critically in some cases. And so this story was about a girl, you know, a nameless girl, but we named her Jenny for the story. 
And in short, she came for treatment of cancer at the hospital, uh, was given the medication and the nurses that were taking care of her were so distracted with the software they were using to chart, make orders, things like that, that they miss the fact that she needed hydration and that she wasn't getting it. And then because of that, she passed away. And I still remember that feeling of just kind of outrage. And, you know, when we hear a lot of news stories, A lot of them are outraging. they, they touch us, but some of them, some of those feelings stay and they stick with you. And for me, that stuck with me, I just couldn't let it go because I think a lot of your listeners will relate to this. Like we get into technology because we really care about the potential of technology. What could it do? What are all the awesome things that could do, but we come at a problem and we think of all the ways it could be solved with technology and here it was doing the exact opposite. It was causing problems. It was causing harm and the design of that, or, you know, the way that was built or whatever it was failing Jenny, it was failing the nurses too, right? Like a lot of times we blame that end user and, and it caused it. So to me, that story was so tragic. Something that deeply saddened me and was regrettable and cut short someone's uh, you know, life and that's the definition of tragic, and there's a lot of other examples with varying degrees of tragic, but, um, you know, as we look at the impact technology has, and then the impact we have in creating those technologies that have such large impacts, we have a responsibility to, to really look into that and make sure we're doing as best of job as we can and avoid those as much as possible. Because the biggest thing I learned in researching all these stories was, Hey, these aren't bad people. These aren't, you know, people who are clueless and making these, you know, terrible mistakes. 
They're me, they're you, they're they're people. Um, just like you and I, that could make the same mistakes. [00:03:14] Jeremy: I think it's pretty clear to our audience where there was a loss of life, someone, someone died and that's, that's clearly tragic. Right? So I think a lot of things in the healthcare field, if there's a real negative outcome, whether it's death or severe harm, we can clearly see that as tragic. and I, I know in your book you talk about a lot of other types of, I guess negative things that software can cause. So I wonder if you could, explain a little bit about now past the death and the severe injury. What's tragic to you. [00:03:58] Jonathan: Yeah. still in that line of like of injury and death, And, you know, the side that most of us will actually, um, impact, our work day-to-day is also physical harm. Like, creating this software in a car. I think that's a fairly common one, but also, ergonomics, right? Like when we bring it back to something like less impactful, but still like multiplied over the impact of, multiplied over the impact of a product rather, it can be quite, quite big, right? Like if we're designing software in a way that's very repetitive or, you know, everyone's, everyone's got that, that like scroll, thumb, scroll, you know, issue. Right. if, uh, our phones aren't designed well, so there's a lot of ways that it can still physically impact you ergonomically. And that can cause you a lot of problem arthritis and pain, but yeah, there's, there's other, there's other, other ways that are still really impactful. So the other one is by saddening or angry. You know, that emotional harm is very real. And oftentimes sometimes it gets overlooked a little bit because it's, um, you know, physical harm is what is so real to us, but sometimes emotional harm isn't. 
But, you know, we talk about in the book the example of Facebook putting together this great feature, which takes your most liked photo and, you know, celebrates your whole year by saying, Hey, look at your year in review, this is the top photo from the year. They add some great, you know, well done illustrations behind it, of, of balloons and confetti and people dancing. But some people had a bad year. Some people's most liked, engaged photo is because something bad happened, and they totally missed that. And because of that, people had a really bad time with this where, you know, they lost their child that year. They lost their loved one that year, their house burnt down. Um, something really bad happened to them. And here was Facebook putting that photo of their, of their dead child up with, you know, balloons and confetti and people dancing around it. And that was really hard for people. They didn't want to be reminded of that, and especially in that way. And these emotional harms also come into play with anger. You know, we talk about, well, one, you know, there's, there's a lot of software out there that, that, um, tries to bring up news stories that anger us, which equals engagement. Um, but also ones that, um, use dark patterns to trick us into purchasing and buying and forgetting about that free trial, so they charge us for a yearly subscription and won't refund us. Uh, if you've ever tried to cancel a subscription, you start to see their real colors. Um, so emotional harm and, uh, anger is a, is a big one. We also talk about injustice in the book, where there are products that are supposed to be providing justice. Um, and you know, in very real ways like voting or, you know, getting people the help that they need from the government, or, uh, for people to see their loved ones in jail. 
Um, or, you know, you're getting a ticket unfairly because you couldn't read the sign, you were trying to read the sign and you, and you couldn't understand it. So yeah, we look at a lot of different ways that design and the software that we create can have very real impact on people's lives and in a negative way, if we're not careful. [00:07:25] Jeremy: the impression I get, when you talk about tragic design, it's really about anything that could harm a person, whether physically, emotionally, you know, make them angry, make them sad. And I think the, the most liked photo example is a great one, because like you said, I think the people may be building something that, that harms and they may have no idea that they're doing it. [00:07:53] Jonathan: Exactly like that. I love that story because not, not to just jump on the bandwagon of saying bad things about like Facebook or something. No, I love that story because I can see myself designing the exact same thing, like being a part of that product, you know, building it, you know, looking at the, uh, the, the specifications that the, um, the, the PM, you know, put together and the decks that we had, you know, like I could totally see that happening. And just never, I think, never having the thought, because we're so focused on like delighting our users and, you know, we have these metrics and these things in mind. So that's why, like, in the book, we really talk about a few different processes that need to be part of the product development cycle to stop, pause, and think about like, well, what are the, what are the negative aspects here? Like what are the things that could go wrong? What are the, what are the other life experiences that are negative, um, that could be a part of this? And you don't need to be a genius to think of every single thing out there. 
You know, like in this example, I think just talking about, you know, like, oh, well, some people might've had, you know, if they would have taken probably like, you know, one hour out of their entire project, or maybe even 10 minutes, they might've come up with like, oh, there could be bad thing. Right. But, um, so if you don't have that, that, that moment to pause that moment to just say, okay, we have time to brainstorm together about like how this could go wrong or how, you know, the negative of life could be impacted by this, um, feature that that's all that it takes. It doesn't necessarily mean that you need to do. You know, giant study around the impact, potential impact of this product and all the, all the ways, but really just having a part of your process that takes a moment to think about that will just create a better product and better, product outcomes. You know, if you think about all of life's experiences and Facebook can say, Hey, condolences, and like, you know, and show that thoughtfulness that would be, uh, I would have that have higher engagement that would have higher, uh, satisfaction, right? So they could have created a better outcome by considering these things and obviously avoid the impact negative impact to users and the negative impact to their product. [00:10:12] Jeremy: continuing on with that thought you're a senior interaction designer and you're an accessibility program lead. And so I wonder on the projects that you work on, and maybe you can give us a specific example, but how are you ensuring that you're, you're not running up against these problems where you build something that you think is going to be really great, um, for your users, but in reality ends up being harmful and specifically. [00:10:41] Jonathan: Yeah, one of the best ways is, I mean, it should be part of multiple parts of your cycle. 
If, if you want something, if you want a specific outcome out of your product development life cycle, um, it needs to be from the very beginning and then a few more times, so that it's not, you know, uh, I think, uh, programmers, uh, will all latch onto this, where they have the worst end of the stick, right? Because a and Q and QA as well. Because, you know, any bad decision or assumption that's happened early on with, you know, the, the business team or, or the PM, you know, gets like multiplied when they talk to the designer and then gets multiplied again, they hand it off. And it's always the engineer who has to, has to put the final foot down, be like, this doesn't make sense. Or I think users are going to react this way, or, you know, this is the implication of that, that assumption. So, um, it's the same thing, you know, in our team, we have it in the very early stage when someone's putting together the idea for the feature, our project, we want to work on it's right there. There's a few, there's like a section about accessibility and a few other sections, uh, talking about like looking out for this negative impact. So right away, we can have a discussion about it when we're talking about like what we should do about this and the D and the different, implications of implementing it. That's the perfect place for it. You know, like maybe, maybe when you're a brainstorm. Uh, about like, what should we should do? Maybe it's not okay there because you're trying to be creative. Right. You're trying to think. But at the very next step, when you're saying, okay, like what would it mean to build this that's exactly where I should start showing up and, you know, the discussion from the team. And it depends also the, the risk involved, right? Like, uh, it depends, which is attached to how much, uh, time and effort and resources you should put towards avoiding that risk it's risk management. 
So, you know, if you work, um, like my, um, you know, colleagues, uh, or, you know, some of my friends were working in the automotive industry and you're creating a software and you're worried that it might be distracting. There might be a lot more time and effort or the healthcare industry. Um, those were, those are, those might need to take a lot more resources, but if you're a, maybe a building, um, you know, SaaS software for engineers to spin up, you know, they're, um, you know resources. Um, there might be a different amount of resources. It never is zero, uh, because you still have, are dealing with people and you'll impact them. And, you know, maybe, you know, that service goes down and that was a healthcare service that went down because of your, you know, so you really have to think about what the risk is. And then you can map that back to how much time and effort you need to be spending on getting that. Right. And accessibility is one of those things too, where a lot of people think that it takes a lot of effort, a lot of resources to be accessible. And it really isn't. It just, um, it's just like tech debt, you know, if, if you have ignored your tech debt for, you know, five years, and then they're saying, Hey, let's all fix all the tech debt. Yeah. Nobody's going to be on board for that as much. Versus like, if, if addressing that and finding the right level of tech debt that you're okay with and when you address it and how, um, because, and just better practice. That's the same thing with accessibility is like, if you're just building it correctly, as you go, it's, it's very low effort and it just creates a better product, better decisions. Um, and it is totally worth the increased amount of people who can use it and the improved quality for all users. So, um, yeah, it's just kind of like a win-win situation. [00:14:26] Jeremy: one of the things you mentioned was that this should all start. 
At the very beginning or at least right after you've decided on what kind of product you're going to build, and that's going to make it much easier than if you come in later and try to, make fixes then, I wonder when you're all getting together and you're trying to come up with these scenarios, trying to figure out negative impacts, what kind of accessibility, needs you need to have, who are the people who are involved in that conversation? Like, um, you know, you have a team of 50 people who needs to be in the room from the very beginning to start working this out. [00:15:05] Jonathan: I think it would be the same people who are there for the project planning, like, um, at, on my team, we have our eng counter counterparts there. at least the team lead, if, if, if there's a lot of them, but you know, if they would go to the project kickoff, uh, they should be there. you know, we, we have everybody in their PM, design, engineers, um, our project manager, like anyone who wants to contribute, uh, should really be there because the more minds you have with this the better, and you'll, you'll tease out much, much more of, of of all the potential problems because you have a more, more, um, diverse set of brains and life experiences to draw from. And so you'll, you'll get closer to that 80% mark, uh, that you can just quickly take off a lot of those big items off the table, right? [00:16:00] Jeremy: Is there any kind of formal process you follow or is it more just, people are thinking of ideas, putting them out there and just having a conversation. [00:16:11] Jonathan: Yeah, again, it depends which industry you're in, what the risk is. So I previously worked at a healthcare industry, um, and for us to make sure that we get that right, and how it's going to impact the patients, especially though is cancer care. And they were using our product to get early warnings of adverse effects. 
Our system of figuring out, you know, if that was going to be an issue was more formalized. In some cases, like healthcare, and especially if it's a device, or in certain software circumstances where it's determined by the FDA to be a certain category, you literally have a governmental version of this. So the only reason that's there is because it can prevent a lot of harm, right? So that one is enforced, but there are reasons outside of the FDA to have that exact formalized part of your process. And the size of it should scale depending on what the risk is. So on my team, the risk is actually somewhat low. It's really just part of the planning process. We do have moments when we're brainstorming what we should do and how the feature will actually work, where we talk about what those risks are and call out the accessibility issues. And then we address those. And then as we get ready to ship, we have another formalized part of the process, where we check if the accessibility has been taken care of and, you know, if everything makes sense as far as impact to users. So we have those places, but in healthcare it was much stronger, where we had to make sure that we've tested it, it's robust, we think it's going to work. You know, we do user testing, it has to pass that user testing, things like that, before we're able to ship it to the end user. [00:18:12] Jeremy: So in healthcare, you said that the FDA actually provides, is it like a checklist of things to follow, where you must have done this as you're testing and you must have verified these things? That's actually given to you by the government? [00:18:26] Jonathan: That's right. Yeah. It's like a checklist and the testing requirement. And there are also levels there.
So I've only done the lowest level. I know there's, I think, like two more levels above that. And again, that's because the risk is higher and higher and there are stricter requirements there, where maybe somebody in the FDA needs to review it at some point. So again, mapping it back to the risk that your company has is really important. Understanding that is going to help you avoid, you know, the bad impact and build a better product. And I think that's one of the things I would like to focus on as well. And I'd like to highlight for your listeners that it's not just about avoiding tragic design, because one thing I've discovered since writing the book and sharing it with a lot of people is that the exact opposite thing, in a vast majority of the cases, ends up being a strategically great thing to pursue for the product and the company. You know, if you think about that example with Facebook, okay, you've run into a problem that you want to avoid. But if you actually do a 180 there and you find ways to engage with people when they're grieving, you find ways to develop features that help people who are grieving, you've created a value to your users that you can help build the company off of. Right? Because they were already building a bunch of joy features. Right. You know, and also user privacy. We see Apple doing that really well, where they say, okay, we are going to do our ML on device. We are going to let users decide on every permission and things like that. And that is a strategy. We also see that with something like T-Mobile. When they initially started out, they were like one of the nobody telecoms in the world. And they said, okay, what are all the unethical bad things that our competitors are doing?
They're charging extra fees, you know, they have these weird data caps that are really confusing and don't make any sense, and their contracts you get locked into for many years. They just did the exact opposite of that. And that became their business strategy and it worked for them. Now they're like the top company. So I think there's a lot of things like that, where you just look at the exact opposite and, one, you get to avoid the bad, tragic design, but you also see, boom, an opportunity that can become a business strategy. [00:21:03] Jeremy: So when you referred to exact opposite, I guess you're looking for the potentially negative outcomes that could happen. There was the Facebook example of seeing a photo or being reminded of a really sad event, and figuring out, can I build a product around still having that same picture, but recontextualizing it, like showing you that picture in a way that's not going to make you sad or upset, but is actually a positive. [00:21:35] Jonathan: Yeah. I mean, I don't know maybe what the solution was, but one example that comes to mind is some companies now, before Mother's Day, will send you an email and say, Hey, this is coming up. Do you want us to send you emails about Mother's Day? Because for some people that can be very painful. That's very thoughtful. Right. And that's a great way to show that you care. But yeah, thinking about that Facebook example, if there's a formalized way to engage with grieving, I would use Facebook for that. I don't use Facebook very often or almost at all, but, you know, if somebody passed away, I would engage right with my Facebook account.
And I would say, okay, look, there's this whole formalized feature around it, and Facebook understands grieving and Facebook understands this event and may smooth that process, you know, create comfort for the community. That's value and engagement that is worthwhile, versus artificial engagement that's for the sake of engagement. And that would create a better feeling towards Facebook. I would maybe then spend more time on Facebook. So it's in their mutual interest to do it the right way. And so it's great to focus on these things to avoid harm, but also to start to see new opportunities for innovation. And we see this a lot already in accessibility, where there are so many innovations that have come from just fixing accessibility issues, like closed captions. We all use them: on our TVs, in busy crowded spaces, on, you know, videos that have no translation for us in different places. SEO is the same thing. You get a lot of SEO benefit from describing your images and making everything semantic and things like that. And that also helps screen readers. And different innovations have come because somebody wanted to solve an accessibility need. And then the one I love, I think it's the most common one, is readability, like contrast and text size. Sure, there are some people who won't be able to read it at all, but it hurts my eyes to read bad contrast and bad text size. And so it just benefits everyone and creates a better design. And one of the things that comes up so often, you know, I'm the accessibility program lead, and so I see a lot of our bugs, is so many issues that are caught because of our audits and our test cases around accessibility that are just bad design and a bad experience for everyone. And so we're able to fix that.
And it's just another driver of innovation. There's a ton of accessibility examples, and I think there's also a ton of these other ethical examples, or, you know, avoiding harm, where you can just see it's an opportunity area, where it's like, oh, let's avoid that. But then if you turn around, you can see that there's a big opportunity to create a business strategy out of it. [00:24:37] Jeremy: Can you think of any specific examples where you've seen that? Where somebody, you know, doesn't treat it as something to avoid, but actually sees that as an opportunity. [00:24:47] Jonathan: Yeah. I mean, I think that the Apple example is a really good one, where from the beginning they saw, okay, in the market there's a lot of abuse of information and people don't like that. So they created a business strategy around that, and that's become a big differentiator for them. Right. Like they have ML on the device. They have a lot of these permission settings. You know, Facebook was very much focused on using customer data, and a lot of it without really asking their permission. And so once Apple said, okay, now all apps need to show what you're tracking and ask for permission to do that, a lot of people said no, and that caused about $10 billion of loss for Facebook. And for Apple, you know, they advertise on that now, that we're ethical, that we source things ethically and we care about user privacy, and that's a strong position, right? I think there's a lot of other examples out there, like I mentioned accessibility and others, but they're kind of overflowing, so it's hard to pick one. [00:25:58] Jeremy: Yeah.
And I think what's interesting about that too is, with the example of focusing on user privacy or trying to be more sensitive around death or things like that, I think that other people in the industry will notice that, and then in their own products they may start to incorporate those things as well. [00:26:18] Jonathan: Yeah. Yeah, exactly like the example with T-Mobile. Once that worked really, really well and they just ate up the entire market, all the other companies followed suit, right? Like now, having those data caps is very rare, having those surprise fees is a lot rarer. You know, there are no more deep contracts that lock you in, et cetera, et cetera. A lot of those have become industry standard now. And so it does improve the environment for everyone, because now it becomes a competitive advantage that everybody needs to meet. So yeah, I think that's really, really important. So when you're going through your product's life cycle, you might not have the ability to make these big strategic decisions, like, you know, we want to not have data caps or whatever. But if you're on that Facebook level and you run into that issue, you could say, well, look, what could we do to address this? What could we do to help this and make that a robust feature? You know, when we talk about a lot of these dating apps, one of the problems was a lot of abuse, where women were being harassed or, you know, after the date didn't go well, things were happening. And so a lot of these dating apps have differentiated themselves and attracted a lot of that market because they deal with that really well. And, you know, it's built into the strategy. It's oftentimes a really good place to start too, because, one, it's not something we generally think about very well, which means your competitors.
Haven't thought about it very well, which means it's a great place to build product ideas off of. [00:27:57] Jeremy: Yeah, that's a good point, because I think so many applications now are like social media applications, they're messaging applications, they're video chat, that sort of thing. I think when those applications were first built, they didn't really think so much about what if someone is, you know, sending hateful messages or sending pictures that people really don't want to see, or people are doing abusive things. It was like they just assumed that, oh, people will be good to each other and it'll be fine. But, you know, in the last 10 years, pretty much all of the major social media companies have tried to figure out, okay, what do I do if someone is being abusive, and what's the process for that? And basically they all have to do something now. [00:28:47] Jonathan: Yeah. And that's a hard thing. If that unethical or bad design decision is deep within your business strategy and your company's strategy, it's hard to undo that. Some companies still have to do that very suddenly and deal with it. Right. Like, I know Uber had a big part of that, and some other companies, where almost suddenly everything will come to a head and they'll need to deal with it. Or, you know, like Twitter now, trying to be acquired by Elon Musk, some of those things are coming to light. But what I find really interesting is that these areas are really ripe for innovation.
So if you're interested in a startup idea, or you're working in a startup, or you're about to start one, you know, there's maybe a lot of people out there who are thinking about side projects right now, this is a great way to differentiate and win that market against other well-established competitors: to say, okay, well, what are they doing right now that is unethical and is, you know, core to their business strategy? And doing that differently is really what will help you to win that market. And we see that happening all the time, you know, especially with the ones that are these established leaders in the market. They can't pivot like you can. So being able to say, we're going to do this ethically, we're going to do this with, you know, these tragic designs in mind and do the opposite, that's going to help you to find your traction in the market. [00:30:25] Jeremy: Earlier, we were talking about how in the medical field there is specific regulation, or at least requirements, to try and avoid this kind of tragic design. I noticed you also worked for Intuit before. So for financial services, I was wondering if there was anything similar, where the government is stepping in and saying, you need to make sure that these things happen to avoid these harmful things that can come up. [00:30:54] Jonathan: Yeah, I don't know. I mean, I didn't work on TurboTax. I worked on QuickBooks, which is accounting software for small businesses. And I was surprised, we didn't have a lot of those robust things. We just relied on user feedback to tell us things were not going well. And, you know, I think we should have. I think that was a missed opportunity to show your users that you understand them and you care, and to find those opportunity areas. So we didn't have enough of that.
And there were things that we shipped that didn't work correctly right out of the box, which, you know, happens, but had a negative impact on users. So it's like, okay, well, what do we do about that? How do we fix that? And the more you formalize that and make it part of your process, the more you get out of it. And actually this is a good pausing point that I think will affect a lot of engineers listening to this. So if you remember, in the book we talk about the Ford Pinto story, and I want to talk about this story and why I added it to the book. It's that, one, I think this is the thing that engineers deal with the most, and designers do too, which is that, okay, we see the problem, but we don't think it's worth fixing. Okay. So that's what we're going to dig into here. So hold on for a second while I explain some history about this car. So the Ford Pinto, if you're not familiar, is notorious because it was designed and built and shipped, and they knowingly had this problem where if it was rear-ended at even a pretty low speed, it would burst into flames because the gas tank would rupture. And then oftentimes the doors would get jammed. And so it became a death trap of fire and caused many deaths, a lot of injuries. And, in an interview with the CEO at the time, it almost destroyed Ford, like very seriously would have brought the whole company down. And during the design of it, and design meaning in the engineering sense, the engineering design of it, they found this problem and the engineers came up with their best solution, which was this rubber block. And the cost was, I forget how many dollars, let's say it was like $9, let's say $6, but this is again back then.
And also the margin on these cars was very, very thin, and it was very important to have the lowest price in the market to win those markets. The customers were very price sensitive. So they, they being like the legal team, looked at some recent cases where they had the value of life and started to come up with, here's how many people would sue us and here's how much it would cost to settle all those. And then here's how much it would cost to add this to all the cars. And they found it was cheaper for them to just go with the lawsuits. And I think why this is so important is because of the two things that happened afterward. One, they were wrong. It affected a lot more people and the lawsuits were for a lot more money. And two, after all this was going crazy and it was about to destroy the company, they went back to the drawing board, and what did the engineers find? They found a cheaper solution. They were able to rework that rubber block and get it under the margin and be able to hit the mark that they wanted to. And there's a lot of focus on the first part because it's so unethical to put a value on life and do that calculation and be like, we're willing to have people die. In some industries it's really hard to get away with that, but it's also very easy to get into that. It's very easy to get lulled into this sense of, oh, we're just going to crunch the numbers and see how many users it affects, and we're okay with that. Versus when you have principles and you have kind of a hard line and you care a lot more than you should, and you really push yourself to create a more ethical, safer design, you know, avoiding tragic design, then there's a solution out there.
Like you actually get to innovation, you actually get to solving the problem, versus when you just rely on, oh, you know, the cost-benefit analysis we did says that it's going to take an engineer a month to fix this and blah blah blah. But if you have those values, if you have those principles and you're like, you know what, we're not okay shipping this, then you'll find that, okay, there's a cheaper way to fix this. There's another way we could address this. And that happens so often. And I know a lot of engineers deal with that a lot, saying like, oh, you know, this is not worth our time to fix. And that's why you need those principles, because oftentimes you don't see it, but it's right there, right outside of the edge of your vision. [00:36:12] Jeremy: Yeah. I mean, with the Pinto example, I'm just picturing, you know, obviously there wasn't JIRA back then, but you can imagine that somebody's having an issue that, Hey, when somebody hits the back of the car, it's going to catch on fire. And going like, well, how do I prioritize that? Right? Like, is this a medium ticket? Is this a high ticket? And it just seems insane, right? That you could make the decision like, oh no, this isn't that big an issue. You know, we can move it down to low priority and ship it. [00:36:45] Jonathan: Yeah. And that's really what principles do for you, right? They help you make the tough decisions. You don't need a principle for an easy one. And that's why I really encourage people in the book to come together as a team and come up with what your guiding principles are. And that way it's not a discussion point every single time. It's like, Hey, we've agreed that this is something that we're going to care about. This is something that we are going to stop and fix.
Like, one of the things I really like about my team at Google is product excellence is very important to us. And there are certain things that we're okay with letting slip and fixing at a next iteration. And, you know, obviously we make sure we actually do that. So it's not like we always address everything, but because it's one of our principles, we care more. We take on more of those tickets and we take on more of those things and make sure that they're fixed before we ship. And it shows to the end user that this company cares and they have quality. So you need a principle to kind of guide you through those difficult things that aren't obvious on a decision-to-decision basis, but, you know, strategically get you somewhere important. You know, like design debt or technical debt, where it's like, this should be optimized, this chunk of code? Like, nah. But grouped together with a hundred of those decisions, yeah, it's gonna slow down every single project from here on out. So that's why you need those principles. [00:38:24] Jeremy: So in the book, there are a few examples of software in healthcare. And when you think about principles, you would think generally everybody on the team would be on board that we want to give whatever patient that's involved good care. We want them to be healthy. We don't want them to be harmed. And given that, I'm wondering, because you interviewed multiple people in the book, you have a few different case studies, why do you think that medical software in particular seems to have such poor UX or has so many issues? [00:39:08] Jonathan: Yeah, that's a complicated topic. I would summarize it with a few, maybe three different reasons.
One, which I think is maybe a driving factor of some of the other ones, is that the way that the medical industry works is the person who purchases the software is not the end user. So it's not like you have doctors and nurses voting on which software to use. And so oftentimes it's more of a sales deal and then it just gets pushed out. And they also have to commit to these things. The software is very expensive and, you know, in the early days it was very much like it needs to be installed, maintained, there has to be training. So there was a lot of money to be made in that software. And so the investment from the hospital was a lot, so they can't just be like, oh, actually, we don't like this one, we're going to switch to the next one. So once it's sold, it's really easy to just keep that customer. There's very little incentive to really improve it unless you're selling them a new feature. So there's a lot of feature add-ons, because they can charge more for those, but improving the experience and all that kind of stuff, there is less of that. I think also there's just generally a lot less understanding of design in that field. And because there are sort of traditions of things, they end up putting a lot of the pressure and that responsibility on the end individuals. So, you know, you've heard recently of that nurse who made a medication error and she's going to jail for that. And oftentimes we blame that end person. So the nurse gets all the blame or the doctor gets all the blame. Well, what about the software, you know, who made that confusing? Or what about the medication that looks exactly like this other medication? Or what about the pump tool where you have to, you know, type everything in very specifically, and the nurses are very busy?
They're doing a lot of work. There are 12 hour shifts. They're dealing with lots of different patients, a lot of changing things, for them to have to worry about having to type something a specific way. And yet when those problems happen, what do they do? They don't go and redesign the devices. They add more training, more training, more training, more training, and people can only absorb so much training. And so I think that's part of the problem: there's no desire to change, and they blame the wrong person. Lastly, I think that it is starting to change. I think we're starting to see, because of the fact that the government is pushing healthcare records to be more interoperable, meaning I can take my health records anywhere, that a lot of the power comes in where the data is. And so I'm hoping that, you know, as the government and people and initiatives push these big companies like Epic to be more open, things will improve. One reason is that they'll have to keep up with their competitors, and more competitors will be out there to improve things. Because I think that the know-how is out there, but because there's no incentive to change, and there's no turnover in systems, and there's the blaming of the end user, we're not going to see a change anytime soon. [00:42:35] Jeremy: That's a good point, in terms of, it seems like even though you have all these people who may have good ideas, who may want to do a startup, if you've got all these hospitals that are already locked into this very expensive system, then yeah, where's the room to kind of get in there and have that change? [00:42:54] Jonathan: Yeah. [00:42:56] Jeremy: Another thing that you talk about in the book is how, when you're in a crisis situation, the way that a user interacts with something is very different.
And I wonder if you have any specific examples for software where that can happen. [00:43:15] Jonathan: Yeah. Designing for crisis is a very important part of every software because it might be hard for you to imagine being in that situation, but it definitely will still happen. So one example that comes to mind is, you know, let's say you're working on cloud software, like AWS or Google Cloud. Right. There are definitely use cases and user journeys in your product where somebody would be very panicked. Right. And if you've ever been on call with something and it goes south, and it's a big deal, you don't think right. Right? Like when we're in crisis, our brains go into a totally different mode, that fight or flight mode. And we don't think the way we normally do. It's really hard to read and comprehend. And we might not make the right decisions and things like that. So, you know, thinking about that, going back to the cloud software, let's say you're working on that. Are you relying on the user reading a bunch of text about this button, or is it very clear from the way you've crafted that exact button copy, and how big it is, and where it is in relation to a bunch of other content, what exactly it does? It's going to shut down the instance, or it's gonna do it at a delay or whatever. All those little decisions are really impactful. And when you run them through the furnace of a user journey that's relying on a really urgent situation, you'll obviously help that. And you'll start to see problems in your UI that you hadn't noticed before, or different problems in the way you're implementing things that you didn't notice before, because you're seeing it from a different way.
And that's one of the great things about the systems in the book that we talk about, around thinking about how things could go wrong, or, you know, designing for crisis: it makes you think of some new use cases, which makes you think of some new ways to improve your product. You know, that improvement you make to make it so obvious that someone could do it in a crisis would help everyone, even when they're not in a crisis. So that's why it's important to focus on those things. [00:45:30] Jeremy: And for someone who is working on these products, it's kind of hard to trigger that feeling of crisis if there isn't actually a crisis happening. So I wonder if you can talk a little bit about how you try to design for that when it's not really happening to you. You're just trying to imagine what it would feel like. [00:45:53] Jonathan: Yeah. You're never really going to be able to do that, so some of it has to be simulated. One of the ways that we are able to do that is to sort of simulate what we call cognitive load, which is one of the things that happens during a crisis, but which also happens when someone's very distracted. They might be using your product while they're multitasking. They have a bunch of kids, a toddler constantly pulling on their arm, and they're trying to get something done in your app. So one of the ways that has been shown to help test that is the foot tapping method. So when you're doing user research, you have the user doing something else, like tapping. You know, you make it so that they have a second task that they're doing on the side. It's manageable, like tapping their feet and their hands or something. And then they also have to do your task. So you can build up what those taps, those extra things are that they have to do while they're also working on finishing the task you've given them.
And that's one way to sort of simulate cognitive load. Some of the other things are really just, you know, listening to users' stories and finding out, okay, this user was in crisis. Okay, great. Let's talk to them and interview them about that, if it was fairly recent, within the past six months or something like that. But sometimes you don't have that. You just have to run through it and do your best. And, you know, there are those black swan events. Or even if you were able to simulate it yourself, put yourself into that exact position and be in a panic, which, you know, you're not able to, but if you were, that still would only be your experience, and you wouldn't know all the different ways that people could experience this. So there's going to be some point in time where you're gonna need to extrapolate a little bit, you know, extrapolate from what you know to be true, but also from user testing and things like that. And then wait for real data. [00:47:48] Jeremy: You have a chapter in the book on design that angers, and there were a lot of examples in there of things that are just annoying or, you know, make you upset while you're using software. I wonder, for our audience, if you could share just a few of your favorites or the ones that really stand out. [00:48:08] Jonathan: My favorite one is Clippy because, you know, I remember growing up, writing documents, and Clippy popping up. And obviously, just like everybody else, I hated it. You know, as a little character it was fun, but when you're actually trying to get some work done, it was very annoying. And then I remember a while later reading this article about how much work the teams put into Clippy.
Like, I mean, if you think about it now, it had a lot of, like, the AI that we're playing with just now, around natural language processing, understanding what type of thing you're writing and coming up with contextualized responses. It was very advanced for the time, you know, adding animation triggers to that and all that. And they had done a lot of user research. I was like, what? You did research and you had that reaction? And I love that example because, oh, and also, by the way, I love how they took Clippy out and highlighted that as one of the features of the next version of the Office software. But I love that example, again, because I see myself in that. You have a team doing something technologically amazing, doing user research, and putting out a very great product, but totally missing. And a lot of products do that. A lot of teams do that. And why is that? It's because they're putting the business needs or the team's needs first and they're putting the user's needs second. And whenever we do that, whenever we put ourselves first, we become a jerk, right? Like if you're in a relationship and you're always putting yourself first, that relationship is not going to last long or it's not going to go very well. And yet we do that with our relationship with users, where we're constantly just like, hey, well, what does the business want? The business wants users to not cancel here, so let's make it very difficult for people to cancel. And that's a great way to lose customers. That's a great way to create this dissonance with your users. And so you focus on, like, this is what we need to accomplish with the users, and then you work backwards from that.
You lower your chances of missing it, of getting it wrong, of angering your users. And always think about, like, you sometimes have to be very real with yourselves and your team. And I think that's really hard for a lot of teams because we don't want to look bad. But what I've found is those are the people who actually get promoted. You know, if you look at the managers and directors and stuff, those are the people who can be brutally honest, right? Who can say, like, I don't think this is ready. I don't think this is good. And, you know, I've done that in front of our CEO and things like that, and I've always had really good responses from them, saying, like, we really appreciate that you can call that out and just call it like it is. Like, hey, this is what we see with this user. Maybe we shouldn't do this at all. And, you know, at Google that's one of the criteria that we have for our software engineers and designers: being able to spot things that we should stop doing. And so I think that's really important for the development of a senior engineer, to be able to say, hey, this project, I would want it to work, but in its current form it's not good. And being able to call that out is very important.
[00:51:55] Jeremy: Do you have any specific examples where there was something that was very obvious to you, but to the rest of the team or to a lot of other people it wasn't?
[00:52:06] Jonathan: Yeah, so here's an example. I was early on in my career and I finally got to lead a whole project. We were redesigning our business microsite, and I got assigned two engineers and another designer, and I got to lead the whole thing. I was like, this is my chance, right?
So we had a very short timeline as well, and I put together all these designs. And one of the things that we aligned on at the time was being, like, really cool. So I put together this really cool design for the contact form, where you have, essentially, kind of like Mad Libs; it looks like a letter. And, by the way, give me a little bit of leeway here, because this was like 10 years ago. But it was like a letter, and you would, like, address it to our company. And so it had all the things we wanted to get out of you, like your company size, your team, so our sales team could then reach out to this customer. I designed it and I showed it to the team and everybody loved it. My manager signed off on it. All the engineers signed off on it. Even though we had a short timeline, they were like, yeah, well, we don't care. That's so cool. We're going to build it. But as I put it through that test of, like, does this make sense for what the user wants, the answer just kept coming back as no. So I had to go back in and pitch everybody and argue with them about not doing the cool idea that I wanted to do. And eventually they came around, and that form, once we launched it, performed really well. And I think about, like, what if users had to go through this really wonky thing? The whole point of the website is to get to this contact form. It should be as easy and as straightforward as possible. So I'm really glad we did that. And I can think of many, many more of those situations where, you know, we had to be brutally honest with ourselves, like, this isn't where it needs to be, or this isn't what we should be doing. And we can avoid a lot of harm that way too, where it's like, you know, I don't think this is what we should be building. Right.
[00:54:17] Jeremy: So in the case of this form, was it more like you had a bunch of drop-downs or, you know, selections, where you would say, okay, these are the types of information that I want to get from the person filling out the form, as a company. But you weren't looking at it so much as the person filling out the form, thinking, this is going to be really annoying. Was that kind
[00:54:38] Jonathan: Exactly, exactly. So their experience would have been, like, they come to the end of this page, or to, like, Contact Us, and it's a letter to our company. And we're essentially putting words in their mouth, because they're filling out the letter. And then, yeah, you have to read and then understand what that part of the page was asking you, versus a form where, you know, it's very easy, well-known, bam. You're on this page, so you're interested, so get them in there. So we were able to decide against that. And, you know, we also had to say no to a few other things, but we said yes to some things that were great, like responsive design, making sure that our website worked in every single use case, which was not a hard requirement at the time, but was really important to us and ended up helping us a lot, because we had a lot of, you know, business people who were on their phone, on the go, who wanted to check in and fill out the form and do a bunch of other stuff and learn about us. So that sales microsite did really well, because I think we made the right decisions in all those kinds of areas. And those principles helped us say no to the right things. Even though it was a really cool thing, and it probably would have looked really great in my portfolio for a while, it just wasn't the right thing to do for the goal that we had.
[00:56:00] Jeremy: So did it end up being more like just a text box? You know, a contact us fill-in. Yeah.
[00:56:06] Jonathan: You know, with usability, if someone's familiar with something and it's tired, everybody does it, but that means everybody knows how to use it. So usability constantly has that problem of innovation being less usable. And so sometimes it's worth the trade-off, because you want to attract people because of the innovation, and they'll get over that hump with you because the innovation is interesting. So sometimes it's worth it and sometimes it's not. I'd say most times it's not. And so you have to find, like, when is it time to innovate and when is it time to do what's tried and true. And on a business microsite, I think it's time to do tried and true.
[00:56:51] Jeremy: So in your research for the book and all the jobs you've worked previously, are there certain mistakes or just UX things that you've noticed that you think our audience should know about?
[00:57:08] Jonathan: I think dark patterns are one of the most common, you know, tragic design mistakes that we see, because again, you're putting the company first and the user second. And, you know, if you go to darkpatterns.org, you can see a great list. There are a few other sites that have a nice list of them, and actually Vox Media did a nice video about dark patterns as well. So it's gaining a lot of traction. But, you know, things like, if you try to cancel your, like, Comcast service or your Amazon service, it's very hard. I think I wrote this in the book, but I literally researched what's the fastest way to, you know, remove your Comcast account. I prepared everything. I did it through chat, because that was the fastest way. And, not to mention, finding chat, by the way, was very, very hard for me.
So even though I was like, okay, I'm going to do it through chat, I'm going to do all this, it took me a while to find chat. I couldn't find it. So once I finally found it, from that point to having them finally delete my account was about an hour. And I knew what to do going in, just to say all the things to have them not bother me. So that's on purpose. They've done it on purpose, because it's easier to just say, like, fine, I'll take the discount thing you're throwing in my face at the last second. And it's almost become a joke now that, you know, you have to cancel your Comcast every year so you can keep the costs down. And Amazon too, like, trying to find that "delete my account" is so buried. You know, they do that on purpose. And a lot of companies will do things like, you know, make it very easy to sign up for a free trial and hide the fact that they're going to charge you for a year, hide the fact that they're automatically going to bill you, not remind you when it's about to expire so that they can, like, surprise, get you to forget about this billing subscription. Or, you know, if you've ever gotten Adobe software, they are really bad about that. They trick you into getting this, like, monthly subscription, but actually you've committed to a year. And if you want to cancel early, they'll charge you like 80% of the year. And it's really hard to contact anybody about it. So it happens quite often. The more you read into those different patterns, the more you'll start to see them everywhere. And users are really catching on to a lot of those things and are responding to those in a very negative way. And we recently looked at a case study where this company had a free trial, and they had, like, the standard free trial kind of design.
And then their test was really just focusing on, like, hey, we're not going to scam you. If I had to summarize the entire direction of the second one, it was like, you know, cancel any time. Here's exactly how much you'll be charged, and it'll be on this date. Five days before that, we'll remind you to cancel, and all this stuff. That ended up performing about 30% better than the other one. And the reason is that people are now burned by that trick so much that every time they see a free trial, they're like, forget it. I don't want to deal with all this trickery. Like, I don't even care to try the product. Versus, like, we're not going to trick you. We really want you to actually try the product, and, you know, we'll make sure that if you don't want to move forward with this, you have plenty of time and plenty of chances to leave. And people respond to that now. So that's what we talked about earlier in the show, doing the exact opposite. This is another example of that.
[01:00:51] Jeremy: Yeah, because I think a lot of people are familiar with, like you said, trying to cancel Comcast or trying to cancel their New York Times subscription. And, you know, everybody just gets so mad at the process, but I think they also maybe assume that it's a positive for the company. But what you're saying is that maybe that's actually not in the company's best interest.
[01:01:15] Jonathan: Yeah. Oftentimes what we find with these dark patterns or these unethical decisions is that they are successful because, when you look at the most impactful, like, immediate metric you can look at, it looks like it worked, right? Let's say, for those free trials, it's like, okay, we implemented all this trickery and our subscriptions went up.
But if you look at the end result, which is farther on in the process, it's always a lot harder to track that impact. But we all know, when we talk to each other about these different examples, we know it to be true that we all hate that. And we all hate those companies and we don't want to engage with them. And sometimes we don't use the products at all. So, yeah, it's one of those things where it actually has that very real impact, but it's harder to track. And so oftentimes that's how these patterns become very pervasive: oh, page views went up, you know, this is high engagement. But it was page views because people were refreshing the page trying to figure out where the heck to go, right? So oftentimes they're less effective, but they're easier to track.
[01:02:32] Jeremy: So I think that's a good place to wrap things up, but if people want to check out the book or learn more about what you're working on, or your podcast, where should they head?
[01:02:44] Jonathan: Yeah, just check out tragicdesign.com. And our podcast you can find on any of your podcasting software. Just search Design Review Podcast.
[01:02:55] Jeremy: Jonathan, thank you so much for joining me on Software Engineering Radio.
[01:02:59] Jonathan: All right, thanks, Jeremy. Thanks, everyone. And hope you had a good time. I did.
