Software Sessions

Practical conversations about software development.

Latest episodes

Nov 4, 2020 • 1h 3min

Fixing a Broken Development Process

John Doran is the CTO of Phorest, an application for managing salons and spas.

We discuss:

- Transitioning a desktop application to a SaaS
- Struggling with outages and performance problems
- Moving away from relying on a single person for deployment
- Building a continuous integration pipeline
- Health monitoring for services
- The benefits of Docker
- Using AWS managed services like Aurora and ECS

This episode originally aired on Software Engineering Radio.

Transcript

Jeremy: [00:00:00] Today I have John Doran with me. John is the director of engineering at Phorest, a Dublin-based SaaS company that processes appointments for the hair and beauty industry. He previously worked as a technical lead at Travelport Digital, where he supported major airlines and hotel chains with their mobile platforms.

I'll be speaking with John about the early days of their business, the challenges they faced while scaling, and how they were able to reshape their processes and team to overcome these challenges. John, welcome to Software Engineering Radio.

John: [00:00:29] Hey Jeremy, thanks so much for having me.

Jeremy: [00:00:31] The first thing I'd like to discuss is the early days of Phorest. To just give the listeners a little bit of background, what type of product is Phorest?

John: [00:00:40] Sure. So Phorest is essentially, um, it's a salon software focused on the hair and beauty industry. And it didn't actually start off as that. Back in 2003, it was actually a messaging service built by a few students in Trinity College, one of whom was Ronan Perceval. Ronan is actually our current CEO. So in 2003, that messaging service was supporting, um, nightclubs, dentists, various small businesses around Dublin, and the guys were finding it really hard to get some traction in those different industries.
So Ronan actually went and worked as a hair receptionist in a salon. And what he learned from that was that through using messaging on the platform they were able to actually increase revenue for salons and actually get more money in the tills, which was a hugely powerful thing. So from there, they were able to refocus on that particular industry. They built supplementary features and a product around that messaging service.

So in 2004, it became a kind of a fully fledged appointment book. And from there they integrated that appointment book with the messaging service. So by 2006, I guess you could classify Phorest as a full salon software. So it had things like stock take, financial reporting, and staff rostering. That full salon software system was pretty popular in Ireland, and between 2006 and 2008 they became the number one in the industry in Ireland. So what that meant was, you know, the majority of salons in Ireland were running on the Phorest platform, and that was actually an on-premise system. So all the data would have been stored locally in the hair salon, and there was no backend.

Jeremy: [00:02:30] Just so I understand correctly: you say it was running on premise. It was an appointment system. So is this where somebody would come into the salon and make an appointment and they would enter it into a local computer and it would be just stored there?

John: [00:02:46] Exactly. So what Ronan figured out throughout his time working in the salon was that actually sending customers text messages to remind them about their appointments really helped cut down the no-show rates, meaning that customers did turn up for their appointments when they were due, and meaning that the staff members didn't have to sit around waiting for customers to walk in.

So as Phorest, I guess, as a company developed.
We, we moved into building extra features around that core system, which is an appointment book, which manages the day-to-day, uh, rows of a hairstylist. So we built, uh, email and marketing retention tools around that.

Okay. I guess a really important point about Phorest's history is when the recession hit in 2008 in Ireland, we, uh, moved into the UK. So as we were kind of the number one provider in Ireland, we felt that, uh, when the recession hit, we needed to move into the UK, but being on premise meant there was a lot of friction actually installing the system into the salons. So in 2011, they actually took a small seed round to build out, I guess, the cloud backend.

Once the cloud backend was built, it took about a year to get it off the ground and released. Um, and as the company kind of gained traction in the UK, they migrated all of their on-premise customers onto the cloud solution.

Jeremy: [00:04:07] I guess you would say that when it was on premise, a lot of the engineering effort or the support effort was probably in keeping the software working for your customers and just addressing technical issues or questions and things like that. And that was probably taking a lot of your time. Is that correct?

John: [00:04:25] Precisely. The team was quite small. So we had five engineers who were essentially building the cloud backend, and one engineer who was maintaining that Delphi on-premise application.

So what was happening was our CEO Ronan was actually the product owner at that time. And the guys were making pretty drastic and kind of quickfire decisions in terms of features being added to the product based on, you know, getting a certain customer in that really needed to pay the bills. And some of those decisions, uh, I guess, made the product a bit more complex, uh, as it grew, but it certainly was a big improvement from the on-premise solution.

Jeremy: [00:05:03] Hmm.
So the on-premise solution you said was written in Delphi, is that correct?

John: [00:05:08] Yeah.

Jeremy: [00:05:09] When it was first started, was it just a single developer?

John: [00:05:13] Exactly. Yeah. So it was, it was literally, uh, put together by some outsourcers and a single developer managing it. There were no real in-house developers. It was, you know, a little bit of turnover there. But when that small seed round came in, the guys put together the foundations of the cloud-based backend, which was a Java, kind of classic, uh, n-tiered application with a web socket to update the appointment screen if anything changed on the backend. And, um, you would kind of consider it a majestic monolith as such.

Jeremy: [00:05:44] When you started the cloud solution, were you maintaining both systems simultaneously?

John: [00:05:49] Yeah, so, um, for a full year, the guys were building out that backend. And at the same time, there was one guy who was literally maintaining and fixing bugs on that Delphi application.

And just to kind of give you an example, um, one of the guys who was actually working on support actually went and taught himself SQL, and he used to tunnel into the salons at nighttime to fix any database issues.

Jeremy: [00:06:17] Oh, wow.

John: [00:06:17] Yeah. So it was, you know, hardcore stuff. Um, another big thing about not being cloud-based, and one of the big reasons we needed to become cloud-based, was, you know, as people move online, it's quite common to book your cinema or something else online. Um, Ronan could see that trend coming for online bookings, uh, and we needed to be cloud-based to build out that online booking system.

And just to kind of give you an idea of the scale, like last year, we would have processed over 2 billion euros worth of transactions through the system.
So it's really, it's really growing. And, um, you know, it's huge, huge scale at the moment. But, I guess, looking back at the past, the guys would have built a great robust system getting us to that 10,000 salon mark, particularly in the UK, but that would have been the point that the guys would have started seeing some, you know, shakiness in terms of stability and the speed at which you could deliver new features.

Jeremy: [00:07:22] You were saying the initial cloud version took about a year to create?

John: [00:07:26] Exactly. Yeah.

Jeremy: [00:07:27] And you had five engineers working on it after the seed round? At that time, when you first started working on the cloud version of the application, did you have a limited rollout to kind of weed out defects? Or how did you start transferring customers over?

John: [00:07:46] So there were definitely some reluctant customers to move across. We did it, I guess, uh, gradually. There was a lot of reluctance. People were quite scared of their data not being stored in their salon. And so it was quite hard to get some of those customers across, and only two weeks ago we actually officially stopped supporting it; the final two customers have finished up. So, you know, it took us a good seven years to finish that transition.

Jeremy: [00:08:12] Uh, so it was a very gradual transition. Did you ask customers whether they wanted to move, or how did you...

John: [00:08:21] Oh, yeah. It was a huge, huge sales and team effort to get people across the line. But I would say the majority of people either would have churned or would have moved across. The more forward-thinking people, you know, they would have been getting new features and a better service.

Jeremy: [00:08:36] Right. So it was kind of more of a marketing push from your side to say, hey, if you move over to our cloud solution, you'll get all these additional capabilities.
But it was ultimately up to them to decide whether they wanted to.

John: [00:08:47] Yeah. So, um, you know, some companies, they kind of build that product with a different name and they try and sell it, but, uh, at Phorest we actually kept the UI very similar. So it wasn't very intrusive to the users. It was just kind of seen as an upgrade with, um, I guess, less friction.

Jeremy: [00:09:06] Right. Right. I want to talk a little bit about the early days. You said you spent about a year to build the MVP. Let's say after that year had passed, were you still able to iterate quickly in terms of features, or were you having any problems with performance or downtime at that point?

John: [00:09:28] So in 2012, when the cloud-based product launched, particularly in the UK, once we hit about a thousand customers, we started to see creaking issues in the backend: lots of JVM garbage collection problems, lots of, uh, database contention, and lots of outages.

So we got to a point where we were throwing hardware at the problem to make things a little bit faster. So what our problems were: we kind of relied a lot on a single person to do a lot of the deployments. It wasn't really a team effort to ship things. It was more so developer finishes code and the machine push it off. Maybe at the end of the month we would ship.

I guess the big problem was the stability. So essentially what happened was, in terms of the architecture, we were introducing caches at various levels to try and, um, cope with performance.
So, uh, a layer of caching on the client side was introduced, uh, memcached was introduced, uh, level 2 Hibernate caching, always just, you know, really focusing on fixing the immediate problem without looking at kind of the bigger picture. I mentioned that 2000 salons as a marker; I guess once we hit like 1200, the guys had to introduce, uh, the idea of silos, which was like, essentially, 1000 customers are going to be linked to a specific URL, and that URL will host the API returning back the data that they need. And then the other silo, which would service the other, you know, 200 growing to, say, a thousand businesses.

So essentially, if you think about it, you've got, I guess, a big point of failure if that server goes down. There's no load balancing between servers, and those two servers are their biggest size possible. So I guess a big red herring was the, uh, cost implications of that. You know, it was the largest, uh, instance type on Amazon at the RDS and EC2 level.

Jeremy: [00:11:31] The entire system was on a single instance for each silo?

John: [00:11:35] Yeah. So if you imagine, um, when you log in, you'll get returned a URL for a particular silo. So what would happen then would be X businesses go to that silo and Y businesses go to the other silo, and what that did was basically load balance the businesses at kind of a database level.

Jeremy: [00:11:55] You were mentioning how you had like different caching layers, for example memcached and things like that, but those were all being run on the same instance. Is that correct?

John: [00:12:05] Um, they would have been hosted by Amazon.

Jeremy: [00:12:07] Oh, okay. So those would have been, uh, Amazon's hosted services.

John: [00:12:10] So, yeah. Yeah. It's kind of like when you build that MVP or you build that initial stage, your product is kind of, you're focusing on building features. You're focusing on getting bums on seats and you.
It was that point, that 1200 to a thousand salons, where we felt that pain, that scaling pain.

Jeremy: [00:12:30] So in a way, like you said, you were doing multitenancy, but it was kind of split per thousand customers.

John: [00:12:38] Yeah, exactly. So if you imagine, if a failure happened on one of those servers, there is no fault tolerance. If the deployment goes wrong in terms of, like, uh, putting an instance in service, those thousand customers can't make purchases, their customers can't make online bookings. There's no appointments being served. You can't run transactions through the till. So, uh, it caused huge, huge friction.

Jeremy: [00:13:04] Right. Uh, what were the managed services you were using in combination with the EC2 instance?

John: [00:13:12] So, um, a really good decision at the start of the guys moving to cloud was making a big bet on Amazon in terms of utilizing them for RDS, EC2, caching. There was no deployment stack, or there was no, uh, infrastructure as code. It was all, I guess, manually done through the Amazon console, which is something that we later addressed, which we'll chat about, but it was all heavily reliant on Amazon.

Jeremy: [00:13:38] And you had mentioned that you were relying on one person to do deployment. Was that still the case at this time?

John: [00:13:46] Yeah, so up until, I guess, 2014, um, it was all reliant on one guy who literally had to bring his laptop on holidays with him and tether from cafes. If something went down, to deploy new code, he was the only guy who knew how to do it. So it was a huge pain point and bus factor.

Jeremy: [00:14:08] So it sounds like in terms of how the team was split up, there was basically, you have people working on development and you have a single person in the sort of ops role.

John: [00:14:21] Yeah.
And, uh, essentially when this kind of thing happens, when the people who write the code don't ship it, you get all sorts of problems in terms of dependencies and tangles. And, uh, you know, just knowledge silos. And also, you know, because the guys were working kind of in their own verticals, uh, different areas of the product, there was no consistency. Consistency in terms of the engineering values, how they were working, practices, procedures, you know, deployments, that sort of stuff. It was all very isolated. So, um, people did their own thing. So, uh, you could imagine, say, trying to hire someone new would be quite hard because, um, you know, for someone to come in, it would be very, very different depending on which engineer you talk to. That make sense?

Jeremy: [00:15:10] Yeah. Was this a co-located team or was this a remote team?

John: [00:15:16] Most of the guys were actually in Dublin. Um, one or two people traveled a little bit, worked remotely, and a couple of people did actually move abroad. So it was predominantly based in Dublin, but, um, some people traveled a bit.

Jeremy: [00:15:28] In terms of processes, for someone knowing how to deploy or how to work on a feature, it was mostly tribal knowledge. It's more just trying to piece together a story from talking to different people. Is that correct?

John: [00:15:42] Precisely. So, um, you had no consistency in languages or frameworks. Um, except I would say that that monolith, uh, that initial part of the platform, was extremely consistent, uh, in terms of the patterns used, uh, I guess the way it communicated with the database and, you know, how the API was built, um, was extremely strong and, uh, still is the heart of the organization.
So say, for example, there were a lot of really good, uh, integration and unit tests there, but they got abandoned for a little while and we had to bring them back to life to enable us to start moving faster again, and to give us a lot more confidence.

Jeremy: [00:16:32] Hmm. So it sounds like maybe the initial version and the first year or so had a pretty solid foundation, but then as... I'm not sure if it was the team that grew or just the rate of features. Uh, would you say that?

John: [00:16:47] I would say it was a combination of the growth of the company in terms of the number of customers on it, and the focus on delivering features. So focusing on feature development rather than thinking about scalability and being extremely aware of how...

Jeremy: How fast were you gaining customers at that time? Was this a steady increase or large spikes?

John: You're talking 30% annually. So 30% annually and a really, really low churn rate as well.

Jeremy: [00:17:17] So what would you feel was the turning point where it felt like your software had, or your business had, to fundamentally change due to the number of customers you had?

John: [00:17:28] So it was essentially those issues around stability and cost, which were unsustainable for the business: customers complaining, uh, our staff not being able to do their job. So, you know, part of Phorest's core values and mission is to help the salon owner grow their business and use the tools that we provide to do that.

And if people are firefighting, uh, and not being able to support our customers, to help them send really great marketing campaigns to boost their revenue, if we're not doing that, um, we're firefighting, the company would have been pointless. So we weren't fulfilling our mission by coping with outages and panicking all the time.
The costs, again, were unsustainable, and, you know, the team was just, I guess, uncomfortable with the state we were in. So the turning point would have been, I would say, in like 2014, when we essentially hired in some people who had more experience in, I would say, high-scalability systems, and people who cared a little bit more about quality and best practices. So when you hire three or four people like that, you kind of bring in a different way of thinking. You kind of hire these different values. You know, when you try to talk to a team and try and get these things out, they're normally quite organic. If you bring people in from maybe a similar company or from a different industry, but with similar experience, you kind of get that for free. And that's what Phorest did. So, um, basically in 2014, and since then, we've invested heavily in hiring, and hiring the right people in terms of how they operate and in terms of how they think, but also bringing that back to our values and, um, what we try to do.

Jeremy: [00:19:28] Do you think that bringing in, you know, new people, new talent is really one of the largest factors that allowed you to make such large changes, to change your culture and change your way of thinking?
John: [00:19:41] The other thing would be, I would say, the trust, um, that Ronan, our CEO, and the leadership team at Phorest have, um, and their openness to change.

Um, I think that, uh, a lot of other organizations would be quite scared of this type of change in terms of heavily investing in the product to make it better. Just from experience and talking to people, you know, it would have been very easy to not invest, uh, you know, and just leave the software ticking along with bugs and handling the downtime, but it was about the organization and their values. Their values are around really helping the salon owners and not spending that time firefighting.

Jeremy: [00:20:28] So it sounds like within two years or so of launch was when you, uh, decided to make this change.

John: [00:20:38] Yeah. So, um, you know, it's not an easy one to make because, you know, it's really hard to find talent. Um, and we were lucky to really get some great people in, and it wasn't about making radical change at the start. You know, it started from foundations. So it was things like, you know, let's get a continuous integration server going here, guys, and, you know, let's bring back all the broken tests and make sure they're running so that we can have a bit more confidence in what we ship.

We, you know, introduced code reviews and pull requests back into things, and a bit more collaboration, and getting rid of those pockets of knowledge, um, you know, reliance on individuals.

Jeremy: [00:21:21] I do want to go more into those a little bit later, but before that, when you were having performance issues or having outages before all these changes, how were you finding out? Was it being reported by users, or did you have any kind of process, um, you know, to notify you?

John: [00:21:41] So the quite common thing was basically the phones, which would light up.
Um, there was very, very little transparency of what was going on in the system. It got to a stage where we actually installed a physical red button on the support floor, which, uh, texted everyone in the engineering team.

Jeremy: [00:22:00] Oh, wow. Okay.

John: [00:22:02] Yeah.

Jeremy: [00:22:02] One of the things that we often hear is when a system has issues like this, it's difficult to free up people to fix the underlying problems, um, due to the time investment required and, as you mentioned, all the firefighting going on. How did you overcome this issue?

John: [00:22:24] So I guess, you know, beforehand it was a matter of, you know, restart the server, let's keep going with our features. But it was really about stopping to think about, um, you know, what really happened here. And, you know, maybe let's write down an incident report and gather some data about, well, what actually happened under the hood. And a few key questions could be raised from that.

You know, what are we going to do to stop this from happening again? Why didn't we know about it before the customers? And, you know, what were the steps we took to reproduce and actually fix this issue? And what are the actions that are going to happen, and how are we going to track that those actions do happen after the issue?

Jeremy: [00:23:08] Let me see if I understand this correctly. You actually did build sort of a process when you would have incidents to figure out, okay...

John: [00:23:17] That was the first step, I would say. Yeah.
So let's figure out what happened and how, and it was just about gathering data and getting information about what was really going on. So that let us identify, you know, common things that happen, where maybe usually we would just, you know, restart the server and forget, or fail over the database and forget, you know, everything's now normal and a couple of errors. But as we started gathering that data, we started to see common problems. So maybe, you know, our deployment process isn't good enough and it's error-prone, or this specific message broker isn't fault tolerant, or the IOPS in the database are too high at this time due to these queries happening.

But after we got that data, you know, uh, and we started really digging deep into the system, we realized that this isn't something that you could just take two days in your sprint to start to fix. Uh, just coming back to your question on, uh, finding that time to fix things, we kind of had to make a tough call. When we looked at everything, it was to say, you know, let's stop feature work and let's stop product work, and let's fix this properly.

Jeremy: [00:24:26] Okay. Yeah. So basically you got more information on why you were having these problems, why you were having downtime or performance issues, and started to build kind of a picture and realize that, oh, this is actually a very large problem. And then as a company, you made the decision that, okay, we're going to stop feature development to make sure we have enough resources to really tackle this problem.

John: [00:24:55] Precisely. And, um, from the product side of things, you know, this was a big, big driving factor in it. You know, we wanted to build all these amazing features to help salons to grow, but we just couldn't actually deliver on them. And we couldn't have any predictability in the way we delivered them because of that firefighting and, you know, because we were sidetracked so much.
There was no confidence in the release cycle and stability, or what we could actually deliver. So, um, yeah, it was a pretty hard decision for us to make in terms of, uh, the business, 'cause we had a lot of deliverables and commitments to customers and to, you know, our sales team. So we had to make that call.

Jeremy: [00:25:36] You were mentioning earlier about how you started to bring in a continuous integration process. Before you had done that, you also mentioned that there were tests in the system initially, but then the tests kind of went away. Could you kind of elaborate on what you meant by that?

John: [00:25:53] Yeah, so, as I said, the core system was built with a lot of integrity and a lot of good patterns. For example, uh, a lot of integration tests, uh, running against, uh, the APIs, uh, were written, and maybe were written against a specific feature, but they were never run as a full suite.

So, um, what would happen was there'd be maybe one or two flaky ones. And, um, you know, because there was no continuous integration server, it was easy enough for a developer to run specific tests for the, uh, functionality that they were building. But because the CI wasn't there, there was no full suite run. So when it came time to actually do that, we realized, you know, 70% of them are broken.

Jeremy: [00:26:40] So they were building tests as they developed, but then they were maybe not running those, uh,

John: [00:26:47] before commit or merge.

Jeremy: [00:26:49] Right. And so adding the continuous integration process, having, uh, some kind of build process, really forced people to pay attention to whether those tests were working or not.

John: [00:27:01] Exactly. Um, and just a kind of a step on from that was, you know, um, a huge delay in getting stuff to test because we relied on that one guy to build stuff.
Um, and actually that was, you know, done from a little Linux box on the engineering floor, um, which was quite temperamental. Uh, you'd be quite delayed in even just getting stuff into people's hands, and that's kind of what the core of software development is all about, right? You know, getting what you build into people's hands, and we just couldn't do it.

Jeremy: [00:27:34] Just because the process of actually doing a build and a deployment was so difficult. When you added the continuous integration process, uh, were there other benefits that you maybe didn't expect when you started bringing this in?

John: [00:27:50] So, um, I guess I mentioned the deployments is a big one. I think that people started to see real, um, benefit in terms of their workflow. I guess, along with the continuous integration, there was, uh, more discipline in terms of, uh, how we worked. So the CI server introduced a better workflow, uh, for us, and it helped us see real clarity, uh, in terms of the quality of the system, where we had coverage, where we didn't. And, um, it also helped us break up the system a little bit. So I mentioned the majestic monolith. It was actually, when we went to look at it, five application servers sitting in one repo, and the CI server and some crafty engineering helped us split that up quite well, to break the repo into multiple application servers.

Jeremy: [00:28:45] Hmm. So actually bringing in the continuous integration encouraged you to rearchitect your application in certain ways, and break it down into smaller pieces.

John: [00:28:56] Exactly. Yeah. And really it was all about confidence, um, and being able to test and then know that we weren't regressing.

Jeremy: [00:29:03] What do you think people saw in terms of the pain or the challenges from that sort of monolith setup that you think sort of inspired them to break it up?
John: [00:29:13] The big one was that a bug fix in one small area of the system meant the whole stack had to be redeployed, which took hours and hours. The other thing would have been the speed of development in terms of navigating around a pretty large code base, and the slowness of the test suite to run, which was around 25 minutes when we started and got them all green.

Jeremy: [00:29:36] The pain of running the tests and having it possible to break so many things with just one change maybe encouraged people to shrink things down, so they wouldn't have to think so much about the whole picture all the time.

John: [00:29:50] Exactly. We started to see, you know, a small fix or a small feature breaking something completely unrelated. A typical example would have been due to an HTTP connection configuration on a client, um, breaking completely unrelated areas of the system.

Jeremy: [00:30:06] Okay. One thing I'd like to talk about next is the monitoring. Uh, you mentioned earlier that it was really just phone calls that would come into support, and you even had the big red button you could press. Uh, what did you do to add monitoring to your application?

John: [00:30:25] It's pretty, uh, important to mention that, you know, we talked about making a decision to stop, to down tools, and start fixing stuff. So that's when, uh, we started, you know, looking at the monitoring and everything else, like continuous integration, bringing back tests, but a kind of key point of, uh, this evolutionary project was the monitoring.

Um, so we did a few things. So we upgraded our systems to be using New Relic to help us find errors. It was there, but, um, it wasn't being utilized in a good enough way. So we used the APM (Application Performance Management) there. We looked at CloudWatch; I mean, we introduced CloudWatch metrics to help us watch traffic, to help us see slow, uh, transactions.

Um, Logentries helped us a lot.
Uh, in terms of spotting anomalies in the logs. Pingdom was actually a really surprising, um, good addition to the monitoring. Um, it simply just calls any health check endpoint you want, and it has some nice Slack and messaging integration. That was great for us. It's helped us a lot. So we did a couple of other things like, um, some small end-to-end tests that would, um, give us a kind of a heartbeat of how the system was running. Um, and they also gave us the kind of confidence that we would know about an issue before a customer, allowing us to get rid of that red button.

Jeremy: [00:31:53] All of these are managed services that you either send logs to or that check health endpoints on your system. Did you configure them somehow to text your team or send messages to your team when certain conditions were met, or

John: [00:32:11] So we started with just like a simple Slack channel that would, uh, send us any kind of DevOps-related issues into there. And that's kind of what helped us change the culture a little bit in terms of being more aware of the infrastructure and the operations. And Pingdom was great for setting up a team with people who would get notifications for various parts of the system. And, uh, for our CloudWatch, um, alarms, we set up a little Lambda function that would just forward on, uh, any critical messages to text us.

Jeremy: [00:32:44] And before this, you said it was basically just user calls, and now you are actually shifting to kind of proactively identifying the problems. Yeah.

John: [00:32:54] Yeah, exactly. There were some really small alerts there, but nothing as comprehensive as this. We actually, um, changed some of the applications. We introduced health endpoints to all of them. So they would report on their ability to connect to the message broker, their ability to connect to a database, any dependencies that they actually needed.
We would actually check those as part of pinging that endpoint. So, if you hit any of our servers, we, any new or older ones, they would all have like a forward slash health endpoint, and that would give you back a JSON structure, uh, and give us a good insight into how healthy that component was. Jeremy: [00:33:33] Yeah. And if there was a problem and you were trying to debug issues, were you mostly able to log into these third-party services, like Logentries or New Relic, to actually track down the problem? John: [00:33:46] Yeah. So again, those services gave us that information, but it would always come back to, you know, being, if you needed to get into a server, and a big thing, which we'll talk about, is Docker. Um, we, we don't have SSH access into those servers. So we rely on those third parties to give us that information. But in the past, maybe we would have had to get in and, you know, look at the processes and take dumps, but with Logentries and New Relic, we were able to do that stuff without needing to. Jeremy: [00:34:17] So previously you might have someone actually SSH into the individual boxes and look at log files and things like that. John: [00:34:25] Exactly. So it's quite easy when you've got one server, but with that, as we'll discuss, where you've got many small containers, it's extremely complicated. Jeremy: [00:34:34] Next I'd like to talk about, uh, since you mentioned Docker, uh, how did you make the decision to move to Docker? John: [00:34:42] So it was something our CTO was really aware of and he really wanted us to explore this. The big benefits for us were that shift in mindset of one guy not being responsible for deployments, but us actually developing and using Docker in our day-to-day workflow, and the cost implications as well. The fact that, instead of having that, say, 8xlarge running one application server, we could have 12 containers running on much smaller EC2 instances.
So it was that idea of being able to, to maximize, uh, CPU and memory that was a huge, huge benefit for us that we, we, we saw. Jeremy: [00:35:24] So the, the primary driver was, was almost your AWS bill or your... John: [00:35:30] Big time. Yeah. Portable applications that, um, you know, had much less maintenance. We didn't have to go in and worry about it because we had a, I guess, a... We mentioned this earlier, like these kinds of siloed tech stacks. We didn't need to worry about a Ruby environment or PHP environment or a Java JVM install. It was just the container. And that was a hugely big and important thing for us to do and really kind of well thought out by our CTO at the time. Jeremy: [00:35:59] So, so you mentioned like Ruby containers and JVMs and things like that. Does your application actually have a bunch of different frameworks and a bunch of different languages? John: [00:36:09] Yeah. So, um, as we split out the, that monolith, uh, we also, I guess, started building smaller domain-specific, not micro, I'd say, kind of, uh, services responsible for areas of the system, uh, our online booking stack. So if you go to any of our customers, um, you know, you can book on their point of sale system in the salon, but you can also book on your phone, and we have a custom domain for every one of those salons. So it's like phorest.com/book/foundationhair. Um, if you click on that, you're going to be brought to the online booking stack, which is a, a Rails app, actually, and an Ember, Ember.js frontend. So, um, the system, as we started splitting it apart, became more and more distributed, and Docker was great for us in terms of consistency and that portability, particularly around different tech stacks. Jeremy: [00:37:01] Migrating to Docker made it easier for you to both develop and to deploy using a bunch of different tech stacks. John: [00:37:08] Exactly.
Jeremy: [00:37:09] When running through your CI pipeline, would you configure it to create an image that you would put into a private registry such as Amazon's Elastic Container Registry? John: [00:37:18] Yeah. So we made the mistake of building and hosting our own registry at the start. Uh, we quickly realized the pain in that around three, four months in, um, actually at the same time as Amazon released ECR. So I guess the main reason we did that ourselves was because we were early adopters and we pay, paid a little tax on that, but we did, uh, we moved to ECR. So our typical application kind of pipeline is, uh, build, unit tests, maybe integration, acceptance tests, build a container. And then for some of those applications, they run acceptance tests against the container running on the CI server, uh, push, push to the registry. And after it's pushed to the registry, then we would configure deployment and trigger it. Jeremy: [00:38:02] Do you have a manual process where somebody watches the pipeline go through and you make the call to push or not? Or is it an automated process? John: [00:38:13] Automated. So, um, we built a small kind of deployment framework, again, because we were early adopters of Amazon's ECS, uh, their container service. Uh, so we built a small, um, deployment stack, which allowed us to, to essentially configure new services in ECS and deploy new versions of our application through CI to our, uh, ECS clusters. So it was all automated using an infrastructure as code solution, such as CloudFormation. So when we were, in looking back at the problems in the good old days, uh, you know, we seen that one was, you know, uh, things were just configured on the AWS console. And we, we knew we needed infrastructure as code and we needed repeatability and the ability to recreate stuff. So we, we use CloudFormation, which is essentially something very similar to Terraform.
Um, and we do use Terraform for, for some of our, managing our restaurant (?) clusters and some other things. Jeremy: [00:39:16] Okay. So you maybe initially moved from having someone manually going in and creating virtual machines to more of an infrastructure as code approach. John: [00:39:27] Exactly. Yeah. Jeremy: [00:39:28] You, you had mentioned that one of the primary drivers of, of using Docker was performance. Did you start creating performance metrics so that you knew how much progress you were making on that front? John: [00:39:41] Yeah, so essentially the effort to kind of make our infrastructure more reliable, it was kind of a set of steps to get there. And we started with API-level testing to make sure that anything we changed under the hood didn't break the functionality. And we also wrote a bunch of performance tests, particularly around pulling down appointments, creating appointments and sending large, large volumes of messages. We, we knew we couldn't have any regressions there. So we use Gatling to do those performance tests. And we, we would run that from the continuous integration server and we'd do various types of soak testing to make sure we weren't, weren't taking any steps backwards. Jeremy: [00:40:23] So each time you would do a deployment, you would run performance tests to ensure that you weren't getting slower, or you weren't having any problems from the new deployment. John: [00:40:34] Yeah, I would say though, that like, uh, this kind of effort, and we called it Project Darwin internally, this effort to kind of... It had a few goals, but it was all about, you know, becoming fault-tolerant, being more scalable, reducing Amazon costs. And during Project Darwin, when we, we didn't just move our 1,200-1,500 salons, we didn't just drop them and move them to Docker. There were so many changes under the hood that these performance tests were key to giving us, uh, a pulse on how we were doing.
But, um, I guess when we were done with Project Darwin and every, everything was on Docker and, and everyone was much, much happier, um, we, we just, we run those performance tests ad hoc and as, as part of some release pipeline. Jeremy: [00:41:21] Hmm. Okay. So initially you were undergoing a lot of big changes and that was where it was really important to, uh, check the performance metrics with, with every build. John: [00:41:32] Exactly. Yeah. Jeremy: [00:41:33] Uh, what were some of the, the big challenges? Cause you mentioned you were changing a lot of things. What were some of the big challenges moving your existing architecture to Docker and to ECS? John: [00:41:47] There was a couple. The biggest, there's two huge ones. So one was state. Getting state out of those big servers was extremely hard. We needed to re, remove the level two cache because we need, because we needed to turn that one server into smaller, load-balanced containers. We needed to remove the state because we didn't want somebody on one computer terminal fetching our appointments, and then on their Phorest Go mobile app looking at different data. So we had to get rid of state. And the challenge there was that MySQL performance just wasn't good enough for us. So, um, we actually had to look really hard at migrating to Amazon Aurora, which is what we did. Again, coming back to cost, uh, Aurora is much more cost-effective, in that the system beforehand was provisioned for peak load. So we, we would have provisioned IOPS for Friday afternoon, the busiest time that the salon was using the system, and we were paying for the same amount on a Sunday night. Compare that to Aurora, where you're, you're paying for the IOPS you use, with the additional benefits of performance around how Amazon rebuilt the storage engine there. So that's the caching side of things. The other big, big challenge was the VPC.
So we needed to get all of our applications in, into a VPC to be able to use the latest instance types on Amazon, uh, and also for our application to be able to talk securely to the Aurora database. So those two are definitely the biggest challenges with the MySQL setup. Jeremy: [00:43:19] It sounded like you had to pay for your peak usage, whereas with Aurora it automatically scales up and down. Is that correct? John: [00:43:29] Um, no. You're actually charged per read and write. So that would be the big difference. Jeremy: [00:43:34] Oh I see. Per read and write. Okay. So it's just per operation. So you don't actually think about what your needs are going to be. It kind of just charges you based on how it's used. John: [00:43:45] The other new, really nice thing was, uh, looking back at our incident reports, a really common issue would have been, hey, the database has run out of storage, and Aurora does actually autoscale its storage engine. Jeremy: [00:43:56] You mentioned removing state from your servers and you, you mentioned removing the level two cache. Can you kind of explain, sort of at a high level, what that means to those who don't understand that? John: [00:44:10] Sure. So in the Java world, when you have an ORM framework like Hibernate, essentially when you query a database, the framework will store that data, um, in its level two cache. And what that means is that it doesn't need to hit the database for every query. And that's the, that was the solution for, for Phorest as we were in that MVP slash early days. But it wasn't the solution for us to become fault-tolerant. Jeremy: [00:44:39] So it would be someone makes a query to an ORM, in this case it's Hibernate, and, uh, in the server's memory, it would retrieve the results from there instead of from the database. John: [00:44:55] Yeah, exactly. Okay. And that's what I was coming back to around, um, creating an API for a list of appointments.
If you had two servers deployed with, with them using an L2 cache, you would get different state because of the cache... Jeremy: [00:45:12] Did you put a different cache in place, or did you remove the cache entirely? John: [00:45:17] So we removed that cache entirely, but we did have a REST cache, which is, was Memcached, and that's distributed. And we use cache keys based on, uh, entity versions. So that was distributed and, and worked well with multiple containers behind a load balancer. Jeremy: [00:45:34] So you removed the, the cache from the individual servers, but you do have a managed Memcached instance that you use. John: [00:45:42] Yeah, exactly. And getting rid of that level two cache, our performance tests told us that MySQL just wasn't performant enough, whereas Aurora was much better at handling those types of queries, some large joins. It was a big, big relational database. Jeremy: [00:45:58] So we, we've talked about adding in continuous integration, monitoring, performance metrics, uh, Aurora, Docker. Did any of these processes require large changes to your code base? John: [00:46:13] To be honest, not really. It was more of a plumbing things together and a lot of orchestration from a human point of view. So, um, people being aware of how all this stuff works and, uh, essentially making sure that we all knew we were on the right page. I'd say, like, the biggest, uh, piece of engineering and coding work was the deployment and infrastructure scripts. So provisioning the VPCs, writing the integrations with ECS, uh, that, that sort of thing. But, um, in terms of actual coding, it wasn't too invasive. Jeremy: [00:46:47] I think that's actually a positive for a lot of people, because I believe there are people who think they need to do a big, uh, rewrite if they have, you know, performance problems or problems keeping track of the status of their system. But I think this is a good case study that shows that you don't necessarily need to do a rewrite.
You just need to put additional processes and checks in place and, um, maybe change your deployment process to kind of get the results that you want. John: [00:47:21] It's about the foundations as well. If you have some really strong people at the start who, you know, pave some good, uh, roads there in terms of good practices, like just for example, a really good, uh, database change management setup, some good packaging of the code. Really good packaging of the code. So it was quite easy for us to split out five services from that big monolith. It's about the foundations at the start, because it would be quite easy to, to build an MVP with some people who write, you know, 1,000-line PHP scripts and the product works, and that's a different case study because, you know, you can't fix that, essentially. Jeremy: [00:48:04] Right. So it's because the original foundation was strong that you were able to undergo this sort of transformation. John: [00:48:12] Truly, yeah. Jeremy: [00:48:13] Adopting all of these processes, did they resolve all of the key problems your business faced? John: [00:48:21] When we, when we look back and we see that, you know, all of our systems are running on Docker, we see a huge cost benefit. So, uh, that problem was certainly solved. We, we were able to see issues before our customers, so we have better transparency, uh, in the system. No longer was, uh, were we dependent on one big server, uh, 1,000 customers were no longer dependent on one big server. Um, so it meant that we had really good fault, and we do have really good fault tolerance on, on those containers. If one of them dies, ECS will literally kill it, uh, bring up a new one. Uh, it will also do some auto scaling for us. Say on a Monday morning, it will, you maybe have eight containers running, but on a Friday, maybe it'll auto scale to 14. So that's been groundbreaking for us in terms of how we work.
We went from shipping, from between monthly and quarterly, to, to daily. And something I use as a, uh, a team health metric right now is, is our frequency of deployment. And I'd say we're hitting about 25 deployments a week now, which, compared to the olden days, is, is, is great. We always want to get better at it. I would say that those have been really amazing things for us, but also in terms of the team, it's, it's a lot easier for us now to hire a new engineer, um, bringing them in, because of this consistency. And also, um, I guess, uh, we're not relying on these pockets of knowledge anymore. So we, again, around hiring, it's, it's a lot easier for someone to come into the system and, and know how things work. And I think in terms of hiring as well, when you talk about the kind of setup, it's, uh, it's, you know, you know, there's some, some good stuff happening there. Jeremy: [00:50:10] It sounds like you have a better picture in terms of monitoring the system, you brought your costs down significantly. The deployment process is much easier. The existence of the containers and ECS is kind of serving the purpose of where people used to have to monitor the individual servers and bring them up themselves. But now you've sort of outsourced that to Amazon to take care of. Uh, does that sound, does that all sound correct? John: [00:50:42] Yeah. Spot on. Jeremy: [00:50:43] And I find it interesting too, that you had mentioned that improving all of your processes actually made it easier to bring new people in. And that's because you were saying things are now more clearly defined in terms of what to do, rather than all of this information kind of being tribal in a sense. John: [00:51:06] Yeah. Like a typical example would be like, hey, uh, let's redeploy this, uh, bug fix. And so previously, you know, it might be a Capistrano deploy or, uh, you know, oh, you need to get SSH keys to this thing, and you need to log in here, and you need to build this code
on this local machine and try and ship it up. And that just all goes away. Um, particularly with Docker and that, that continuous integration pipeline, it just, it sets a really good set of standards and things that people should find quite easy and repeatable. Jeremy: [00:51:40] And, uh, so now in terms of deployment, you can use something like CloudFormation and you have the continuous integration process that can deploy your software without somebody having to know specifically about how that part works. John: [00:51:58] Exactly. So I would say if we wanted to create like a new service responsible for some new, new functionality in Phorest, uh, say a Spring Boot application, a Java application, they can simply provide a Dockerfile and get that deployed to dev, staging or production with, I would say, 10 lines of YAML configuration. So you could go from initial setup of a project to, to production in a day if you wanted to. Just zero friction there, I would say. Jeremy: [00:52:29] It really makes the onboarding a lot easier then. Do you think your team waited too long to change their processes? Or do you think these changes came at just the right time? John: [00:52:42] I would say if we waited any longer, it could have been detrimental to, to, I guess, the health of the business. I think that the guys did a great job in terms of getting us to a certain point. Well, we would have risked technical decay, I would say, and, uh, kind of, uh, really, uh, harming the organization if it had gone any further. I would say it was, it was a lot of work to do this and it could have been easier if, if we had paid more attention to technical debt and making the right decisions earlier on. So maybe saying no to that customer who wants a bespoke piece of functionality. Well, you have to do what you have to do.
Jeremy: [00:53:24] So, so you would say maybe identifying earlier on just how much the current processes were causing you problems. If you had identified that earlier, um, you think you might have made the decision to try and make these changes, uh, at an earlier time. John: [00:53:44] Yeah. So the guys earlier were, were making really good decisions, but maybe they didn't have the experience for, you know, higher scale, the scalability solutions and systems. So it's, it's about hiring the right people at different stages of where the product is evolving, I would say. Jeremy: [00:54:00] Given what you've gone through with the migration of Phorest, what advice would you give to someone building a new process? What, what can they do to keep ahead of either technical debt or any of the other issues you faced? John: [00:54:18] I think it's about how it's, it's actually, uh, a people, um, and cultural thing along with tech decisions. So everybody needs to be really aligned in terms of these decisions that they're making, rather than letting people go on an individual basis. I think there needs to be good leadership in terms of getting a group of people thinking the same way. I reckon the technical currency is, is extremely important. And as your system grows, you need to be able to, to look, look back and identify areas of pain, and by pain, I mean, you know, speed of deployment, uh, speed of development, ability to adapt and change your software. So if you notice that a feature that used to maybe take a week is now taking two weeks, you know, you probably need to take a really hard look at that area of the system and figure it out. Could it be simplified? Um, and why, why is it taking too long? Jeremy: [00:55:21] Basically identifying exactly where your pain points are, um, so that you can really focus your efforts and, and have an idea of what you're really going for. John: [00:55:31] Yeah. You need to build, um, an environment of trust.
And I will also say that you need to be able to, to be able to be calm, confident, and okay with failure in terms of take, taking risks sometimes and saying no to features and customers, to be able to, to push back on, on leadership and make sure that you're, you're really evolving the system the right way, uh, not just, uh, becoming a feature factory. Jeremy: [00:56:01] Yeah. It's always going to be a kind of balance on, you know, how much can you pull back, but still stay competitive in whatever space you're in. John: [00:56:12] Yeah. So what, what we're doing right now based on those lessons is we try to do like a six to eight week burst of work. And we would always try and take a week or two of wiggle room between that and starting something new to look back at what we just built and make sure we're happy with it, but also look at our, our, our technical backlog and see if there's anything there that's really a pain, you know? And just, even for example, this week, we, we noticed an issue with a lot of builds failing on our CI because of, uh, how, how it was set up to push Docker images. So occasionally they would fail, and that was actually a real pain point for us just over the last couple of months, because maybe a deployment which should take 20 minutes was taking 40, cause you'd have to re-trigger it. So that's just like, that's an example of us looking at what, what was high value and making sure we just fix it before we start something new. Jeremy: [00:57:08] So making sure that you don't kind of end up in the same situation where you started, where these technical issues sort of build without people noticing them. Instead, kind of in shorter iterations, doing sort of a sanity check and making sure like everything is working and we're all going in the right direction. John: [00:57:27] Yeah. It's about the team. And I mentioned before, it's about, you know, the leadership and a group of people together,
talking through common issues and, you know, maybe meet, meet every two, three weeks. Talk about some key metrics in the system. Why is this too high? Why is this too low? You know, kind of through your peers you can really see the pain points and, and they'll, they'll more than likely tell you them. Jeremy: [00:57:51] When you look back at all the different technologies and processes you adopted, did you feel that any of them had too much overhead for someone starting out? What was your experience in general? John: [00:58:04] So some people just didn't like doing code reviews. Some people just really just felt that they could just push, push what they needed and that it was almost a, a, a judgment on them in terms of the code review process, which it totally wasn't. I would say, uh, some people found Jenkins and continuous integration a bit, you know, what's the point? And so we, we had had some, you know, some pain points there. Um, but as we got to Docker, as people were seeing the benefits of, of these things, you know, less bugs going into production, uh, less things breaking, people being able to go home nice and early in the evening and not be woken up in the middle of the night with, uh, you know, an outage call. Those were all the, the, the benefits, and that's reaping the rewards of, of thinking like this. Jeremy: [00:58:56] Your team was bringing on a bunch of new things at once. What was your process for adopting all these new things without overwhelming your team? John: [00:59:06] So it was starting at the foundation. So the continuous integration, the code reviews were incrementally brought in and we had regular team meetings to discuss pros and cons. And it was really important for people to, to input on those things rather than to, to, to just implement them. They would have failed if we hadn't done it like that. It took time.
I know it's still, I would say we're still not in a perfect world, but it's about group, group consensus and making sure that everyone, everyone's bought in to what we're trying to achieve. Jeremy: [00:59:39] So basically getting everyone in the same room and making sure they understand what exactly the goal is, and everyone's on the same page. John: [00:59:47] Yeah. So we tried to make a big effort, uh, particularly for people who are working remotely, to get them all in the same room once a quarter. We talk about our challenges, talk about our goals, talk about our values and make sure we're all on the same page. And sometimes we tweak them and, you know, that's how we feel it's best to do it. Jeremy: [01:00:09] Finally, I want to talk about what's next for, for Phorest. What are the remaining pain points you have now? And what do you plan on tackling next? John: [01:00:20] So right now we're on 4,000 salons on our platform. We're really happy with the state of the infrastructure to get us to maybe 8,000-10,000 salons, but we need to be really conscious of the company's growth and our goals. So we need to make sure that we can scale at a much bigger level. And we also need to make sure that our customers aren't affected by, uh, our growth. We were looking at serverless for any kind of newer pieces of the product to see if they can help us reduce costs even more and, and help us stay, stay agile in terms of our infrastructure and how we roll out. A couple of years ago, when we launched into the USA, we noticed we, um, it doubled our overhead in terms of infrastructure, operations, and deployment. And as we grow in the US we, we need to be really conscious of not making, eh, any, um, I guess, uh, mistakes from the past. Jeremy: [01:01:15] So you're mostly looking forward to additional scaling challenges and possibly addressing those with serverless or some other type of technology. John: [01:01:27] Yeah.
So, um, one area in particular will be our SMS sending. So that's kind of... a plan for the next six to eight months would be to make sure that we can continue to scale at the growth rate of SMS and email sending, which is, is huge in the platform. Jeremy: [01:01:44] Um, you said so far, you've been experiencing 30% growth year over year. And you said when you moved to the US you actually doubled your customer base? John: [01:01:56] I'd say we doubled our, uh, overhead in terms of infrastructure, managing (?) deployments. We're still very early stage in the US and that's our big focus for the moment. But as we grow there, we, we need to be, I guess, more operationally aware of, of how it's, how it's going over there. There's a much bigger market. Jeremy: [01:02:17] To kind of cap it off, how, uh, how can people follow you on the internet? John: [01:02:21] Sure. So you can grab me on Twitter at Johnwildoran, J O H N W I L Doran. And if you ever wanted to reach out to me and talk about any of this type of stuff, I'd love to meet up with you. So feel free to reach out.
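
For readers who want something concrete: in the monitoring part of this conversation, John describes giving every service a /health endpoint that returns a JSON structure reporting whether the service can reach its database, its message broker, and any other dependencies, with Pingdom polling that endpoint. The following is a minimal Python sketch of that idea, not Phorest's actual implementation (their services were Java and Ruby). The function name `build_health_report`, the `up`/`down` values, the JSON field names, and the choice of HTTP 200 versus 503 are all assumptions made for illustration.

```python
import json
from typing import Callable, Dict, Tuple


def build_health_report(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, str]:
    """Run every dependency check and return (HTTP status, JSON body)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "up" if check() else "down"
        except Exception:
            # A check that raises (for example a connection error) counts as a failure.
            results[name] = "down"
    healthy = all(state == "up" for state in results.values())
    body = {"status": "up" if healthy else "down", "dependencies": results}
    # 503 lets an external poller treat the instance as unhealthy (assumed convention).
    return (200 if healthy else 503), json.dumps(body, sort_keys=True)


def broken_broker() -> bool:
    """Hypothetical stand-in for a message broker check that cannot connect."""
    raise ConnectionError("message broker unreachable")


# Hypothetical checks: a real service would open a connection or run a trivial query.
status, body = build_health_report({
    "database": lambda: True,
    "message_broker": broken_broker,
})
print(status, body)
```

A web framework handler for /health would return this status code and body as-is, so that an external monitor only needs to look at the status code while a person debugging can read which dependency failed.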

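John also mentions that, after removing the per-server level two cache, the team kept a distributed Memcached cache with "cache keys based on entity versions". He does not describe the key format, so the sketch below is only one plausible reading of that technique: the version is part of the key, so any container that knows the current version can never read an entry written for an older one. The names `versioned_key` and `FakeSharedCache` are hypothetical, and the in-memory dictionary stands in for a real shared cache.

```python
from typing import Dict, Optional


def versioned_key(entity: str, entity_id: int, version: int) -> str:
    """Build a cache key that changes whenever the entity's version changes."""
    return f"{entity}:{entity_id}:v{version}"


class FakeSharedCache:
    """In-memory stand-in for a shared cache such as Memcached (assumption)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._store.get(key)

    def set(self, key: str, value: str) -> None:
        self._store[key] = value


cache = FakeSharedCache()

# Container A caches the response for appointment 42 while it is at version 1.
cache.set(versioned_key("appointment", 42, 1), '{"time": "10:00"}')

# The appointment is edited, so its version in the database becomes 2.
# Container B builds its key from the new version and therefore misses
# the stale entry instead of serving it.
stale = cache.get(versioned_key("appointment", 42, 1))
fresh = cache.get(versioned_key("appointment", 42, 2))
print(stale, fresh)
```

The trade-off in this approach is that the current version has to be cheap to look up, and old entries are never explicitly deleted; they are simply left for the cache to expire or evict.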
Oct 7, 2020 • 1h 10min

Building Maps using Leaflet

We cover:
- Choosing Leaflet vs other mapping libraries
- Sources for mapping layers
- Using GeoJSON to store data
- Raster vs vector data
- Working with live positions such as car data
- Picking a database with geospatial queries
- Using frontend frameworks with Leaflet
- Leaflet plugins
- Desktop vs Mobile
- His work on the Leaflet-Geoman plugin

Related Links
- Leaflet
- Leaflet-Geoman
- GeoJSON
- Geoman
- Mapbox

This episode was originally on Software Engineering Radio.

Transcript

Jeremy: Today, I'm speaking with Sumit Kumar. Sumit is the head of engineering at Share Now, which was previously car2go. He's also the creator of an open source plugin called Leaflet-Geoman, which provides drawing tools for the Leaflet mapping library. I'll be speaking with Sumit about his experience working with Leaflet and how developers can build mapping applications with it. Sumit, welcome to Software Engineering Radio. Sumit: Hello. Thanks for having me. Jeremy: So the first thing I'd like to start with is, for people who aren't familiar with Leaflet, what is it and what types of projects would people build with it? Sumit: So Leaflet is basically a mapping library, a JavaScript library for mobile-friendly interactive maps. That's how they say it. And if you want a map and you don't want to use Google Maps, for example, for whatever reason, Leaflet is basically an open source alternative to Google Maps and other mapping providers. Of course, you still need a map itself. So basically the images of, of a map, where you can have OpenStreetMap or any other provider. But with Leaflet, if I had to describe it, you basically create the layers on top of the map. So that means polygons, markers, points of interest to show any data that you might have, and to zoom around the map and stuff like this. So it's the library around the map itself and gives developers really good tools to do their own mapping solutions. Jeremy: I'm sure a lot of people are familiar with Google Maps.
What are some of the main reasons you might not choose it? Sumit: So in big corporations, a lot of companies don't want to be tied in. There are licensing reasons. Especially in Germany or in Europe, there are data protection reasons. There is simply a stigma attached to Google in that sense. Then, Google uses a custom format for the geo data, and with Leaflet you can use an open standard called GeoJSON. You can use it with Google too, as far as I know, but it basically will transform everything to the Google format, and it's much more expensive. That is also [a] reason, especially for startups or individual developers with side projects. If they have a big volume, if they expect bigger traffic, then Google Maps is quite expensive. Jeremy: So what are some examples of sources people would use? Sumit: Yeah, that's a good question. Normally people use OpenStreetMap, which is free to use. But I think the OpenStreetMap maps are not very pretty. So in all the projects that I do, even if it's open source, I use Mapbox. Mapbox is a company that, I would say, is built on top of Leaflet. Last time I checked, even the maintainers, or the core maintainers, of Leaflet work at Mapbox. So the company grew out of Leaflet, from my understanding, and it's highly compatible with Leaflet and they provide beautiful maps. This is not free. So I pay for great maps to use even for my open source projects. But you can use basically any provider you want. So there is also HERE Maps. It's a European provider. There is Google Maps. Of course you can use that. I'm not entirely sure about Apple Maps, if you can use that with Leaflet, but with any provider where you can fetch the tiles, you can implement it into Leaflet, and Leaflet is a mapping tool. That means it's not only about, like, satellite maps or street maps. You can also use indoor maps. You can use a map of the Moon or Mars.
You can use maps of video games or Game of Thrones it doesn't matter at the end.Any image that you can take from an area can be a map. And this could be even a (blueprint) a map of your house. You could even use that as a digital map and create whatever you need on top of it.Jeremy: And in that case would you have a JPEG or a static image and then you reference that so someone's able to click around that?Sumit: I've never done a Tile layer myself. But I've seen an application built with the library that I maintain. And this is a SaaS application for construction sites. So they fly a drone on top of the construction site, take, some HD photo of it, or even 4K, I'm not entirely sure.And then they create a map out of it. That they then draw on top of on the construction site which is really interesting. And as far as I know, it's, it's quite easy for them to make a tile layer out of it. The biggest problem with a tile layer is always zooming in and out of the map. So you need different distances basically.And of course the file size. So if you do an HD Photo and you opened a map and it has to load multiple HD photos. This creates quite some load and is slow for the user. So this optimization is the hardest part. But other than that, if you research how to do that, I'm pretty sure there are tools for that to make it quite easy.Jeremy: When you talk about tiles for a map is it where you're starting with a really high resolution image and then you're cutting it into a bunch of pieces so as the person zooms around the map they're not having to load that entire image?Sumit: It's optimized for exactly this. So it loads as many tiles that are visible onscreen. So if you zoom in. Then it only loads the tiles that are visible. These are usually I would say six images or so. And they should be quite, quite optimized in size.If you have a huge image that covers a big area. 
If you zoom out then instead of loading 1000 tiles, you load only six tiles that show more area but in less detail. The load for the user is basically always the same, doesn't matter which zoom level, and you start with a high resolution image and then cut it down.But again, again, I've never done it myself so not 100% sure on that.Jeremy: When people talk about map sources, sometimes they talk about raster tiles and sometimes they talk about vector data. What's the main difference between the two and when would you use each? Sumit: Oh, the vector data itself, you can think if you are a front end developer, you can think of it like an SVG. So it's an image versus an SVG. An image has a fixed dimension width and height and if you zoom in, the quality gets lower. If you zoom in with an SVG, it's rendered by the browser so it has infinite scalability in a sense. And with, vector maps, it's the same. I'm not sure if that is a bit of a stretch to mapping experts, but, the good thing for me, if I use vector maps, you can zoom in and out very fluently. There aren't particular steps to the next image it's a smooth transition.And, especially if you use something like canvas mode in leaflet, it's a much smoother experience for the user. But. These are not individual images anymore. So I'm not talking necessarily about the tiles right now, but about the layers you put on top. So let's say you have markers of Tesla superchargers for example, and you put them on the map.These can be individual DOM elements. And if you have 6,000 of them, it creates a lot of load for the browser. But if you have something like canvas mode it's drawn inside the one element and everything is smooth and performant again. You have the downside of you cannot interact with the DOM elements individually, of course.But I might getting a bit out of your question right now because, vector maps itself, can be... If you refer to tiles, there might be even some additional advantages that I'm not sure yet. 
What I do know is that Uber, for example, created their own library also on top of open source products. I think it was on top of Mapbox. And they use only vector tiles like this because they have huge data needs and they need very performant maps. And if you use leaflet only just out of the box and you have big, big, big data then you might get into performance problems. Problems that I have myself but not yet solved in particular apps that I built.

Jeremy: And when you're talking about these big sets of data, is this the mapping data in terms of things like streets and locations, or is this more overlaid on top of that?

Sumit: That's definitely overlays on top of that. So I can do some examples. I'm working in the mobility sector at a car sharing company. That's also how I got into leaflet 6-7 years ago, because they needed a geospatial data management tool, basically, that I've built. And the data that they use are points of interest, electric charging stations. We have parking zones, drop zones for the cars. We separate the city into polygons to track demand in specific areas of course. And then you have other companies like (masque?), which is a local logistics company. They have zones where their ships drive and stop off the harbors. Tesla for example has supercharger stations, which is small data compared to that. And there is ridiculously detailed data if we are talking autonomous vehicles. Then you need data like you separate the road into different lanes. For example, one goes in one direction, the other one in the other direction. You have parking spots, and this for a whole city or a whole country. This is data that is so big and so much your browser would just crash if you try to display it in leaflet alone.

Jeremy: Hmm. And so the very simplest case. If someone is trying to put on a list of parking locations or superchargers, what does the API look like for that in leaflet? Are people calling functions that are adding these one by one or is it syncing to a collection? 
What does that look like?

Sumit: So there are multiple options in leaflet, which is quite cool. So normally what I do is I try to use the data always in GeoJSON. So I store it maybe in a different format, but in general, the APIs I build around leaflet normally use GeoJSON, and in leaflet you can add GeoJSON simply.

So there is a GeoJSON method where you just put in data and it displays it on the map. You can also create your own shapes. That means a circle marker, a circle, a polygon, a line, a marker. Then you basically give it two coordinates and it creates the shape on the map.

With GeoJSON it's a wrapper for everything. So if you provide a GeoJSON, it can be markers, can be polygons, can be lines, and leaflet will just add everything. It depends on your needs or your source of data that you have. And particularly with my library, where I create the shapes so the user can draw the shapes him or herself, I need all of those functions, basically.

Jeremy: And, you were saying that GeoJSON is a good option for storing the locations of different things, storing shapes. How about data that changes often, like if you have a car sharing application you might have live locations of where cars are. Would GeoJSON be suitable for something like that?

Sumit: So I'm sure there are people that disagree with me. I had discussions actually with developers from Uber and also some, let's say, mapping experts from other companies, and the opinions vary. I can only speak for myself. Even though JSON or GeoJSON might not be the most efficient storing format out there, it's basically a standard in modern APIs to use JSON or GeoJSON as the format to exchange data.

And I like to have not a lot of transformations with my data, so I store it as much as I can in GeoJSON. It works fine for me. I don't have any restrictions or downsides from that. 
I don't build applications on a scale of Uber, but Uber also told me they store in GeoJSON and they don't have a problem with it either.They like to use open standards and I agree with that. And if we are talking live data it doesn't matter how frequent the location, for example, of the marker is updated. Let's say every half a second, you update the location of a moving car.You simply store the two coordinates into it and you can build it as a, you store the entire GeoJSON again or you store only the changed coordinate. Really the subset of the actual data doesn't matter how you build it. It's so small in comparison that it will only make a difference if you have, I don't know if you update a thousand or a million entries, all at the same time.Then of course, your network is gonna slow down a little bit, but this has nothing really to do with GeoJSON or JSON, this is any data. If you have so much data updates you should look at something like a message queue like RabbitMQ or Kafka or something like this. But if you do it just over REST API, I would personally batch the requests.So every second or every two seconds, I would update all the entries that have been changed. If we're talking about a front end or something and batch everything together.Jeremy: So to make sure I understand correctly, if you were getting the locations of cars, and you query the backend API, you would get a GeoJSON file, which is basically a regular JSON file that has a bunch of elements that might have the latitude and longitude of each car. And as you got updates, maybe you get an update every second or every few seconds. You would receive new GeoJSON files from the server and it would just have the elements that had changed or new elements?Sumit: So, there are two different, points in let's say an application stack, where you have these updates. The first one is the car sends an update to your backend. 
And if we are talking about our company, for example, share now, or, companies that use my open source product, these might be, for example, 10,000 cars always connected to some sort of backend, and they send their location, maybe every second. For example, let's say you have a network hiccup and they all reconnect, at the same time, you have a huge spike because 10,000 cars send their location at once, and this is an area where we at least had the experience... Don't use just HTTP use a message queue. So we can handle all of the data because the cars do not only send a location, they also send is the window down, is it up motor, start motor, stop events or everything that happens in a connected car is sent to a server.And this can be multiple events per second. If we're talking 10,000 to a 100,000 vehicles. Or scooters or whatever it is. This load should be handled by a message queue like Kafka or Rabbit MQ. Once it's in the backend and you just want just in quotation marks, want to display it on a front end then we are talking a different story here because, front-end doesn't have a message queue like this we can use if you want to do it really in a real time sense, you can use WebSockets, which is... Yeah, you send incremental updates as they come in to the front end and change your data and can be basically an ID and a new latitude and longitude and you connect the data and the front end.It can be new JSON. It's completely dependent on how you build it. I normally send some metadata with the JSON that is, for example, a name of the license plate, name of the car, maybe the model, is this a Mercedes or BMW or whatever. And maybe even some other data depending on the project, depending on the data that you get.There is maybe a lot of metadata involved. Then I don't send the entire JSON. Then I just send the updated information. But if you are not using a web socket, if you're using HTTP, then there is no push from the server. 
The browser has to fetch updated information. So in this case, and what I would do is, let's say I display Berlin on a map and we have taking as an example, share now and car2go here. Of course, can be applied to any company out there. If we are talking 2000 cars, for example, and I want to show them in real time, I will simply fetch the list of all the cars, every second or every two seconds. So there is no push or anything. I would just do a fetch a simple, interval request.Jeremy: And so you're doing this fetch on this interval, and is it taking into account what it chooses to display? It's just taking that entire document and showing everything that you're sending, or is it performing a diff, comparing the elements you already had and the new elements?Sumit: Yeah, that's a good question. So I do a diff, but it depends on the project. So I am building a little SaaS product around geo management basically. And there I use diffs. I would have, yeah, I did a mistake of not, not diffing. and the reason is if you redraw the layers on a map, it's a lot of performance overhead.And if you do this every second and currently you are maybe creating another layer on the map, you are interacting with the map in some other way. This, this is a performance problem. Also, the map constantly rerenders basically your data. And the map simply gets laggy. And if you want to do anything else on the map it gets laggy and it's not a good performance.So I use diffing that if out of 2000 cars only two move, then they will get updated and the rest is just thrown away.Jeremy: And internally. So when you're talking about using this diff, does that mean on the leaflet side or on the JavaScript side, do you have this GeoJSON document and then you're updating that GeoJSON document and leaflet just figures out that only those two elements changed and it doesn't touch the rest of the page?Sumit: Yeah, that would be great. But sadly, as far as I know it's not like that. 
So you give leaflet something to draw and they will draw it. So I do the diff basically on the business application side, on the business logic side. I do this myself. When I get the payload, I diff it against the payload that is already there. With lodash I do a comparison and remove everything else.

And then I have to remove the layers first that are currently in leaflet, and then I feed them the new ones. And I also have my own IDs associated with everything. So I can compare also individual shapes on the map or layers on the map.

Leaflet has IDs that they create when you add something to the map. But they are not consistent. So if you add a marker to the map it has an ID. If you want to add the same marker to the map again with a different location it gets a different ID. You can get the current one on the map and update that one. That is also possible, but then you would have to buy in completely into the leaflet logic, including the IDs and how to handle everything. And I like to separate... I like to have leaflet just as a dumb drawing tool and not as the source of truth for my IDs and business logic and all of that.

So I do this outside. I do the diffing outside. I do the storing to the database outside, and leaflet is just a component. I feed it data, it draws it, and that's about it.

Jeremy: Mm. So in situations where you have live data and you have a lot of data changing, it sounds like you actually aren't using GeoJSON in that case, you have your own collection outside of leaflet. And you're using that along with your own diffing to determine which elements you should find inside of leaflet based off of a key.

And then, moving just the ones that you've found that have changed?

Sumit: No, it's GeoJSON. It's all GeoJSON that I store. For example, the SaaS tool that I built has an API where you can fetch the layers that are displayed on the map, and this is all GeoJSON. And then internally, I only use GeoJSON. 
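The diffing by ID discussed in this exchange could be sketched roughly as follows. This is only an illustration, not Sumit's actual code: the `CarFeature` shape, the `properties.id` field and the `diffFeatures` function are assumptions made for the example, and it uses a plain comparison instead of lodash.

```typescript
// A minimal GeoJSON point feature. `properties.id` is assumed to be the
// application's own stable ID, not the ID Leaflet assigns to a layer.
interface CarFeature {
  type: "Feature";
  geometry: { type: "Point"; coordinates: [number, number] }; // [lng, lat]
  properties: { id: string; [key: string]: unknown };
}

interface FeatureDiff {
  added: CarFeature[]; // in the new payload only
  moved: CarFeature[]; // in both, but with changed coordinates
  removedIds: string[]; // on the map, but missing from the new payload
}

// Compare what is currently drawn with a freshly fetched payload.
function diffFeatures(current: CarFeature[], incoming: CarFeature[]): FeatureDiff {
  const currentById: { [id: string]: CarFeature | undefined } = Object.create(null);
  for (const feature of current) {
    currentById[feature.properties.id] = feature;
  }

  const seen: { [id: string]: boolean } = Object.create(null);
  const diff: FeatureDiff = { added: [], moved: [], removedIds: [] };

  for (const next of incoming) {
    const id = next.properties.id;
    seen[id] = true;
    const prev = currentById[id];
    if (prev === undefined) {
      diff.added.push(next);
      continue;
    }
    const [prevLng, prevLat] = prev.geometry.coordinates;
    const [nextLng, nextLat] = next.geometry.coordinates;
    if (prevLng !== nextLng || prevLat !== nextLat) {
      diff.moved.push(next);
    }
  }

  for (const feature of current) {
    if (!seen[feature.properties.id]) {
      diff.removedIds.push(feature.properties.id);
    }
  }
  return diff;
}
```

With a result like this, only the layers listed in `moved`, `added` and `removedIds` would need to be touched on the map, and everything unchanged is left alone.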
And that means if you put it into leaflet, internally in leaflet it's also transformed to something else, of course.

But the map, you can just say, okay, give me these layers and make them GeoJSON basically. And then this is how my whole application uses it. But GeoJSON has a properties block, and in there you can add all the metadata you want. So in there, I have the IDs that are basically the reference to my database.

I have a gravitational center of a specific polygon, for example. I have descriptions, names, anything the user would like to see. I store it in the properties of GeoJSON and use that everywhere. So it's still all GeoJSON.

Jeremy: You have let's say a single GeoJSON document for all of your car locations. And if you go inside that GeoJSON and find the key of a car that you want to move, you can update the location through the GeoJSON?

Sumit: That's actually a pretty good question and just depends a bit on the application that you're building. So GeoJSON allows you to group layers together. That means if you have one polygon, cool, you can have that. If you have two or three or a hundred, you can group them in what's called a FeatureCollection.

You can store it this way, that's completely fine. Then you have big payloads, but you have to get into the GeoJSON and get the specifics out of it. In the SaaS application, I have this use case to count and limit the layers that a user can draw on a map and this means I want this to be separate database entries.

I can also choose to share one layer versus the other with the public or with a colleague or whatever, and to have this granular permission system I need this to be separate database entries. And that means each layer on the map is for me personally a different GeoJSON. But again this is highly customized to my use case.

I'm sure there are users out there if they fetch my data from GeoJSON through the API. 
They would not like to have a thousand different GeoJSONs in an array or a collection for all the markers. They would just like to have one big GeoJSON file that owns all of the data.

This is easy to do. You can just combine them. That's fine. And I will do this on the API level. But for my use case, I store them separately.

Jeremy: Okay. So when we're talking about GeoJSON in a lot of cases you actually choose to have a separate GeoJSON document for each marker or each thing that you're going to put onto the map.

Sumit: Exactly.

Jeremy: We were talking about how you're bringing in data on the backend and then you're bringing it in as GeoJSON on the frontend.

Is the GeoJSON something that you're converting at the API level or are you storing your data as GeoJSON on the back end as well?

Sumit: This is the embarrassing part of the application. When you start out, and I've been building this application for a long time, there is tech debt of course and that is one thing.

So what got me up fast? I use Firebase as a data storage, or Firestore it's called. And this just helps me ramp up an application quite fast. And I still use it even after many months, I think a year now that I'm building this application. And in there, you have a limitation of the documents, how you store it.

The document format is limited. So I cannot have nested arrays, but GeoJSON has quite a lot of nested arrays, especially if you have multipolygons. That means multiple polygons belonging to one layer, acting as one layer, basically with holes in it, for example. This has multiple levels of nested arrays.

You can't just store this in Firestore. You have to store it as a JSON string. So basically this means I store my GeoJSON as strings currently. That means on a database level I can't do any queries. Anything I do with the data I first fetch it via cloud functions or a node script or whatever.

I fetch the data first. Then I do the calculations on top of it. 
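The store-as-string workaround described here, followed by a calculation in application code, might look something like the sketch below. Everything in it is an assumption for illustration: the `StoredLayer` document shape, the helper names, and the choice of a simple planar ray casting test as the calculation. It treats longitude and latitude as flat x and y, which is only reasonable for small areas.

```typescript
type Position = [number, number]; // [lng, lat]

interface PolygonFeature {
  type: "Feature";
  // Rings of positions: this nesting of arrays is what the document store rejects.
  geometry: { type: "Polygon"; coordinates: Position[][] };
  properties: { id: string };
}

// Hypothetical document shape: the GeoJSON is kept as one JSON string.
interface StoredLayer {
  id: string;
  geojson: string;
}

function toStored(feature: PolygonFeature): StoredLayer {
  return { id: feature.properties.id, geojson: JSON.stringify(feature) };
}

function fromStored(doc: StoredLayer): PolygonFeature {
  return JSON.parse(doc.geojson) as PolygonFeature;
}

// Ray casting: count how often a horizontal ray from the point crosses the ring.
function pointInRing(point: Position, ring: Position[]): boolean {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses =
      (yi > y) !== (yj > y) && x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Inside the outer ring and not inside any of the holes.
function pointInPolygon(point: Position, feature: PolygonFeature): boolean {
  const [outer, ...holes] = feature.geometry.coordinates;
  if (outer === undefined || !pointInRing(point, outer)) return false;
  return !holes.some((hole) => pointInRing(point, hole));
}
```

Because the database only sees a string, every such check has to happen after fetching and parsing, which is the cost being described.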
For example, is a point inside this GeoJSON or are these polygons near the user or whatever? I do all the calculations and then I serve the data as GeoJSON of course. So I store it as GeoJSON in a sense, but it's a string.

And if I would have to do it again, or if I look into the roadmap for the future, I would probably go to something like MongoDB where you can, I'm not sure if it's like native GeoJSON support, but the important part is that if you build an application that basically does a lot of geo queries and stuff like this.

If it's made to handle spatial data use a database that can do geo queries on a database level. That means I give a database a coordinate like my location and do a query like: give me all polygons where this point is inside the polygon. If you can do this on a database level, this adds a lot of performance, because otherwise, if we are talking thousands or millions of layers, you have to fetch everything and calculate this on your backend, on the backend side.

And this is expensive. So I would go to a database that allows geo queries. Firebase has this in a very, very limited format. I asked this on stackoverflow. I saw the question yesterday again. I asked this about two years ago, 2016 and back then they answered that they haven't exposed their geo queries yet and they still haven't.

So I'm not sure when they come out with it, but if I needed it I would switch databases. I will not wait for that. And if I would start over, I will choose a database beforehand that allows me to do that.

Jeremy: And for your work at share now, how are you storing the geo data there?

Sumit: So, I have not touched that particular product in quite a while. But back then when we built it, it was GeoJSON, but we used MongoDB, and I'm not entirely sure if you can simply without any transformation store GeoJSON in MongoDB, but MongoDB has geo queries. 
So, I know that much, but I can't tell you right now off the top of my head if you have to transform it or if MongoDB accepts GeoJSON but transforms it internally that I don't know.Jeremy: We're talking about how you can have an application with a lot of data but depending on where the user is looking on the map they don't necessarily need to bring in all that data at once. What are the strategies you use to deal with that?Sumit: That is a good one. So what geo queries often allow you to do is if I look on a map you can define the borders by basically the top left, top right, bottom left, bottom right corner of your screen. So if I have a map on my screen leaflet and any mapping library can basically give you the boundaries that you are currently looking at.And these are these four coordinates. And if your database supports geo queries, you can basically say give me all layers that are inside these boundaries and then you can just fetch that data. This is how I would do it from the top of my head. Honestly, I've not built it like this because I've not had to deal with those big datasets.And there might be downsides to this. If the user for example moves the map how many queries do you do? Then you have a moving target, basically. But I'm sure there are ways to solve this. So this would be my first clue on how to achieve this.Jeremy: When you're working on a leaflet application, do you have any experience integrating a leaflet map with other frameworks like React or Vue?Sumit: That is a very good question. I get this quite often on the open source library as well. And with mapping libraries like leaflet you are basically back to the roots of HTML and JavaScript where you interact with the DOM elements directly.And with the introduction of frameworks like Angular and React and now Vue we moved past this. These are all abstractions to the DOM element. We say create this div and create this div. We don't do this anymore. 
But with leaflet you still at least a little bit have to do this in the sense that, for example, you tell it the div that it should render in and you call a specific element directly, sometimes through the leaflet API of course, but it doesn't have this reactive nature like our frameworks today do. So what happens is that there are a lot of abstractions to mapping libraries. So there is react-leaflet, vue-leaflet, all these abstractions, to use the mapping library in the same reactive way, in the same thinking mode of the framework, which, which is good for anyone that is starting out.I personally like to use it bare bones. I like to interact with leaflet directly with the API it provides. I have full control over it. Maybe I'm an advanced user in a sense. But the abstractions that are out there, they limit me because I always have to go through that. If there is a new feature or an edge case I want to tackle or something I always need the buy in of the maintainer of that abstraction and this is something I don't do.Sometimes I build my own abstraction but normally I basically build a component in Vue and it doesn't matter if it's React or Angular or Vue. I build a component that owns the map itself. And as a property I put the GeoJSON in there, for example. And the component does all the rest, maybe the diffing even, the rendering, the updating, the watching of the properties.But you have to build your watchers yourself and stuff like this. So it's just out of context. Not every developer is as comfortable with this, especially if you start programming and you started in the world of React so you're not used to this barebone coding.But if you use other libraries that are interacting with the DOM elements directly then it's a good exercise to connect these two worlds. Another use case might be where you could use your skillsets, in the sense are, charting libraries. 
So if you display charts for example on a dashboard, it's the same topic there.

Jeremy: So basically you recommend keeping your framework code separate from your interactions with leaflet. You'll use leaflet's built in APIs, not use a wrapper just because you want to make sure you have full control and not be limited by that wrapper.

Sumit: Exactly. And if there is a new version of leaflet coming out, I want to use it immediately. It probably has some updates that I want to put in. And again, I can control the user experience much, much better if I can interact with leaflet directly through the APIs.

I wouldn't necessarily say I would recommend it. It's the way how I do it. And I think if you build an advanced, a big application, I think you are better off using the leaflet API directly. If you just want to display a map and maybe a marker on top of it to show the location of your business on your business website, then it's totally fine to use a wrapper, right?

That's easy. You will have it done in less than an hour and that's it. But if you build a big application to handle geospatial data, then it's a different story. I would say remove the abstraction and go directly with the library.

Jeremy: For these components that are from your framework do those end up rendering surrounding the map? Like you would have a sidebar or a top bar or something like that?

Sumit: Yeah. Also a good question. So on Geoman, I use it. The SaaS product is also the demo website basically for the open source library, and it's called geoman. And there, I have a map component as I said, and it's basically fullscreen. From the DOM side, of course, I have stuff around it.

I have a sidebar in a sense and a footer and whatnot. But with CSS I just display it differently. So the map is for me, always front and center. It's the biggest element. And it should be basically like Google maps. 
It should be the one big background thing and the header and the sidebar, just like on top of the map.And you can build this in different ways. Like you could even add this to leaflet you could do everything in leaflet, you could even add your sidebar as a map element on the map but I don't do this. I have the map container and I just overlay my other Dom elements on top of the map, so I want to keep this really separate to the map itself. The map component is definitely one separate component for me that I interact with. Yeah. Because of lock in. Again, let's say I want to change mapping providers at one point instead of changing one component, I would have to change I don't know the whole application. So I try to keep this as separate as possible.Jeremy: So if you were to look at the HTML the DOM nodes that have all your controls are your sidebars, things like that, they would be completely independent of the leaflet DOM node, but you use CSS maybe with a fixed position or something to appear on top of the map, is that correct?Sumit: Yeah, that's true. I would do one exception though. I build an open source library, as you mentioned, for drawing, basically layers on the map, markers, lines, whatever. These buttons I add them to leaflet. So because it's a leaflet plugin, that means any user of leaflet that wants to use my plugin just imports it.And it adds these buttons to the leaflet map. I basically used the leaflet framework to display this, but anything else, like for example, an address search, the list of the layers that is currently on there, an export button for downloading the GeoJSON, changing the tiles from street view to satellite view, this stuff you can all add to the map.And a lot of people do that, but I personally keep this out of the map. I have this separately again, because In my SAAS application, I would like to have more control over the user experience. 
And maybe I have something like a pay wall in front of it or an upsell button or I need to limit this based on user permissions.

And I simply feel more comfortable doing this in a framework like Vue instead of leaflet itself.

Jeremy: You were talking about how you built a plugin.

Can you talk a little bit about what makes sense to be built as plugins versus built outside of them?

Sumit: Yeah. Also very good question. I thought of that as well. Sometimes my thoughts went, if you have the stuff I just mentioned, an address search and the list of layers and stuff like this, I thought about, hmm, what if everything of that is a plugin for leaflet? I could open source everything. I just add everything modular to leaflet. It's a big lock in into the leaflet ecosystem. So this is all possible and there are plugins, I think for all of that. There are plugins for your address search, there are plugins to switch tiles, plugins to draw, to export data, for all of that.

There are plugins already and they add this to the leaflet map. Again, I think if you have a use case of just having a small map showing some small stuff, or you want to create an MVP, meaning a small app that can do something where, yeah, it's not a product that you might sell to someone.

You can do all of that and it will be completely sufficient and it's probably less work to just use a plugin to add this to the map. But if you have a product and you want to control the design aspects of it a lot and maybe animations and stuff like this, I think for every developer it's more convenient to not fiddle in leaflet code for that. Leaflet has its own CSS also. So you have to overwrite this or compete with it over the style sometimes here and there. And keeping it out of it just means the data that is displayed and sent between these components has to flow through somewhere. And if you build a big application, you probably have an application state, a state, somewhere like in react, what is it called? 
Redux, for example. Redux store where you have your data store on the front end. If you use this as your source of truth, you can use it for all the components that you display on top of the map that are not inside leaflet. And it's much easier using that ecosystem than recreating everything in leaflet because if you do that you basically don't need a framework anymore.

You build everything in the leaflet library and you interact with the DOM again directly. So the abstraction that we mentioned before you would have to build this for basically everything, right? And there is no reactive thing anymore. You don't get the benefit from React anymore if you add everything directly in leaflet.

Jeremy: If you look at the leaflet website, there's a long list of plugins. Is there a set of plugins that you typically use when you know you're building a leaflet application?

Sumit: No. Somehow I'm ignorant in that sense. No, not even the address geocoding thing. I also built this on my own. There are many, many, many good plugins there honestly and what I can tell you is there are plugins that we use at SHARE NOW or Car2Go and there are plugins that I constantly see people use together with the library that I am providing and these are something like, what's it called? A marker grouping tool. It's...

Jeremy: Clustering.

Sumit: Clustering. Yes. Thank you. Marker cluster is a plugin that is used a lot and we use this as well. As the name implies it clusters the markers when you zoom out, so you don't have 10 million markers on a map, but it clusters them together.

It's also good for performance, but also for usability. So this is something that is used a lot. And then of course everything that creates heat maps and data visualization are also used a lot, at least from the circle of open source maintainers or open source users that I'm interacting with a lot.

But honestly, if you are using leaflet, look at the plugin library, see what is there before you build it yourself. 
Try the plugin and see if it fits your needs, because you don't have to reinvent the wheel. For me, this might be a bit of a sickness and it's fun to recreate the wheel for me.

But of course it's a waste of time if you have it already there.

Jeremy: Yeah. When we talk about displaying things on the map, we were talking about cars earlier, and you're saying you display them in the form of markers. A lot of times applications want to show information that's associated with those markers, whether that's a label on the map or something else like that.

How do you typically approach that? Keep all that information grouped together and keep it updated.

Sumit: So I saw different implementations. As always with the metadata that you store with your layers, you can choose to have it as its own entity outside of the GeoJSON, or you add it to your GeoJSON. And so the question is, is the marker itself your source data, your main entity? That means, do you associate the style for a marker with the marker, or do you store the style separately and store the ID of the marker in the style? That is basically an architecture question that you have to solve. An example that might resonate maybe more with everyone is we have cars, right? So we have cars and the cars have metadata like license plate, which tires they have, which SIM cards they use. They have a lot of data, much more than the geo data. And so is the location of the car just a property of the car? Or is everything from the car metadata of my location data?

And this is a distinction you have to decide on your own, depending on how your complete architecture, landscape and ecosystem looks. For me, I decided many times that the location data is the source of truth, let's say. And I put a lot of metadata in. That is because if everything the user does and everything my API does revolves around the location data, I can just as well use that as my main source of truth.

So if we go back to styling.
Styling in particular I would store as metadata. So you have a marker and it should resemble a car. So I want a different icon. I maybe want a direction. If we are talking apps like Uber, for example, you see the direction the car is heading. So I might have a degree, so I know how to turn the icon on a map. I need a color, and maybe I want to display the lane that the car was driving, the route. So I want to display that, and all of this data I store as metadata. And if the user changes that, I change it in the metadata of that marker. That means if any team is fetching the data from the API, they also have the information how to display it. And this was for me and for us also very important to do. So styling, I would definitely recommend storing this with the geodata as metadata with everything else, like the fuel level of the car. That decision you have to make on your own, architecture-wise.

Jeremy: If you wanted to make a customizable user interface, in terms of somebody chose they wanted to see what's the speed of the car or something like that, and they want to be able to turn it off and on, you would store what the user chose in the same place.

Sumit: What the user chose in the sense of, okay, I want to see the fuel level. I want to see the license plate.

Jeremy: They want to be able to turn it on and off.

Sumit: Yeah. So that is a big use case. I think that's the use case for everyone that displays markers. Let's use Tesla superchargers, for example. You have the location of the supercharger, and if you click on one, you have the address, the exact location, how many are there? How many are currently free? Blah, blah, blah. All of this data, like can you eat there? Are there restrooms, et cetera. So Tesla, they don't give you the data as GeoJSON. They give you a collection of superchargers, basically an array with objects.
And just the location of the supercharger is a GeoJSON.

That means you have a big collection of JSON data and a subset property of each entry is a GeoJSON. That is also how we do it at Car2Go and SHARE NOW. GeoJSON is just a part of a bigger dataset. But in Geoman, for example, I reversed that. I use GeoJSON as the main source and have basically most of the data as metadata inside the GeoJSON. It doesn't make a big difference to anyone. It's just a preference of how you manage your data. And again, what is your main entity? Is your entity a car that has a location property, or is your main entity a marker, and being a car is just a property of that marker? It could also be a plane or a user.

Because I wrote an application where everything revolves around location data, I chose that the location is my main entity, and that is the GeoJSON, and the fact that it's a car is just metadata, a property of it with a specific icon basically. My application doesn't care if it's a plane or a car at the end of the day.

Jeremy: I guess another thing I'd like to ask about is when people use maps, a lot of times they're on their phones as well as just being on the desktop. What's your approach, or what are things to watch out for, when building a site that needs to work on both desktop and mobile?

Sumit: Yeah. Very tricky. Very tricky. So, two things. If you just want to display it, so again, the use case of you have a location of a business or something that you would like to display, that is no problem at all. Leaflet is very mobile friendly, and if you just display a marker the user can zoom in and out, can easily scroll through.

So if your thumb goes on the map and you scroll, you might have seen this with Google Maps, right? In Google Maps, to zoom or to move the map, you need to use two fingers.
They do that so you can easily scroll through the website without your scrolling being interrupted by moving the map instead.

So this kind of functionality to make it mobile friendly is there in all of these libraries. So this is an easy thing. You don't have to do anything. It just works out of the box. But if you create more advanced interactions with an app, like for example, a user should be able to draw a polygon, then it gets trickier.

Geoman and also my library leaflet-geoman both work on mobile. I think that is also one reason why it's getting quite popular, because from my research, it's the only drawing tool that works on mobile. But if you use your thumb on a small screen, your drawings get less precise. So I have features like pinning. That means if you click near a different marker, it just snaps them together. So it assumes you want to place them on top of each other. That's a really cool use case, especially on desktop where you can easily move to markers. But on mobile, it's not that easy.

So I personally think it's possible, but it's not a good way to do that, except for, let's say, you have an iPad Pro with a pen. Then it's really cool. I was really impressed when I used my own library for the first time on an iPad Pro with a pen, because then you can draw really precisely. It's a lot of fun.

It's easy to do. That's really cool. So if you have advanced interactions on mobile, then you will get into situations with click events and stuff like this where it gets maybe a little bit more tricky, and use cases where you hover the mouse over the map and you display things or you give the user a hint of what will happen when they click.

You can't do this on mobile. It's basically a different experience and in a sense a limited functionality if we are talking about drawing. So this you should keep in mind.
If it's strictly about displaying data, you will not have a problem.

Jeremy: And in terms of the UI, it may need to be significantly different, right? Between desktop and mobile, do you effectively hide a lot of UI elements or build two components depending on what size the viewport is?

Sumit: Yeah. Yeah, of course. So in Geoman, I use the sidebar, for example, right? If you display it on mobile, you don't see the map anymore and it's just a lot of data. So the more data you show and the more metadata you want to display, the trickier it gets to display this on mobile. And the map should be front and center, especially on a tool that revolves around that.

So that means if you open geoman.io on your phone, you only see the map and you have a small icon on top that moves in the sidebar, for example. It's still not perfect. I think it's a limited use case also, my particular app, to use this on mobile. But of course, you have the same problem with any website that displays a lot of data.

You have to hide some. You might have to basically also reduce some data, like some data that you just don't need on mobile. You will reduce maybe the table columns, you know, stuff like this, to make it as easy as possible for the user on mobile. And they can still open a desktop view if they want to on mobile.

But it's such a limited use case. I think that as long as it looks good and it has the basic information, everyone is fine. If you go into an advanced mode, then a desktop or a tablet might be the way to go for the user.

Jeremy: Yeah. We've been talking about Geoman. And you've mentioned that you have a Leaflet plugin called geoman-leaflet. Can you go into a little bit about what Geoman is and why you decided to create geoman-leaflet?

Sumit: Yep. Yeah, sure. So just so you know, it's not a superhero or something like this. It derived from geo management, and I just noticed later that it sounds like some sort of superhero.
Anyway, I have quite some experience now in solving this for companies and seeing the problems inside companies.

And not only in the mobility space. This ranges from communication providers to, again, construction sites and logistics companies and everyone that has geospatial data. There are two categories, I would say, that I'm trying to help. One is they have their own application, they have their own team that works on this, or multiple teams, and they use open source products or paid products and they need more functionality.

So they probably use my open source library if they use Leaflet for creating data. And the open source library basically helps you create and edit this data. So you need to create a polygon. You can create this of course by just providing the coordinates. But if you have a user and he needs to, for example, create a polygon around a building or around a block or a city or whatever, my library is there to help them just easily create these polygons on the map.

They can create markers, circles, rectangles, whatever it is. They can also edit each specific vertex, add vertices in between, they can move the entire shape. They can cut holes in polygons. They can of course remove the polygon. So all of these drawing and editing features are there. And one product I'm currently thinking about, I'm just collecting feedback, I've not built it yet, is a pro version of that that has even more advanced drawing and editing features. Some companies need to have polygons that are adjacent to each other and cover a complete area like a city, and to have maybe a hundred or 200 polygons that make up the city.

So just think of it like sales territories, or something like this. And there should not be an overlap or a hole between these polygons, so they need to stick together.
Now they can do this already with the open source library that I'm providing, but if they want to change the shape of one polygon, they would have to change all of the adjacent polygons as well, which is quite a lot of work.

And I could build a tool, for example, where you just draw a new border and it automatically calculates the new size of each of those polygons and adapts that, because you basically tell them, hey, I don't want any overlap and I don't want any hole in it. And these are such advanced niche drawing tools that require weeks and months of work to build that I want to wait and see if people or companies need this, and I'm thinking about providing basically a pro version of the open source library. So that means I have the open source library that I could constantly maintain for basic drawing and editing needs. I wouldn't say it's basic, it's quite advanced already. Then we have really advanced tools where I think a pro version of the library would be helpful for some companies.

And then we have the second bucket of companies that don't build their own application. They just want a service like MailChimp, for example. Instead of managing email, they want to manage geospatial data. So they want a service where they can create data, where they can put data in through an API, and where all the teams and clients and apps can fetch the data again, and this is where the SaaS application comes in, and it's called Geoman. There you have a studio where you have your map. Or you can create multiple maps, like projects, and then you create, for example, one map that displays your supercharging network.

You have one map that displays your sales territories or your parking spots. You can create one map that displays airline routes, whatever it is, whatever your data is.
And for me, the powerful thing that would solve a lot of problems inside companies is any user can draw this.

It can be a working student that creates parking spots, for example, in a city. Then you have your developers that not only consume the data, but they can write the data through the API and it's updated everywhere on every consumer. And you can attach the metadata to it to put everything in.

Jeremy: To summarize, there's Geoman, the application, which is where somebody can draw shapes or add data, probably import data from something like, say, a CSV or an API or something like that.

And then, if you had an application that wanted to consume the data, Geoman itself has an API for people to get data back out.

Sumit: Exactly.

Jeremy: And then geoman-leaflet is the Leaflet plugin that you built and use within the Geoman application.

Sumit: Exactly. And it's open source for anyone that wants to build basically their own application to manage their data. And, yeah, so I got asked a lot, for example, why do I open source this? Because it seems like it's one of the best plugins out there. I don't want to say anything false here.

And I'm not sure about that, but users tell me they switched to it, for example, from other libraries. And I also haven't seen other libraries that provide the same functionality. So it's one of the more advanced ones. And people ask me, why is it open source? Why is it not part of the Geoman suite?

That would make it a reason for people to use it. And the reason is simply that I know that Geoman will not cover all use cases that are out there. People have very specific needs in geo data. And I like open source. I love contributing to it. I love the interaction with the community and I think it's a great open source product.

I personally benefit a lot from open source as well, specifically with Leaflet, and I thought Geoman as a platform can be a way for me to earn money and invest more time into the open source product that I can then give back again.
You know what I mean? I've maintained this library for four years now. I have not earned a single Euro with it and it costs a lot of time and I feel bad if I don't maintain it for a month or something. But if the platform is successful, then I have more time to develop it. It serves multiple purposes, not only for me personally, but also I can maintain the library much better, add many more features for anyone that wants to build their own application with the open source tools.

Jeremy: Very cool. So hopefully at some point you'll get to spend more time in your day job working on Geoman and working on geoman-leaflet.

Sumit: Would love to. Yeah.

Jeremy: To start wrapping up, for people who are learning Leaflet, are there common mistakes you see people make or suggestions you have for people who are learning Leaflet now?

Sumit: Yeah. If you go through the docs and the standard Stack Overflow tool that we all use every day, you will get quite far. What I see a lot, especially if you use frameworks, is that beginners with Leaflet struggle sometimes to make this mental distinction between these two universes, these two ecosystems.

So if you use a reactive framework, like React, Vue, Angular, just know that Leaflet itself is not part of that. You interact with the DOM directly, as mentioned before. So don't expect this reactivity, where you just provide something and everything changes on the map.

This is something where I see a lot of questions happening. And then on the other hand, the business side. So think of Leaflet and mapping like they provide you the tools to display a map and data on top of it, but the business logic behind it, like how do you store it, in what data format, or I want the tile to be red if it's overlapping something else, this stuff you have to build yourself or use a plugin for it. It won't create business logic for you. It's basically a dumb way of rendering data. Not that dumb, but I hope you get what I mean, right?
You have business logic that is specific to you. And you have to build this yourself. So the library will not do this out of the box for you. So I see these questions a lot, also on the open source repository, where people just expect it to do things, but they can easily do this in three lines of code. They just haven't wrapped their head around yet what exactly Leaflet is doing for you and what not. So if you have any business logic, this is something you have to build yourself, and it's honestly quite easy. But if you have any questions I'm happy to help, not only for my open source library. I'm on Twitter and everywhere, so I'm happy to help out.

Jeremy: Cool. Are there any specific projects that people should look at if they're trying to learn how an application should be laid out in Leaflet?

Sumit: So there are multiple demo projects. For basically all the plugins there are demo projects, and also for Leaflet itself. You can look at geoman.io, which is quite a, let's say, more advanced use case. Everything from Uber of course is very advanced. And especially in data visualization, they have amazing tools and it's all open source also.

So if you look at demo pages from Uber and from Mapbox you will get a sense of where it can go. And then I would personally just look into the DOM and see what they do and how they do it. Maybe they have an open source repository where I can take a look.

And also, for example, if you want to write your own plugin with Leaflet, look at the open source code. That's how I started. I had no idea how to create a plugin for Leaflet or how to manage geospatial data, I just didn't know. All I had was I needed functionality that no plugin had. Leaflet-draw was back then the only one. I basically scanned their code, looked at how they do it, and tried to recreate it and then build my own architecture with it. But the code was very similar in the beginning.
So just look, that's what open source is there for. Look into the code, learn from that, clone it, adapt it, and grow from there.

Jeremy: Yeah. How about in terms of a full application? You're mentioning to look at, say, geoman.io, but Geoman itself is not open source, is it?

Sumit: No. Yeah, so I'm not sure where to look there. What I do sometimes... Okay. You have stuff like geojson.io. It's a small utility to edit GeoJSON. I've built the same on geoman.io. But yeah, it's a different one from Mapbox and this is open source. It's a smaller application that you can take a look at.

What I can also recommend is, if for example you want to create a Leaflet map and you don't know how, and the demos are not enough for you, you want to see some real use cases with React for example, what I do is I use the advanced search functionality on GitHub. So I look at who uses Leaflet and who uses React, and maybe I search for the inception code from Leaflet, and filter by it.

And then you find a lot of applications that basically use it like this, and then I scan around and try to find someone that uses the same stack as me and how they do it. And this gives me a lot of inspiration.

Jeremy: Yeah, that's a great tip. Not just for Leaflet, but for anytime you're trying to learn a new library. Yeah, it's really helpful.

Sumit: Yeah. The GitHub search functionality is underrated, I think, in that sense.

Jeremy: Cool. Well, before we finish up, is there anything that you think we should have mentioned or we should have talked about?

Sumit: I think we had quite an awesome overview. There is not much to mention. If you are into geospatial data or if you have the problem to solve for your company, or a client or whatever, I think it's quite an interesting field. It's quite niche also, but yeah, the user experience is... It's amazing, and you have such a big impact if you build something nice on top of maps.
And of course it's a field that is very, very future-proof. It doesn't matter if we're talking drones, autonomous vehicles, sales, everything in the mobility sector specifically, but everything around us needs more and more location data because everything is connected. And I think it's a field where you as a developer, if you have these skills, are very well equipped for the future. So don't hesitate to get into it and code a bit around it. I don't think you will regret it.

Jeremy: Cool. And for people who want to follow you or see what's going on with Geoman, where should they go?

Sumit: So, I'm on too many platforms, I'd say, but I'm very active on Twitter. My handle is tweetsofsumit. There I'm the most active. So if there is anything you would like to ask, or if you want to follow me and, you know, see where Geoman is going, the open source library, or even how to create a business. I'm not an expert in it. I just share my journey. I will share everything on Twitter. And if I have, for example, a YouTube video or even a guest appearance on a podcast like this one, I will share everything there. So I think that is the best way. And of course, you can go to my website, which is raum.sh. Raum is a German word for room. raum.sh is my personal website where I also post occasional updates here and there.

Jeremy: Very cool. Well, thank you so much for talking to me today, Sumit.

Sumit: Thanks for having me, Jeremy. It was nice talking to you.
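
The data-modeling choice Sumit describes in this episode (a record whose location is just one GeoJSON property, versus a GeoJSON feature that carries everything else as metadata) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Geoman's or SHARE NOW's actual schema: the field names, the sample values, and the helpers `to_feature` and `label_for` are all hypothetical. Only the `Feature` / `geometry` / `properties` layout and the longitude-first coordinate order come from the GeoJSON format itself.

```python
import json

# "Entity first" shape: the car is the main entity and its location is one
# GeoJSON-valued property. All field names and values here are made up.
car_record = {
    "license_plate": "B-XY 1234",
    "fuel_level": 0.62,
    "location": {"type": "Point", "coordinates": [13.405, 52.52]},  # [lon, lat]
}


def to_feature(record, style):
    """Return the "location first" shape: a GeoJSON Feature whose properties
    carry the car metadata plus styling hints (hypothetical helper)."""
    metadata = {key: value for key, value in record.items() if key != "location"}
    return {
        "type": "Feature",
        "geometry": record["location"],
        "properties": {**metadata, "kind": "car", "style": style},
    }


def label_for(feature, visible_fields):
    """Build a marker label from only the fields the user switched on
    (hypothetical helper; unknown field names are skipped)."""
    props = feature["properties"]
    return ", ".join(
        f"{name}: {props[name]}" for name in visible_fields if name in props
    )


# Styling (icon, heading in degrees, color) is stored as metadata on the feature.
feature = to_feature(car_record, {"icon": "car", "heading_deg": 90, "color": "#0a84ff"})
print(json.dumps(feature, indent=2))
print(label_for(feature, ["license_plate", "fuel_level"]))
```

Either shape holds the same information. Which one becomes the source of truth depends, as discussed above, on whether the application revolves around the car or around the location.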

Sep 23, 2020 • 1h 3min

WebAssembly on the Server with Krustlet

Taylor Thomas is an Engineer at Azure, the core maintainer of the Kubernetes Package Manager Helm, and a member of the Krustlet team.

Timestamps

[00:55] - Kubernetes
[07:37] - WebAssembly
[12:06] - WebAssembly Runtimes and WASI Specification
[15:42] - WebAssembly vs Containers vs Native Binaries
[25:11] - Krustlet and the case for writing it in Rust
[30:52] - Missing APIs in WASI
[33:38] - Wascc vs Wasmtime runtimes
[38:15] - Rust ecosystem for Kubernetes and WebAssembly
[40:23] - Comparing other languages to Rust
[45:09] - Rust learning curve, experiences as a beginner
[53:16] - Next steps for Krustlet and WebAssembly

Related Links

@_oftaylor
Krustlet
Kubernetes
Open Container Initiative
WebAssembly
WASI
Wasmtime
waSCC
WebAssembly meets Kubernetes with Krustlet
Introducing Krustlet, the WebAssembly Kubelet
Kubernetes: A Rusty Friendship
The Safety Boat: Kubernetes and Rust
A Heaping Helping of Stacks

This episode is also posted on the Rustacean Station feed. Check it out for episodes all about Rust!

Transcript

You can help edit this transcript on GitHub.

Jeremy: [00:00:00] Hey, this is Jeremy Jung. This episode, I'm talking to Taylor Thomas about running WebAssembly on the server with Rust. He's an engineer on the Azure team. The core maintainer of Helm, which is a package manager for Kubernetes. And he's currently working on Krustlet, which runs WebAssembly applications within Kubernetes. Also, during our conversation, you're going to hear us talk about WASM. That's shorthand for WebAssembly. All right. I hope you enjoy my talk with Taylor. Taylor, thanks for joining me today.

Taylor: [00:00:30] Thank you for having me, Jeremy. This is my first Rust related podcast so it's exciting for me. I'm a fairly new rustacean all things considered, so I'm happy to be here.

Jeremy: [00:00:41] For people who aren't familiar with the world of Kubernetes and containerization, all these different things.
Could you start by explaining at a high level what Kubernetes is?

Taylor: [00:00:55] Yeah, so Kubernetes is a container orchestrator. And you've probably at least heard of Docker if you're in the technology world, even if you haven't used it. Docker is a technology that made something called containers popular and useful. Those technologies have been around for a while inside of the Linux kernel. That's why they call it a container: everything is inside of this thing and contained in this process. It uses cgroups and kernel namespaces. There's a couple of different things under the hood that are going on. But basically it allows you to create an artifact that can be bundled up into something called an image.

And that image can be passed around and then be used multiple times. So often people will compare those to VMs. VMs were a big revelation, right? Because you didn't have to go literally put in a new blade if you needed a new server or reboot something. You had to instead just say, okay, I want a new VM. But you still had to install a whole operating system. It was like spinning up a new computer. And so containers made that even more simple because instead of having to do that, it's using the same shared underlying kernel calls and things underneath the hood, but everything's isolated. And so if you want to spin up three different instances of an nginx server, all you'd have to do is create three containers that are all running at the same time, and those will all have the same specification that you basically baked into there. An immutable artifact. If you want to create a new one, it has a new hash and a new version. So this was really good for people, but the problem is how do you orchestrate it across everything?

And that's where Kubernetes came in. So Kubernetes takes that and says, okay, well, we have a huge fleet of nodes that's out there.
How do I schedule each of these containers properly so that they're either not on the same place or that they meet certain requirements or that I have a certain number of them?

It takes care of all that, including some of the underlying networking connections, so that you can connect all your containers to each other in a distributed and actually self-healing way. It goes through loops, and if something goes wrong, it will try to heal that container and make it come back to a normal running state.

And so it is a very powerful technology. It's caught on... maybe it's caught on a little too much in some people's opinions, but it's a useful tool for containers that came about, and a lot of people are using it for underlying infrastructure projects.

Jeremy: [00:03:16] It reminds me a little bit of when you're using something like AWS and it has auto-scaling for VMs. And let's say that you have an application and it runs on virtual machines. And you would be able to tell AWS, as more people are accessing my application, I want you to create new VMs [and] run new instances of my application.

Something like Kubernetes is able to do something similar, but maybe at a more generic level of being able to figure out, okay, how many machines do I have access to? And anytime I'm asked to run something, I'll go and find the right machine to run it on. And it's always going to be in the context of these containers. Did I kind of get that right?

Taylor: [00:04:15] Yes, it's very much a generic tool across all these different things. And nowadays you can run it with Windows or Linux, or mostly Windows and Linux. There are some limitations there, but yes, it's a very generic tool in terms of being able to connect what you want and use what you want to make this distributed platform easy.

And like you said, it will scale things. If you've done something in AWS where you have the elastic things where they'll scale, it's a similar thing.
There's something in Kubernetes that you can set up so that it will automatically scale to make sure that you have the right amount of capacity for what you're doing.

Jeremy: [00:04:50] And I think this takes away the need for the developer to understand where their applications are going to run. You give some kind of configuration to Kubernetes and say, here's my app. And then it just figures out where to run it and how many instances to run and that sort of thing.

Taylor: [00:05:12] Yeah, it's really meant as a DevOps or SRE kind of tool. Instead of having to custom tailor machines, they already have these Docker images in place and can just run them, like you said. You just give it the configuration. It is still involved. It's not like a magic bullet there, but you don't have to care as much about where it goes. If you've configured everything properly, it just kind of does it.

Jeremy: [00:05:36] I wonder if you could paint a picture of where containers fit in between just running a full machine versus running a single process on a computer.

Taylor: [00:05:47] Essentially a container is just a running process, a single process. And because it's a single running process, you can run multiple of them on a machine. The tools that you need installed, all the binaries, all those different things are encapsulated inside of this container.

And so that way you can run 10 of them on a machine instead of spinning up 10 servers. So it allows a little bit more density, and obviously you still have to be careful about noisy neighbor problems and other things that normally happen in infrastructure, but it allows for much more condensed areas. And the spin up is much quicker. If you have a small image, getting a new one and spinning it up is easily under 30 seconds every time. Bigger images can take longer, but even then it's still faster than provisioning a full VM for what you're trying to do.
And so this fits for a lot of different kinds of services that don't need very large requirements on the system.

And even if you do have large requirements, you can use Kubernetes in a way that can be helpful.

Jeremy: [00:06:45] And so you're talking about how these containers are a process running on the system. So for example, if I were to, in Windows, look at the task manager, or in Linux, just run ps, I would see individual processes for each of these containers that are being run.

Taylor: [00:07:05] Yes. It really depends on the implementation, but essentially that's what's going on. With Docker, there are some other underlying details that we don't need to get into, but essentially you would see processes that are spun up that are doing the work, but they are just processes underneath the hood instead of being a full operating system.

Jeremy: [00:07:23] Next, I want to talk a little bit about WebAssembly, because I know that's an important part of the Krustlet project that you work on. Could you explain a little bit about what WebAssembly is at a high level?

Taylor: [00:07:37] Yeah, so WebAssembly, as you can probably guess from the name, is a tool that was originally designed for the web. Now, I think the original creators didn't necessarily intend it to be that way, but that's the name that it has. And that's what it's used for. Multiple companies and big websites use WebAssembly. And the idea behind WebAssembly is that you can create a binary that can then be consumed. So it's compiled code that can then be run in the browser, giving you the speed of compiled code while also still being sandboxed inside of the browser sandbox environment.

So this allows for some very performant things to be done. You have things like rendering tools. An example I always give is one like Autodesk, a maker of CAD tools and rendering tools.
They have a lot of online things that run WebAssembly, because it gives them the performance they need to be able to do those more complex tasks. And so that's where WebAssembly started, but the idea is WebAssembly could be used anywhere. And when you pull WebAssembly out from just the browser, it allows you to have kind of a universal interface. There's a working group defining it, and it's still very much a work in progress, but it's called WASI, which stands for WebAssembly System Interface. And that interface is a definition of basically how a module can interact with the system. And the nice thing about it is that it's still that sandboxed model: you have to grant explicit permission to do anything. So if you want it to be able to access a file somewhere, you have to grant access to that file ahead of time before you start it. It's not there yet, but I assume when we get some of the networking socket support in there, it will work the same way: you have to grant specific permission for it to do it. With containers, you have a little bit of a different security model, because you're still running into the normal Linux security things that you have to deal with. There are ways to break out of a container. There's not really a way to break out of a WebAssembly module in the same way, because you're running compiled code somewhere instead of basically a shim over specific binaries or things that are being run. So that's the main difference, and this allows things to be run on any system. One of the things that we were really hoping for with Docker was that it could run anywhere. But if we're being honest with ourselves, it can't really run anywhere. Docker and containers in general are a Linux tool.
Now, there are people at Microsoft and elsewhere who have made Windows containers a thing. They work, and they work well. There's some really cool work they've done, and I have nothing to say against that. They've done great work. But really, if you have an nginx container, you can't run that on a Windows machine. You can run it on a Windows machine, but it's technically running a Linux VM behind the scenes. Same thing on a Mac. And so it isn't truly this ideal of write once, run anywhere. Now, obviously there are still technical challenges there. You can't do it completely, but you can come close. So if I compile a WASI-compatible WebAssembly module on my Mac, I can pass it to you and you can run it on a Windows machine, on a Linux machine, on a Raspberry Pi. It doesn't matter, because it'll be the exact same code. So that is a very powerful thing that we saw in WebAssembly that also fits fairly well inside of this container space and what we could do with it.
Jeremy: [00:11:12] Help me to understand a little bit more about what WebAssembly actually is. Does WebAssembly itself have bytecode or some kind of language that you're passing to a runtime? An example I can think of is the Java virtual machine, where people can write in many different languages, such as, I believe, Clojure and Scala. They can write in a language that generates Java bytecode and is run by the Java virtual machine. And so you can run those applications anywhere that you can run the Java virtual machine. Is WebAssembly similar? Is there a language for WebAssembly that is the target for, say, Rust or C or different languages that you want to run in WebAssembly?
Taylor: [00:12:06] That's a really good question. It is similar in the sense that it does produce code that can then be interpreted anywhere, and there are various WASM runtimes.
The reference implementation that we're using within the Krustlet project is called Wasmtime. That's the one that's following the WASI spec, and it's essentially the reference implementation for the WASI specification. And that one can run on any of these systems that you've mentioned, and it has a bytecode that it interprets in that way. There are a lot of technical details we could dive into there that I'm not a huge expert on. I know the basics. There are some runtimes that are JIT compilers, so just in time, and there are some that have a precompilation step that can happen before. All of these things can happen, but the thing is, this is much more lightweight and small and constrained than the JVM would be in this case. So it's a much smaller and compressed use case, but it does have the similarity that it is bytecode being interpreted by some sort of runtime.
Jeremy: [00:13:09] And this bytecode, because you've been talking about how there are different implementations of the WebAssembly runtime, and you gave Wasmtime as an example of that. Does that mean that as long as you target the WebAssembly bytecode, any of these different runtimes could run that code? Is that how it works?
Taylor: [00:13:35] Yeah. And that's why the WASI specification is kind of the thing that's making it work on the server side rather than just in the web, because we need those specifications for how to interface with different things on the machine. And there's also this other thing called interface types, which defines how you can interchange different data types in between different parts of applications. To be clear, this is an area that's still under heavy development. For example, Wasmtime, and WASI in general, haven't even finalized their network specifications. So you have to kind of do workarounds and things to get networking in place.
Now, that's coming, but this is very bleeding edge in that sense.
Jeremy: [00:14:17] And so this bytecode can run in the browser in whatever WebAssembly runtime is built into the browser. And then it can also run outside the browser in a runtime such as Wasmtime. And it sounds like the distinction there is that when you're running outside of the browser, you need some kind of consistent API to be able to access the file system, access the network, things like that, which would normally be handled by the browser. But once you take it outside of the browser, you need some common interface that knows how to make system calls to open a file on Windows versus open a file on Linux. And that's what a runtime like Wasmtime would do by implementing what WASI describes. Did I get that right?
Taylor: [00:15:10] Yeah, that's a great summary of how this works.
Jeremy: [00:15:14] Cool. So one of the questions that I think people often have is: when you work with languages like Rust or Go, they can be compiled ahead of time, and you get a binary that you can run directly on your operating system. For languages like those, what are the benefits of using WebAssembly to run an application versus just using that binary?
Taylor: [00:15:42] It just depends on, I guess, what you're trying to do with it. What we have with WASM right now isn't meant to replace Docker entirely. I think it opens up new cases for people who aren't already into Docker or who have some specific needs. But also there's a certain amount of portability and security that comes from using WASM.
Obviously, if you build a binary and you build it for each system, you have the native things all built in and ready to go, which has a distinct advantage over having to work through an interface. However, that security model is what gives us a lot of hope for the future of this, because you have to explicitly grant permissions. And it's compile once. I don't have to compile my Windows version. You don't have to worry about cross-compiling toolchains or the different VMs that you have to spin up to build one for ARM and for Mac and for Linux and for Windows and all the different targets. You don't have to worry about that with WASM. You just build the one binary and it's ready to go. It's also very, very small. Even without optimizations, you're talking like a meg or two for a simple application, as opposed to a full binary. Rust binaries are fairly small, and Go binaries are larger because they've compiled in the runtime, but even though Rust binaries are small, this is even smaller. And if you strip it, you can get it under a meg, depending on what it's doing. Now, that size really matters with something like Kubernetes. If you start with a container and you pull down an image, some of those images, even if you're using the very slim ones, are still 20, 30, 40 megs. Now, that's not a big deal, but some of the bigger ones, especially when people are doing some of the bigger applications, are close to a gig. I know people will say that's not a recommended practice. Yes, I know it's not. But what actually happens in reality is people will do that. And so when you have a new version, even though certain layers are cached and things are the same, it still takes a while to pull the new version and start it up. Whereas WASM is just these tiny modules.
We haven't done something super big, but I'm guessing that even if it's huge, it's only a few megs. And so then you're pulling this down and it's very quick. And the added security benefits on top of that, where you're not having to deal with the same security layers as inside of a container and you have very explicit grants on your security surface, are a very powerful thing for us inside of Kubernetes on the server side.
Jeremy: [00:18:29] And that security piece is a little interesting. Is it that when you're building the WebAssembly application, you're explicitly building in that this application will have permissions to the file system at these paths? What does that look like?
Taylor: [00:18:48] So it's not built in at compile time. Now, when we get talking a little bit more about Krustlet, we have another implementation, one that was started by Capital One, called wascc, which is the WebAssembly Secure Capabilities Connector. There are lots of W's in this space. And this wascc runtime actually does things called capabilities, where you're supposed to explicitly say, I need this capability and I'm signed to have this capability. But with WASM by default and the WASI spec, you grant it at runtime. You say, I'm going to give you access to this file, or I'm going to give you the permission to do this thing. So those things are done at the very beginning of your runtime, not when you're compiling it.
Jeremy: [00:19:33] Hmm. Interesting. And so that would be some kind of configuration file, maybe, that would say, okay, when you run this WASM application, only give access to these permissions. And like you were saying before, that's much simpler than something like a Docker container, where your permissions model is actually based off of a Linux operating system.
Taylor: [00:19:57] Yes. It's much simpler in that sense.
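To make the runtime-grant idea concrete, here is a minimal Rust sketch, assuming the program is compiled for a WASI target and started by a runtime such as Wasmtime. The file name `notes.txt` and the helper `first_line` are made up for illustration and do not come from Krustlet. Nothing in the source declares a permission; whether the read succeeds depends on which directories the runtime is told to expose when it starts the module.

```rust
use std::fs;
use std::io;

/// Returns the first line of the file at `path` (hypothetical helper).
/// Under WASI this only succeeds if the host pre-opened a directory
/// containing `path` when it started the module.
fn first_line(path: &str) -> io::Result<String> {
    let contents = fs::read_to_string(path)?;
    Ok(contents.lines().next().unwrap_or("").to_string())
}

fn main() {
    // `notes.txt` is an illustrative file name only.
    match first_line("notes.txt") {
        Ok(line) => println!("first line: {}", line),
        Err(err) => eprintln!("could not read notes.txt: {}", err),
    }
}
```

Built natively, this program has whatever file access the user has. Built as a WASI module (called `app.wasm` here as a placeholder) and started with something like `wasmtime --dir=. app.wasm`, it should only see the directory that was granted, and without the `--dir` grant the read is expected to fail.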
I mean, there's more overhead involved if you were spinning this up completely manually. You have to say, okay, I need to give access to this directory and I need to give access to this thing. But Kubernetes kind of takes care of that for you. And we do that inside of Krustlet to abstract some of that away, so you don't have to do it. It's like most security tradeoffs, right? The most secure thing is a server that's not connected to the internet and is inside of a locked room, and you can only access it in that room with a badge. That's the most secure you can get, but that computer is kind of useless. So the idea behind the security tradeoff is that now you're just being explicit about what you're granting instead of having all these implicit things of, oh, I can access the network and I can access this thing and I can access that thing, which come by default when running a process on an operating system.
Jeremy: [00:20:50] When you were talking about being able to run the WASM application anywhere, that sounds like a benefit, because normally when you have a build server for, for example, an open source project, you'll see that they have to target all these different operating systems. They may have to build their application for six or eight different OSes. And in the case of using WebAssembly, they could just build it once and it would technically run on any of those. We've been talking about WebAssembly on the server. Before we get into how Krustlet runs WASM applications, if I had a WASM application and I just wanted to run it on my machine, what does that look like? Is there some kind of package manager, or how am I running these applications?
Taylor: [00:21:43] So there are a couple of different ways that these can be shared around. There's no specific implementation right now.
The thing that we started with inside of Krustlet itself is the OCI specification, which is the exact same thing that containers use. It's just stored as a different artifact type. There's some work by the people working on wascc, who have something called Gantry, and there are a couple of other people who are looking into how you're supposed to store modules. So we have to figure out what exactly we want to do. It's kind of a little bit up in the air. We have a tool that was built by somebody on the team called wasm-to-oci, and it does the work of pushing a module to a container registry that supports arbitrary artifacts. And so what you can do is pull that down, and it just pulls down the compiled module file. It's always something.wasm, and then you can use whatever runtime you want to run it. So you can download Wasmtime and install that. There's Wasmer, and there's Wasm3, which is kind of optimized for embedded devices. Those are the kinds of things that you can do. You have to choose the runtime that you want to use, and then you have to grab the module from somewhere, wherever that might be right now, which is still, like I said, a little bit of a loosey-goosey kind of thing.
Jeremy: [00:23:10] And if I was just running an application on my computer without involving Kubernetes or anything like that, would it be where I get a single file that's the WASM application, and then I pass that into, say, Wasmtime at the command line? And that's how I would run it?
Taylor: [00:23:32] Yeah, that's exactly how it would work locally. Now, obviously we're working on building this out to make it even more fully featured. The ideal would be that you can have an application that can be easily swapped across machines. I mean, imagine if you had something like Notepad.
Or some sort of editor, and you could use it there and then just take that same application and have it somewhere else without worrying about what kind of system it is. You could have your Raspberry Pi 4 running a random desktop somewhere, and you could pass it over to a server and do a virtual desktop session. You could do anything crazy that you wanted to by passing that around.
Jeremy: [00:24:09] Cool. Yeah. I mean, it's this idea of having a truly universal binary, I guess, having the ability to copy this application anywhere and run it without having to worry about basically anything. You just need to have a WASM runtime.
Taylor: [00:24:26] Yeah. And that gives it quite a bit of power, right? Because you could jump from an edge device, and I'm using edge in the loosest possible sense because it could mean anything, all the way to a server running in a data center. It can be anything along that whole spectrum of machines that can run this, and any of them can pass it around. And you can even start to think about how you could maybe hot swap it, right? You could point at one running implementation of it, and then when you don't have access to that because you lose internet, you could point it at a local instance.
Jeremy: [00:24:57] Very cool. Now maybe we should go a little bit more into Krustlet. I know you've talked a little bit about what it is, but maybe you could go into a little more detail about what Krustlet is.
Taylor: [00:25:11] Yeah. So Krustlet stands for Kubernetes Rust kubelet. Lots of Ks. The main idea behind Krustlet was that we wanted to reimplement the kubelet. A kubelet is basically the binary that runs on a Kubernetes node and joins it to the cluster. So that's a kubelet, and we wanted to write one so we could do WebAssembly.
And the reason we wrote it in Rust was for a couple of reasons. Number one, and you're probably listening to this because you like Rust and you do Rust, Rust has probably the best WebAssembly support for server-side things. Most languages at this point actually have WebAssembly support, but it's mostly geared towards the browser, whereas with Rust, WASI is something you can easily add in. You just use rustup and do rustup target add wasm32-wasi. And that's how simple it is to start compiling WASI-compatible WASM binaries in Rust. And so that is a very powerful and useful tool to have around, if you can do it that easily. If you look at some of the other examples, like C or C++, you have to customize clang properly or download the already preconfigured toolchain and use that clang. There are a couple of other languages that support it, but Rust was a fully featured language that we could use it with. Rust had also caught our attention for a while because of its application inside of distributed systems and compute. Normally it's looked at as a systems engineering language, but can we use it in these cloud applications, this idea of cloud native, as the buzzword is right now? Can it be used in a cloud native way? And so a secondary goal here was to prove that it could be done like that. Plus you add on all of the safety and security and correctness features inside of Rust, and it really helps us out. We wrote several blog posts about that, which I can send around or link if you send out show notes. But the main idea is that it has prevented us from shooting ourselves in the foot. Go has really easy concurrency. That's one of the things that I most miss about using Go for some of this. If I want to do concurrency work, it's quite simple to set up. It's built into the language. It's very simple to do.
But the thing is that there are bugs that got caught, where we'd sit there and get mad, like, why are you getting mad at me, borrow checker? I've done everything right. And then you realize, oh, it's caught that down the line, if I were to do this, I'd have two people trying to read the same data. And that has been very useful and exciting for us, because it's stopped us from doing those things. So there's the guarantee that when your Rust code compiles, it will be correct. Even if it's not necessarily the right code, it's at least correct, and you're not going to have weird data access issues inside of your code. And so that was the secondary reason: it's just very powerful for doing that. And also its expressiveness with traits and how generics work gave us a good deal of flexibility inside of Kubernetes. Having dealt extensively with both the Kubernetes Rust client and the Kubernetes Go client, the Kubernetes Rust client is much more ergonomic due to how traits and generics work. And so that was just something that we really enjoyed coming over. But like I said, the main thing was that it had such good WASM support already built in, and really most of the WASM runtimes are being written in Rust right now. So it makes sense for us to be in this space. So that's why this project was created: to have the ability to easily run WASM inside of Kubernetes. So that's why we used Rust and why we came up with Krustlet.
Jeremy: [00:29:07] Yeah, that makes a lot of sense, because it reminds me of a conversation I had with Armin Ronacher in an earlier episode. He works with Sentry on different debugging tools and things like that.
And the reason why they chose to use Rust in parts of Sentry is because there are a lot of existing crates in Rust related to compilers or to reading debug files and things like that. And so in your case, the community had already done a lot of work in WebAssembly, and that's why it made sense to choose Rust. So I think it's interesting how there are these certain niches that people have built up and continue to cultivate, and that continue to bring in additional projects because of that base. It's very cool. You had mentioned a little bit earlier how Rust has really good support for WebAssembly, so it's easy to get something up and running in WebAssembly using Rust. But you had also mentioned that with WASI only certain features of the system are implemented. You had mentioned that network calls aren't implemented yet. And I wonder, when you're writing a Rust application, do you have to do anything special when you're trying to target WASI? For example, if I were to make a network call in my Rust application and try to compile it to run in WebAssembly, is the command line tool going to tell me you're trying to use an API that doesn't exist? How does that part work?
Taylor: [00:30:52] That right now is a very rough edge. It just depends. I haven't tried to directly compile in a network call yet, but most of the time, if you're doing something that can't be compiled, the compiler will say, when it's trying to link, I can't find anything that links this together, or I'm not able to compile this in, and it'll spit that out. And sometimes it's very obtuse. It just depends on what it is. Like I said, it's a very rough edge right now because it is so new. But for the most part, you actually write things the same way.
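As a rough illustration of "write things the same way", the sketch below is ordinary Rust that uses only the standard library, so the same source should build both natively and for WASI. The crate name `hello` and the `greeting` helper are hypothetical. The build commands in the comments use the `wasm32-wasi` target name mentioned in this conversation; newer Rust toolchains call the equivalent target `wasm32-wasip1`, so treat the exact name as an assumption tied to your toolchain version.

```rust
// Assumed build steps for a crate named `hello` (adjust to your setup):
//   rustup target add wasm32-wasi
//   cargo build --target wasm32-wasi
//   wasmtime target/wasm32-wasi/debug/hello.wasm
use std::env;

/// Builds the greeting text; falls back to "world" when no name is given.
fn greeting(name: Option<&str>) -> String {
    format!("Hello, {}!", name.unwrap_or("world"))
}

fn main() {
    // Command-line arguments work the same way natively and under WASI.
    let name = env::args().nth(1);
    println!("{}", greeting(name.as_deref()));
}
```

Nothing here is WASI-specific, which is the point: the differences only start to show once the code reaches for something the interface does not cover yet, such as networking.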
There are some things that you might need to pull in to make sure that you don't do something incorrectly, or that you have the correct things attached to the data structure, or whatever it might be for the specific case. There are actually some good examples inside of Wasmtime and a couple of other places. But for the most part, you write code pretty much how you normally would.
Jeremy: [00:31:45] So it sounds like currently, like you said, you pretty much write code as normal. You run it through the command line utility, and then you may get a helpful error message or you may get one that's really hard to decode, and then you basically start digging around the internet trying to find out what might be wrong.
Taylor: [00:32:05] Yeah, that's kind of how it goes. Now, luckily, people in the community are very responsive to this, and we're working with them to work around it. And there are possibilities of using networking. One of the reasons we chose to use wascc is because it has networking support built in. Now, the way it works is kind of just working around the problem. You have a capability that's built for the native system. If you're doing it on a Mac, you can either load it from a dylib file or an object file of some kind, or you can have it compiled in, which is what we do in Krustlet. And so when we build it for Windows, it gets that compiled part for Windows, and when we build it for Mac, it's going to have that compiled thing in there. So it's just working around it by creating a component of the system that is then linked by wascc to be able to forward calls around. And so when something calls the networking capability, that call gets forwarded to the WebAssembly module, which handles it and passes it back out. There's no actual networking inside of the module itself. And so it works around it. It's a bit of a different model.
It's an actor-based model, as opposed to the Wasmtime implementation in Krustlet, which works more like a standard container. I use that term very, very loosely. It's more like, I have a process, I'm running that process for as long as it wants to run, and then it's done, as opposed to responding to specific action or actor calls that come in from a host.
Jeremy: [00:33:38] So Wasmtime and wascc are two different WebAssembly runtimes. When you're talking about these processes running as actors, I guess this would be a case where, if you want to run a number of processes and you want them to communicate with one another, that's when you would choose wascc? I'm trying to understand when you would use one runtime versus the other.
Taylor: [00:34:04] Yeah. So right now, if you were to go download Krustlet and try it with Wasmtime, the only way to communicate is through data in files, because it can access files. So you could mount a shared volume and pass data off, so it could do data processing, but it can't do communication very well until WASI finalizes how it's going to do networking. The thing with wascc is that you can use it entirely outside of Kubernetes as its own thing. It's very similar to a functions-as-a-service kind of tool. You write your things in whatever language and compile them to a WASM module, and then it handles connecting all those things together. And the capabilities model around it gives it an additional security layer: all your modules have to be signed, and then you have to say that a module has access to a specific capability.
So whether it can access a capability that another WebAssembly module is exposing or whatever, you can glue all these together so that you can pass calls around. And it also has an ad hoc networking tool called lattice to connect multiple nodes together. So it can run entirely outside of Kubernetes, but it also has a bunch of tooling and things around it. So when you use it, you're buying into that system, and you have to know that that's what you're doing. Wasmtime is meant to be, like I said, following the WASI specification. So as soon as WASI gets networking, we'll implement the networking stuff, just because we want to make sure that we have the vanilla option for how you glue it together. If you want more of this quick functions kind of thing, then wascc is an amazing tool. And so we implemented that because we've been collaborating with them for a long time. We've had that implementation there so that it can show another way wascc can be used, while also helping people who want to try this right now and hack around to do things with networking.
Jeremy: [00:36:00] Does that mean there is some specific API call, I guess, that you're using within your wascc actors that allows them to communicate with the other actors, but it's not like general network calls? It's a very constrained API. Is that right?
Taylor: [00:36:16] Yeah, there's an underlying thing they call waPC. It's a WebAssembly protocol. I can't remember. Basically, it's the protobuf of this world. It's a message protocol: here's how I'm sending a message back and forth. And so each actor, as it's called, is a WebAssembly module that can respond to specific events. And so it registers specific event handlers for those events.
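The register-then-dispatch pattern described here can be sketched in plain Rust. This is a toy model only: it is not the wascc or waPC API, and the `Host` type, the `register` and `dispatch` methods, and the event name `echo.upper` are all invented for illustration. It just shows a host holding named handlers and routing message payloads to them.

```rust
use std::collections::HashMap;

/// A handler takes a message payload and returns a reply payload.
type Handler = Box<dyn Fn(&[u8]) -> Vec<u8>>;

/// Toy stand-in for a host that routes named events to registered handlers.
struct Host {
    handlers: HashMap<String, Handler>,
}

impl Host {
    fn new() -> Self {
        Host { handlers: HashMap::new() }
    }

    /// An "actor" registers interest in one event name.
    fn register<F>(&mut self, event: &str, handler: F)
    where
        F: Fn(&[u8]) -> Vec<u8> + 'static,
    {
        self.handlers.insert(event.to_string(), Box::new(handler));
    }

    /// The host dispatches an event; `None` means nothing is registered for it.
    fn dispatch(&self, event: &str, payload: &[u8]) -> Option<Vec<u8>> {
        self.handlers.get(event).map(|handler| handler(payload))
    }
}

fn main() {
    let mut host = Host::new();
    host.register("echo.upper", |payload| payload.to_ascii_uppercase());

    let reply = host.dispatch("echo.upper", b"hello actor");
    println!("{:?}", reply.map(|r| String::from_utf8_lossy(&r).into_owned()));
    println!("{:?}", host.dispatch("unknown.event", b""));
}
```

In the real system the handlers live inside separate WebAssembly modules and the messages cross the module boundary as bytes, which is why a byte-slice payload is used here rather than typed arguments.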
And when the underlying host that's running all these gets an event, it dispatches it to the actors to run.
Jeremy: [00:36:49] And so if you were using that with Krustlet, then the way that the actors communicate with one another or communicate with the host, that would be configured automatically by Krustlet, I guess, or what is the...
Taylor: [00:37:04] To an extent.
Jeremy: [00:37:05] Okay.
Taylor: [00:37:06] That's why I was saying that it can be used in a more fully featured way outside of the Kubernetes world, but it has a distinct place inside of Krustlet as well. And so Krustlet will do some of the configuration, to a point. It makes sure all the capabilities and stuff are configured, but, for example, we're still trying to define how somebody who wants to add other capabilities would declare that. Normally there's a freestyle block inside of a Kubernetes configuration called an annotation. So we're thinking, well, do we do it with an annotation or do we do it with another tool? We don't know yet. But right now it does that base configuration. It'll make sure you have HTTP server access, and that you have access to do logging, and so on. It sets all of that up for you. But we still have to define a safe way for a user to say, I need this capability.
Jeremy: [00:37:56] When you've been developing Krustlet, you've mentioned how there's a lot of existing WebAssembly capability in Rust, or projects built in Rust. Are there other parts of the Rust ecosystem or specific crates that helped you speed up your development a lot?
Taylor: [00:38:15] Yeah, there are. In terms of development tools, I have really loved cargo expand when doing some of the macro stuff. Async things do a lot of additional macro work that builds stuff out or wraps things. So it's kind of nice to see that.
I also learned about a bunch of different tools for debugging stack overflows, because we were accidentally pinning some stuff too early and it was causing a stack overflow that we didn't discover until Windows, because there's a smaller stack there. There were some really interesting things on the nightly compiler, like how to print type sizes, and that was really helpful in identifying things that could have gone wrong. And looking at our other tools, we've really enjoyed the Kubernetes crate. It's called kube, and it has some really awesome tools in it that are very helpful. I think a lot of the others are things that people have heard of, like serde, or "ser-dee", I can never tell. I feel like that's
Jeremy: [00:39:17] Right.
Taylor: [00:39:17] that's a constant debate in the Rust community.
Jeremy: [00:39:19] Yeah, I thought it was Serde, but I don't know.
Taylor: [00:39:23] And I think it's Serde because it's serialize, deserialize. But anyway, that's one that we've used a lot of and that has been very helpful. Obviously the Tokio runtime has been helpful for us as well. But really, there aren't any other crazy tools we've used outside of that. I do have to say a huge thanks to those who are working on the kube crate and some of these other things. We've been able to contribute back as we found bugs and other stuff. I love the rustls crate as well, which has been very helpful for Windows because then we don't have to have an OpenSSL dependency. So those are the different things that have been really helpful to us, as I've looked through and seen all the different things that we've done in our code.
Jeremy: [00:40:11] At the start of the episode, you mentioned you're relatively new to Rust. The languages you had used previously,
would that be like Go, or what are some of the other languages you have a lot of experience with?

Taylor: [00:40:23] Yeah, I am a Rustacean by way of Go. I've kind of moved all over the place, so I've done a lot. I mean, I've done a lot of Python, like for glue code when I've done some more SRE work. I did Node back when it was starting to become a thing. Right. And then moved on to Go, and then I've been doing some Rust since then. So yeah, I come from Go. My main background before this was a lot of Go because I was in the Kubernetes space really heavily. Well, I still am. And so that's where I got my Go experience from. So that's my background. Coming from Go to Rust is really the current thing that's happened.

Jeremy: [00:40:59] And since you have experience with Go, I'm wondering, are there things that you miss from Go, either in the language itself, the runtime, or even just the ecosystem, that you wish that Rust had?

Taylor: [00:41:16] Yeah, there's a few things we've run into here. Overall, I am very pleased with Rust, and would choose it for a lot of different cloud projects, depending on what you're into. This is totally personal opinion here, but I think Go is very well suited for smaller, like true microservices or anything similar, very, very small constrained tools, because it's quick to get started. It has such a constrained vocabulary. And one of the things I appreciate particularly, and some people hate this and some people like it, but I like that generally there is a way to do something in Go. There's sometimes two, sometimes three, but most times there is a way to do it.

And when coming to Rust, there were 40 different ways to do the exact same thing. And so that was something that I missed.
The other thing, I kind of mentioned before, is that the concurrency story is much better in Go. Now, it doesn't provide the same security, or not security, correctness that Rust provides, but it's so much easier to get started and to spit things back and forth. And as a relatively new Rustacean, I kind of get frustrated by the fact that there are two, three async implementations.

Jeremy: [00:42:40] Tokio, async-std.

Taylor: [00:42:43] Yeah. Those are the two that I know. I think there was one more, right?

Jeremy: [00:42:46] I think there's, uh, smol, I think it's how you say it.

Taylor: [00:42:49] Yeah, smol.

Jeremy: [00:42:50] I don't know.

Taylor: [00:42:50] It's supposed to be like a tiny, yeah. But yeah, I know that's a common complaint and people have done a lot of work on those, but it's very frustrating because you get bought into that specific implementation and it's like, let's choose one and make it part of the standard library, or make it the blessed crate or whatever it is, just so we can all standardize and not have to have weird hook-ins to each runtime and all those kinds of things. So that's been fairly difficult to work with sometimes. Otherwise, most of it's just ergonomics.

I know that we're going to be writing a blog post on this soon, but inside Krustlet we were doing a state machine graph. It's kind of like a cross between a traditional state machine and, uh, walking a graph, that we were doing to be able to encapsulate the logic inside of Krustlet a little bit better, because they were turning into monster functions that were just ridiculous. And because of that, we started to run into some of these things. In Go, you have their interfaces, right?
And these interfaces, you can just keep calling through and chaining through, but because of how the, the type security that comes through, Through the trait system and things here in rust, it made a really difficult to find a way to just iterate.And we finally found a way around it. It was a little bit, a little bit clunky and well like I said, we're going to write a blog post on it. Kind of explain everything, but there's just some of those things where the boundaries rust puts on you make some things very difficult to figure out, to make it work in a way that's both rusty, but also readable, to someone trying to write it.that was one of those things. Like, I just see some of those rough edges sometimes that just, and I'm not sure if there's a way to solve that. That's that could just be me airing my grievances. But like, that's something that I miss. There's a little bit more flexibility that comes from the go model, with some of the things that we've, that we've run into here.But like I said, I, I feel that that's worth it, given the security and things that we've gotten in return.Jeremy: [00:44:49] so you mentioned one of the things with rust is that there's, there's so many different ways to do the same thing. And as you were learning the language, I wonder what was your approach to figuring out what is the ergonomic way to do things? And I guess just how to pick up rust in general.Taylor: [00:45:09] all right. It was a combination of Clippy. And I mean, everybody loves Clippy. Some people get really mad at Clippy, but like Clippy at least tells me like, Oh, Hey, like, you're right, but this, you could do this more efficiently or you could avoid an extra allocation or you could do all those, those kinds of things.learning to read the compiler messages. You get trained from like reading other compile. Like when other things fail, you look where the compile error failed, what line? And then you go there. 
Because normally it doesn't give you very much information, but with Rust, you have to go through and read, like, oh, okay, it's telling me this went out of scope here or was passed here, and you need to go do this, or you need to go do this. All those things can be very useful when you're reading a Rust compiler error message. So those are the other things.

The other thing I had, and I think this is one of the bottlenecks, is that I had to go to an experienced Rustacean, like a couple of experienced ones, to go get help. And that's what I love about the Rust community. People are very willing to help. There's the Rust mentors page, which I know was mentioned at RustConf recently, that I hadn't heard about. But the big problem is that that's still kind of a bottleneck. Whereas with other languages I've been able to find at least semi-clear examples of, this is a good way to do it, or this is how these things are handled.

I mean, an example of this is to string (to_string) versus to owned (to_owned), which technically almost do the same thing when you're doing it from, is there a proper name for ampersand str (&str)? Like just the string slice, right. Those things, to owned versus to string.

Really what it turned into is there used to be a difference, but now it's just about clarity. If I just need this as a string, then I use to string (to_string). But if I'm using it in a case where I have one, but I need an owned copy, then I use to owned (to_owned), even though I think at this point they call the same underlying logic.

And so learning those kinds of things, I had to learn from people. There wasn't something that could say, hey, here's the history of this. Or even in the documentation, and Rust's documentation of the functions is really good.
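Editor's sketch of the to_string versus to_owned point: a minimal Rust example, assuming only the standard library. The helper names `as_text` and `owned_copy` are hypothetical and exist only for illustration. It suggests that on a string slice (&str) both calls yield an equal owned String, while to_owned reads as "give me an owned copy of this borrowed thing" and to_string is available on anything that can be displayed.

```rust
// Hypothetical helper: "I just need this as a String".
fn as_text(slice: &str) -> String {
    // to_string comes from the ToString trait (available for Display types).
    slice.to_string()
}

// Hypothetical helper: "I have a borrowed value and need an owned copy".
fn owned_copy(slice: &str) -> String {
    // to_owned comes from the ToOwned trait; for str the owned type is String.
    slice.to_owned()
}

fn main() {
    let name: &str = "krustlet";

    // On a &str, both calls produce an equal, independently owned String.
    assert_eq!(as_text(name), owned_copy(name));

    // to_owned is not string specific: a borrowed slice becomes a Vec.
    let borrowed: &[i32] = &[1, 2, 3];
    let owned: Vec<i32> = borrowed.to_owned();
    assert_eq!(owned, vec![1, 2, 3]);

    // to_string is not slice specific: any Display type can use it.
    let port: u16 = 8080;
    assert_eq!(port.to_string(), "8080");

    println!("{} / {}", as_text(name), owned_copy(name));
}
```

Which one to pick on a &str is then mostly a question of intent, which matches the convention Taylor describes.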
It doesn't say like, you should use this here or here there there's no specific suggestions. And so, if there's one thing that I hope we can improve is maybe how we can document like the intermediate level cause getting started there's stuff, but then otherwise you're kind of bottlenecks at specific, like asking other rustaceans, Hey, how do I do this thing?And that makes it a little bit harder for people.Jeremy: [00:47:40] I'm not sure what the, what the solution for that is. Whether that's, like you said some other intermediate guide or, but then again, it's kind of like, How do you, determine what to put in there and how do people get directed to that versus the relationship you're talking about, where you're, you're talking to experienced people and they just have all that context in their head and they can tell you, it's a hard problem to solve for sure.Taylor: [00:48:05] Well, yeah, and like I said, the compiler's fantastic. A lot of times you like learn from the compiler messages and that's how I finally learned that. Um when you give a static constraint on a trait that you're not saying it has to be static, it just has to fit into a static, which like, I was like mind blown moment right there.And I was like, Oh, that's cool. Okay. I get it now. But before I'm like, I don't want to make this static. That's going to like bloat the size of this. And I like, I can't have this allocated on a stack. Like, this is huge, but it's like, no, It's just saying this needs to fit in a static. And I'm like, okay. And I learned that I think from a compiler, I could be wrong, but I think that the compiler said this doesn't fit in something or whatever.And I'm like, wait a second. And I, and I dug into it and found it. So I think that a lot of the work that people discussed this at rustconf about making the compiler, just kind of guide you through things and anticipate it is quite impressive. And I'm very, very happy with that. 
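Editor's sketch of the static constraint point: a minimal Rust example, assuming only the standard library, with a hypothetical function named `describe`. It illustrates one common reading of a `T: 'static` bound: the type may not hold borrowed data that could expire, but the value itself can be created at runtime and dropped right away. The rejected case is left commented out so the example still compiles.

```rust
use std::fmt::Debug;

// Hypothetical function with a 'static bound on its type parameter.
// The bound restricts what T may borrow, not how long the value lives.
fn describe<T: Debug + 'static>(value: T) -> String {
    format!("{:?}", value)
    // `value` is dropped here, long before the program ends.
}

fn main() {
    // An owned String built at runtime satisfies the bound.
    let owned = String::from("heap allocated");
    println!("{}", describe(owned));

    // A string literal is a &'static str, so it also satisfies the bound.
    println!("{}", describe("literal"));

    // A plain integer owns all of its data, so it is fine too.
    println!("{}", describe(42));

    // A reference to a local value does NOT satisfy the bound,
    // because the borrow cannot outlive `local`:
    // let local = String::from("short lived");
    // describe(&local); // rejected by the compiler
}
```

Nothing here is stored in a static variable or kept alive for the whole program, which is the misunderstanding Taylor describes having had at first.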
So maybe that is the intermediate way that we eventually get there.

But I'm just saying that I think there's some tooling there, but when people get in, that learning curve is just a little bit, I like to describe it as logarithmic, right? You have that initial punch over the top, and then once you understand that, you can be quite capable of producing things quickly.

Jeremy: [00:49:17] You were talking about some of the issues you ran into or roadblocks where you needed to get more intermediate help or talk to people more experienced. I wonder, when you first started, how did you first start learning Rust? Did you go through the Rust programming book? Did you start making little small projects? I'm curious how you approached that.

Taylor: [00:49:42] Yeah, I did a combination of both. I tried to do some of the, like, Rust by Example, the Rust book, just some of the basics. And then I tried reimplementing some parts of different projects that I'd had in the past or things I just wanted to try.

And then I went on with trying to actually implement in a real project, which was Krustlet. And if you actually look at the Krustlet code, you can see where, like, oh, they must have been new there, and we've been going through and cleaning that up as we go along. But having that real project and something I could go towards, that's how I learn best.

It's just having an actual goal of either reimplementing something or finding an actual project to go dig into. And that's how I work, but it's a little bit different for every person.

Jeremy: [00:50:33] Was there a moment, I guess, where you felt like you were really struggling and then once you passed this point, Rust clicked for you? Or did it feel pretty straightforward as you were going through the process?

Taylor: [00:50:47] It was a little bit gradual,
more than like a specific moment where I was like, Oh, everything clicked. I do have those moments. Like I mentioned before, like when I understood what, like a static constraint on a trait means, like that was like, Oh, like mind blown. I finally get it. but it was more of a once I started like actually doing like more complex traits or trait bounds.I think what I finally felt I was getting it was when I could do something like that. Where T equals this plus or T this plus this and, N is this plus this, like with the like long constraints at the end of a function and like understood what it was doing and why I did it that way. And I think also a combination of, of implementing some of the traits, like as ref, as mut ref those kinds of things that I could pull out or like convert, or have a wrapper type.And I'm like, okay, I'm finally getting how all this glues together. that's when I that's, when I noticed, I think, okay, like I can, I can do this now. Like I can put together some, some cool things. but yeah, it comes, each thing comes with its own accomplishments. Like I had mentioned before, we had the state machine that we've just finished and we're cleaning up right now.And that, that state machine thing was like, okay, like we finally got something that worked. I think it's recently where I felt like, okay, I feel like I can actually be a good contributor to the community and maybe even start mentoring others properly because I have the knowledge to do it because now, now we create something new that as far as we can tell, like, people haven't done something like this in rust before, outside of just toy things.And so like, we're, that's why I'm excited. I wish we had had like that today. So I could say, Oh, here's this blog post, but I'm really excited for that, because we're going to talk about like all the work we built on from people in the community who had posted about it and all these things in it. Okay. 
We managed to get something that works.It's still like has rough edges. It still has things, but we've got something that works well for us. And so I, I that's, that's been fairly recent. And so I think there's just moments where like, I keep understanding more, but for me it was more of a gradual turning of like, Oh, and then all of a sudden I realized like, Oh no, I think I've gotten to the point where I know it, it wasn't like a click. Like I know what it is. Oh, I just realized I actually know what I'm doing now.Jeremy: [00:52:56] Yeah. Yeah, no, that, that makes sense. You've been mentioning how Krustlet, has some rough edges.And I know on the project page, it mentions how it's highly experimental. What do you think are the, the big parts that are, that are currently missing for somebody who would want to go in and actually host their application using Krustlet?Taylor: [00:53:16] well, one of them is completely outside of working ends, which is networking, in Waze. we're, we're trying to work with the community to do that. And we're going to see if there's a way we can only solidify and jump in and work on that. but we're getting close. So, this is no one can hold us to this, but we're hoping to get towards a 1.0 Release towards the end of the year, beginning of next year.and so like around the holiday season, things will slow down, whatever. So that's why we don't know it could be January, February, and that's what we're hoping for. And really the big things that we have are, we have basic volume support. but we don't have cloud volumes support. So we're going to be looking in a way of how we can make sure every provider can, can use this.We're trying to solidify the API at the same time with that, because we have people who are writing other providers and a provider is just something that is an implementation of a runtime. 
And so we've written ours for WASM, but we have another person who's a core maintainer, his name's Kevin. He's been recently made a core maintainer of the project as well, and he's working on one for containers. So he's just moving the container implementation stuff over to Rust because of all the security benefits and things. And so we need volume support for all the providers, and then we're going to figure out a way we can abstract the networking, probably using the same interfaces that Kubernetes has already defined, so that we can have networking implementations more readily available and connected into these things from the rest of the Kubernetes world.

And then after that, we need to have real demos. Like, we have demos, but I want, like, here's a real application, insofar as you can make it real, and take that and put together some bootstrapping things so it's easier for people to set it up. We want to make it as one-click as possible to set it up.

And so those are kind of the things we're looking at, and where people can look for rough edges. But, for example, if you want it to trigger data pipeline processing, you can use just wascc, or you can glue together wascc and a WASI provider, which is the wasmtime one. You can glue both of those together or have two of them running, and you can trigger a data chain using an HTTP call and then process the data using one.

You can do that right now. We had an intern on our team over the summer and she did some work on Raspberry Pis, using a Raspberry Pi Kubernetes cluster and then using Krustlet to read soil sensors. You can do some fun things with it. It's just not all the way there. And that's partially because this is bleeding edge, and that's why we put the big warning sign on the README. Like, please don't run this in production.
Like you're you're this is so new. Not just the project itself, but also the technology around it. And so that's what those are kind of like the steps we have. So, at this point I would say, I maybe would remove the highly from Krustlet's description Hm I, if I was really wanting to, because now it's just, this is an experimental project.That's kind of the goal for the future here, but we're not that far out from having something more it's a 1.0 where people can actually people can start using it in a real way and maybe not perfectly, but in a real way.Jeremy: [00:56:27] Last year, Solomon, the CTO of Docker, he had tweeted that if WASI and WASM had existed in 2008, then they wouldn't have needed to create Docker. And I'm wondering from your perspective, thinking about the future of, WASM and Krustlet, do you think that.Running applications in WASM could become the default for, for server side applications in the future.Taylor: [00:56:59] It's a possibility. I wouldn't peg it entirely for sure. I, I think it will be a mix. People are just like getting on board with Kubernetes stuff and containers, which sometimes like, like I said, I think some people go way too. Far into it and don't think they just like, Oh, they hear Kubernetes and buzzword and want to do it.But people are still just barely getting to that thing. So it'll be a long time if it does become the default. But I think it has the ability to, reach a very specific audience right now and have that grow. a lot of these constrained environments, like edge computing, you can't run Docker on there it's too much, too much overhead.They just can't handle it. But you could run, WASM modules. it also allows you to pack things in more tightly because if everything's a small process, that's just this tiny little thing and tiny little binaries. You can run that with. 
we can run a lot more than you could with containers right now.So I wouldn't say that it's going to, it's not a container killer, nor is that our current goal. Like we didn't, we didn't think that, like, we don't want to disparage that other technology. That's, that's something we still use a lot and effectively. but I, I do think that there is going to be some takeover of that space, at least in a small measure with all this stuff from wasm and WASI.Because of its just portability and the ideal of we'd be closer to a write, write once compile, once and run anywhere kind of situation, people say it's a pipe dream. I think we'll never get completely there. Even with WASM, there's still constraints. There's still things that will be in place, but this makes it easier.And having, if, if every language gets to the point where you can do it with rust, where you just say, here's my Wasi target and build it. That to me sounds like a very powerful way of doing it and not just for server side. I think that can reinvent a lot of things with normal applications as well. Just because of how portable they are and how then, applications could be tied to you instead of just like being tied to a computer or whatever it might be.So there's some really interesting ideas here. It's just. There's those, those ones are a little bit further out, but I do think even in the short term, we'll start seeing some good applications where WASI will be WASI, compatible WASM, binaries will be a better choice than using a container.Jeremy: [00:59:17] and it sounds like maybe in the short term you were talking about edge computing and that might be something like where you have a CDN, running application code. At their, their edge nodes, something like Cloudflare's workers or Fastly has an equivalent. are you thinking that might be where, where these things start?Taylor: [00:59:39] I think that's where it's already started in one sense. 
And that's one of the reasons we chose Kubernetes is we think there's more beyond Kubernetes and server side on my team. That's that's our belief that we believe there's more to that, but everybody is getting into trying to do Kubernetes and have this Kubernetes has become an API layer that people understand that a lot of people use and enabling WASM through that API gives people a reason.We want people to start using this and say, Hey, like, why doesn't my insert language of choice? Have the support for WASM. I like WASI binaries. Can we please get that? And then as people do that, we start getting more motion around it.Jeremy: [01:00:19] I know when. I talk to people about WebAssembly a, sometimes what I'll hear is they'll say, well, I'm happy writing JavaScript in the browser. Why do I care about WebAssembly right. And so, like you say, if there are more use cases for running other than, just in the browser, then that might inspire other languages like Python or Ruby, or who knows what other languages too, to focus on getting them to work on WebAssembly. So I think that's, that's pretty exciting. so I think that's a good, good place to start wrapping up. if people want to learn more about Krustlet or, about what you're working on, where should they head?Taylor: [01:01:00] I would definitely start with, the actual project site. So that's deislabs/krustlet on, github. there is also some, some posts that we have in various places. I wish there was like one amalgamation of all of this, but, you can look at the, it's deislabs.io/posts. I believe.Let me just double check that.Jeremy: [01:01:23] and we can probably get the krustlets specific, posts and then put those in the show notes as well.Taylor: [01:01:28] Yeah, and I can send those, but yeah, it's deislabs.io/posts. that's posts for all of our projects, but you'll see some, at least three blog posts there about Krustlet. 
One was around our stack and heap allocation problems that we had and some lessons learned there. So those are some other places you can go.

We have some other posts that hopefully we can send for the show notes that give an overview of the reasoning behind this. If you're more interested in this from a high level, like a business perspective, we have a post for that. We have some other things about the security benefits we got from it that we've posted around.

So there's lots of sources of information there, but if you really want to get started and look at the project and install it and try it out, go ahead and check out deislabs/krustlet on GitHub. That one will have the docs and everything you need to get started.

Jeremy: [01:02:19] Very cool. Taylor, thank you so much for talking to me today. It's been interesting learning about Krustlet and WASM, Kubernetes, and all of that. And I think it's going to be very interesting to see where it goes in the future.

Taylor: [01:02:33] Well, thank you very much for having me. And hopefully everyone finds at least some of this interesting and useful.

Sep 9, 2020 • 1h 9min

Building the Lucky Web Framework in Crystal with Paul Smith

Paul Smith is a Software Engineer at GitHub and the creator of the Lucky web framework. He previously worked at Heroku and thoughtbot and has experience building applications using Rails and Phoenix. He's also the creator of the Bamboo e-mail package and the co-creator of the ExMachina test data package for Elixir.

We discuss:
- The tradeoffs of object oriented and functional programming
- How a lack of compile time guarantees slows down Ruby and Elixir development
- Creating conversational error messages
- Ways fast languages can change how you write applications
- Writing templates with Crystal instead of HTML
- Choosing what to include in a web framework
- The Crystal community and ecosystem

Related Links:
- @paulcsmith
- The Crystal programming language
- The Lucky web framework
- HTML2Lucky
- AlpineJS

This episode originally aired on Software Engineering Radio.

Transcript:

Jeremy: Today I'm talking with Paul Smith. Paul is the creator of the Lucky web framework and he currently works at GitHub. Today, we're going to talk about the Crystal programming language and the Lucky web framework. Paul, welcome to Software Engineering Radio.

Paul: Thank you so much. Happy to be here.

Jeremy: There are a lot of languages for software developers to choose from. What excited you about Crystal?

Paul: Yeah, that's really interesting because when I first saw Crystal, I actually was not interested at all. It basically looked like Ruby to me. And so I just thought, okay, so it's a faster Ruby. And typically if I want to learn a new language, I want something that feels really different, that pushes the boundaries on things.

I started getting more interested in compile time guarantees. I worked at thoughtbot previous to GitHub and previous to Heroku, and people were starting to get really into typed languages.
Um, some people were starting to get into Haskell, which is like, you know, the, the big one that, I guess is probably one of the more type safe, but also hard to use languages.Um, but also Elm, which has a good focus on developer happiness and productivity and explaining what's going on. And as they were talking about, how they were writing fewer tests and it was easier to refactor, uh, it started becoming clear to me that that's something I want. Um, one of the things somebody said was, if the computer can check the code for you let the computer do that rather than you, or rather than a test. so I started to get really interested in that. I was also interested in elixir, um, which is another fantastic language. I did a lot of work with elixir. I built a library called bamboo, which is an email library. And another called ex machina, which is what a lot of people use for creating test data. Um, so I was really into it for awhile.And at first I'm like, wow, I love functional. And then I realized like. I can do a lot of, like a lot of the stuff I like about this I can do with objects. I just need to rethink things so that it uses objects rather than whatever random DSLJeremy: Cause I mean, when you think about functions, right? Like you've got this big bucket of functions and you got to pass in all the parameters right? Whereas, you know, in a lot of cases, I feel like if you have those instance variables available in the object, then the actual functions can be a lot simpler in some ways.Yeah.Paul: Totally. That's like a huge focus and making the object small so that it. It doesn't have too much, but that's how I began to feel with elixir is that I'm like, I just have 50 args and most of them I don't care about. Like I want to look at what's important to this method, to this method.It's, you know, this argument, but with functions you're like, which things important. Is the first thing? Probably not. That's probably just the thing I'm passing everywhere. 
And so I liked that ability to kind of focus in and know like, this object has these two instance variables everywhere.Jeremy: Yeah. It's kind of interesting to get your perspective because, it seemed like you were pretty deep into elixir if you had created, bamboo and ex machina and stuff like that, so it's kind ofPaul: Yeah. I was like way gung ho and, and then I started missing objects. And luckily with crystal and ruby, you still get a lot of the functional stuff. Like you can pass blocks around. Um, that's functions. You can use functions. But it's not the other way in Elixir, you can't use objects. It just doesn't exist.And then the type safety. I'm just like, I still run into so many errors and it was so frustrating. I don't want to do that.The main benefit I got out of elixir compared to rails, um, which is what I had been using and still use a lot of, was speed. That was really big. Um, in terms of bugs caught about the same, mostly because it's still for the most part dynamically typed language with very few compile time guarantees. Um, so I'd still get the nil errors. I'd still mess up calls to different functions and things like that. And so that's where I ran into crystal. It has the nice syntax. I like from elixir and Ruby. It's also very, very fast. Faster than go in some benchmarks.So it's quick. Plenty fast for what I need. And it has those compile time guarantees, like checking for nils. That's a huge one. and it also makes the type system very friendly. So it does a lot of type inference. And very powerful macros so you can reduce some of the boiler plate.And so that's when I kind of started getting into crystal was seeing Elixir I still got a lot of these bugs that I was running into with rails, but I liked the speed but I don't want to use Haskell and Elm doesn't exist on the backend. so I started looking at crystal.Jeremy: And so it sort of sounds like there's this spectrum, right? 
You have Ruby and you have, elixir, where you don't necessarily specify your types so the compiler can't help you as much. And then you've got Haskell, which is very strict, right? You have a compiler that helps you a lot. Um, and then there's kind of languages inbetween Like. For example, Java and C and things like that. They've been around for quite some time. how does crystal sort of compare to languages like those ? Paul: Yeah, that's a great question cause I did look at some of those other ones. TypeScript for examples is huge. Kotlin was another one that I had looked at because it's Java but better basically. That's the way it's pitched. And so far everyone that's used it has basically said that. And also looking at rust, what it came down to was how powerful was the type system. So crystal has union types, which can be extremely helpful, um, and it catches nil. Java does not have a good way to do that. Um, Kotlin does. But also boiler plate and the macro system crystal's is extremely powerful. Elixir also has a very powerful macro system.But crystal's is type safe, which is even more fantastic. So basically what that let me do with lucky, it was build even more powerful type safe programs. And we can kind of get into that once we, we talk about lucky and how that was designed. Um, but basically with these other languages, a lot of what we do in lucky just simply wouldn't be possible or wouldn't be possible without a significant amount of work and duplication.Jeremy: You covered a few things there. One of the things was, macros, what are are macros? Paul: Yeah. This is like a confusing thing. It took me a while to, to get, um, what it is. But, uh, in Ruby, for example, they have ways of, of metaprogramming. That are not done at compile time for most compile time languages, compiled languages, I should say. 
You need macros to de-duplicate thing, and basically what a macro does is it generates code for you.The way I think about it is basically you've got a method or a macro, but it looks like a method. It has code inside of it. And it's like you're copy pasting, whatever's inside of that macro into wherever you called it from. So in other words, rails has a, has many, like has many users, has many tasks that's generating a ton of code for you.So that's how Ruby does it. Um, and crystal has many would be a macro and it would literally generate a ton of code. And copy paste that into wherever you called it. Um, so it's just a way to reduce boilerplate.Jeremy: So in the case of dynamic languages, like Ruby, when you talk about Metaprogramming, that's having I guess, a function that is generating code at runtime, right? And the macro is sort of doing something similar except it's generating that code at compile time. Is that kind of the distinction? Paul: That's the way I look at it. there are people much smarter than me that probably have a more specific answer about what the differences are, but in my mind and in practical usage, that's what it comes down to in my mind.Jeremy: Let's say there's a problem in that code, what do you get shown in the debugger? Paul: Debugging macros is definitely harder than debugging your regular code for that exact reason. it is generating a code. So what crystal does, uh, there's different ways of doing this, but I like Crystal's approach. It'll show you the final result of the code and it'll point to the line in the generated code that caused the issue and tell you which macro generated it. Now, it's still not ideal because that code isn't code you wrote, it's code that the macro generated, but it does allow you to see what the macro generated and why it might be an issue.Part of that can be solved by writing error messages and error handling as part of the macro. 
So, in other words, if you're expecting a string literal, you can have a check at the top of the macro that makes sure it is a string literal. I wouldn't use macros by default, but they're great for, I think, a framework where you have a lot of boilerplate-y things that you're literally typing in every single model or every single controller, and that people kind of get used to. It's well tested. It has nice error messages. In my own personal code, though, I pretty much never use macros. They're only in the libraries that I write.

Jeremy: Another thing you mentioned is how Crystal helps you detect nils or nulls. How does the language do that?

Paul: It actually uses union types for that. Some languages that have this, they'll have an optional type, which is basically a wrapper around whatever the real type is, like an optional string or optional int, and you have to unwrap it. The way Crystal does it is you would say String or Nil, and there's a little bit of syntactic sugar, so you can just say String with a question mark at the end. But that gets expanded to String or a Nil type. So then within that method, the compiler knows that this could be a String or could be a Nil, and there's a little bit of sugar there where, if you write an if on whatever variable you have, the compiler knows that inside the if it is not nil, and in the else it is. So there's a little bit of sugar there as well. But that's basically how they handle it. And there are ways to force the compiler, to just say, hey, this thing is not nil: you can call not_nil! on it. I would avoid that, because maybe the compiler's right and it really is nil. Or maybe you change the method later and then it can become nil, and you're going to get a runtime error there. But it does have those escape hatches, because sometimes you just need the quick and dirty, and you can if you need to.

Jeremy: As long as you don't tell the compiler that, then you will actually have a compiler error.
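For contrast, here is a small Ruby sketch of the same situation. Ruby has no String-or-Nil type that is checked before the program runs, so the unchecked version only fails when it is actually executed with nil. User, print_email_unchecked, and email_or_default are made-up names for illustration; the if/else in the second method is the shape of handling that Crystal's compiler would insist on up front.

```ruby
User = Struct.new(:email)

# Nothing stops a caller from passing nil here, so the mistake
# only shows up when this line actually runs.
def print_email_unchecked(user)
  puts user.email
end

# Handling both cases explicitly: inside the if, user is present;
# in the else branch, it is nil.
def email_or_default(user)
  if user
    user.email
  else
    "(no user)"
  end
end

puts email_or_default(User.new("paul@example.com"))
puts email_or_default(nil)

begin
  print_email_unchecked(nil)
rescue NoMethodError => e
  # In Ruby this is a runtime failure rather than a compile error.
  puts "runtime failure: #{e.class}"
end

# Ruby's safe navigation operator returns nil instead of raising.
missing_user = nil
maybe_email = missing_user&.email
puts maybe_email.inspect
```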
If you have a method that takes in, let's say, some type of object or a nil, and then you don't account for the fact that it could be nil, then the compiler actually won't let you compile. Is that correct?

Paul: That is correct. So for example, if you just had a method that's like print email, and it accepts a user or nil (now, I'm not saying I would do that, but let's say that it does), and you just tried within that method to do user.email to print the user's email, it's going to fail and tell you that Nil does not have the method email. And so you need to handle that. And then, yeah, you're forced to either do an if or, for example, you can use try, which is basically a method that says: call a method on this object unless it's nil, and if it's nil, just return nil. But yes, it kind of forces you to do that.

Jeremy: And in Crystal, how do you handle errors? Because a lot of different languages, they'll have things like exceptions, or they may have result types. What's sort of the main way in Crystal?

Paul: I'd group it into two types of errors. You have runtime exceptions still, because things do break. Not everything is in a perfect world inside your type system. Databases go down, Redis falls over, or whatever. So you still have runtime exceptions, and then you have the compile-time errors, which we kind of just talked about. But in terms of how those runtime exceptions are handled, I don't want to say exactly the same as Ruby, because there probably are some subtle differences, but it's extremely similar to Ruby in that you're not passing around errors. It's not like Go, where you are explicitly handling errors at every step. You raise it, and you can rescue that error, kind of like a try/catch in other languages, and you can also just let it bubble up and rescue at a higher level, which I personally prefer.
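Since Paul describes Crystal's runtime exceptions as extremely similar to Ruby's, the raise, rescue, and bubble-up pattern can be sketched in Ruby. The names PaymentDeclined, charge, checkout, and handle_request are invented for this example, and the top-level handler only stands in for a real 500 response with logging and error reporting.

```ruby
class PaymentDeclined < StandardError; end

def charge(amount)
  raise PaymentDeclined, "card declined" if amount > 100
  raise "payment gateway unreachable" if amount.zero?

  "charged #{amount}"
end

# Only the one error this method cares about is rescued here;
# anything else bubbles up to the caller untouched.
def checkout(amount)
  charge(amount)
rescue PaymentDeclined => e
  "please try another card (#{e.message})"
end

# A single higher-level handler for whatever was not handled below.
def handle_request(amount)
  checkout(amount)
rescue StandardError => e
  "500: #{e.message}"
end

puts handle_request(50)  # prints charged 50
puts handle_request(500) # prints please try another card (card declined)
puts handle_request(0)   # prints 500: payment gateway unreachable
```

The reader of checkout sees exactly one rescued error, which signals that it is the one worth special treatment; everything else is left to the catch-all.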
Not every error is something that I care about, and kind of forcing me to handle every single error everywhere means that it is harder as a reader of the code to tell which errors I should care about, because they're all treated as equal. So I like that in Crystal I can say, this particular error in this particular method, I want to handle in a special way. And somewhere up above in the stack, I can just say, anything else: just print a 500, log it, send it to Sentry.

Jeremy: Yeah, so it's very similar to, like you said, Ruby, or any other language that primarily relies on exceptions. I think Java, for example, probably falls into the same category.

Paul: Probably. I haven't used it in quite some time, but I imagine it would be similar.

Jeremy: You had mentioned that Crystal is pretty fast compared to other languages. What are the big benefits you've gotten from that raw speed?

Paul: The biggest benefit, I would say, is not having to worry so much about rendering times. In Rails, for example, you can spend a ton of time in the view. Even though everyone says databases are slow, they're not that slow. In something like Rails, Active Record takes a huge amount of time to instantiate every single record. So how does this play out in real life? You could, for example, in Lucky, if you wanted to, load a thousand records and print them on the page, and probably do that in a couple hundred milliseconds maybe, which is a totally reasonable response time. The same thing in Rails would be many seconds, which is not reasonable in my opinion. And this can be really helpful, partly because it just means your apps are faster and people are getting the response quickly, but also because you have a lot more flexibility.
I've built internal tools where they want to have the ability to search all of the inventory or products or whatever else, and they want to have a select all, or be able to select everything. And in Rails, you can't just render all 1,000 products, because it basically falls over, and you can try and cache stuff, but then that gets complicated. So you kind of have to paginate. But when you paginate, that makes it hard to select things across multiple pages. Then you need some kind of JavaScript to remember which ones you selected across pages, and it just balloons the complexity, right? If you know, hey, we only have 800 or 900 products, we're not going to suddenly have 20,000, in Lucky you just render them all, put them all on the same page, give them all checkboxes, and it's in the user's hands in 200 milliseconds and you're done. You just removed most of that complexity. So those are some of the ways that that speed is playing out. And I think one key difference there is some people think speed is just about scalability: how many people can be using this? The speed improvements I care about are the ones where, even if you have one request per day, I want that request to be insanely fast. And so that's kind of what you're getting with Lucky and Crystal.

Jeremy: When you talk about web applications, with Lucky being a web framework, a lot of people point out that a lot of the work being done is IO, right? It's talking to the database, it's making network calls. But I guess you're saying that for rendering that template, those are things where having a fast language really does make a big difference.

Paul: It does. Yeah. I think the whole database IO thing, a lot of times that's what people say when they're working with a slow language. If you have a fast one, it's not as big of a deal. This was the same with Phoenix and Elixir. I loved how quickly it could render HTML.
That was huge.

Jeremy: And like you said, that opens up options in terms of not having to rely on caching or pagination or things like that.

Paul: Yeah. This is huge. I mean, an example from work: we just announced GitHub Discussions, and I'm on that team. One of the big things we were trying to get working was performance of the discussions show page. You could have hundreds of comments on that page. And we were finding that most of the time taken was actually spent rendering the views and calling methods on the different objects to render things differently. It was in the seconds. And we can't cache those reliably, because there are so many different ways to show that data. If you're a moderator, you get certain buttons. If you're an unverified user, like someone who just signed up, you see a different thing. If you're not signed in, you see a different thing. And so you can't reliably cache those. We had a lot of cool techniques to kind of get that down, but this is something that, if this were written in Lucky, just would not have been an issue.

Jeremy: And GitHub in particular is written in Ruby, is that correct?

Paul: It is. Yeah. It's using Ruby on Rails, and I'm not trying to knock Rails. I really love Rails. I mean, I've been using it for 12 years. I like Ruby. But hey, if there's something that could be even better, I'm open to that.

Jeremy: For sure. You have used Rails for 12 years. How would you say that your productivity compares in Ruby versus in Crystal?

Paul: I think that's tricky. It's kind of better and worse. What I mean by that is, I think in Crystal I'm more productive. In Crystal, you do have compile times, and we can talk about that.
They're not the fastest, they're not the slowest, but I do find that I can write more code and then compile once, and it kind of just tells me where the problems are, and I have a lot more confidence, and I spend a lot less time banging my head on, like, why isn't this thing working? And it's because I passed the wrong type somewhere. However, Ruby has a massive ecosystem, so there are things that exist in Ruby that I would have to rewrite in Crystal. And so that, for sure, no matter how productive I am in Crystal, is not as productive as requiring the gem and then just using it. So the hope with Lucky, though, is that we're building up enough things that you don't have to be rewriting everything. And the community has also really stepped up and is writing a number of libraries that are super helpful for web development. For example, somebody just wrote webdrivers.cr, which makes it so that it can automatically install the version of ChromeDriver that matches the version of Chrome that you have installed, so you don't have to manage that at all. That's something that was in Ruby for a while, and will be in Lucky, probably in the next release. So yeah, I think it's better. It's one of those things that will get better with time.

Jeremy: So in terms of the actual language productivity, Crystal sounds like basically a net positive, but it's more in the community aspect and how many libraries are available. That's where a lot more time is taken.

Paul: I think so. And then just the initial ramping up. It is a new language, and so there aren't as many Stack Overflow questions and answers, and there aren't as many tutorials. So there's definitely some things there. But like I said, those are things we're working on, especially for 1.0 of Lucky: trying to make sure we have really good guides and really good error messages. We tried to borrow a little bit from Elm.
Not specific error messages, but just the idea that an error message should raise something human readable and understandable and, if possible, help guide them in the right direction of what they probably want to do, or at least point them to documentation to make it easier. So we're trying to help with that as much as we can.

Jeremy: Next, I kind of want to move more into your experience building Lucky. You were a Rails developer for many years. Are there any specific major pain points, I guess, in Rails or in your previous web development experience that you wanted to address with Lucky?

Paul: Yeah. There were. Some more specific than others, and some easier to solve, in the sense that the solution is like, it works or it doesn't, and others that are a little bit more abstract. So I'll talk about some of the specific things. I've often said that I'm into type safety. I don't think that is quite true, and I think, especially if you haven't used Lucky, it just doesn't click what that means or why it matters. Because you just think, oh, so, you know, it'll tell me if I pass an integer instead of a string. Like, who cares? I'm not seeing those kinds of errors. What I'm most interested in is compile-time guarantees, whether that's with a type or some other mechanism. And that's there not just to prevent bugs, but to help you as a developer spot problems right away and give you a nice error so you know what to do about it. So, for example, one of the things that I've seen in basically every framework I've ever used, regardless of whether it is type safe or not, is that you need to use an HTTP method, a verb, and a path. So, for example, if you want to delete a user, you would have /users/1, with 1 being the ID. The tricky part is you have to have the HTTP method DELETE for it to do the delete action.
But sometimes you forget that, you use a regular link, and you wonder why the heck it just keeps showing you this thing instead of deleting it. Or the particularly insidious one is when you have an update and a create. One uses POST, one uses PUT. If you have an update form and you forget to set the method to PUT, you get all kinds of routing errors, because it says, hey, this doesn't exist. And you go, well, why? Why doesn't this exist? I can see it right here. I've got the route, I've got everything. Oh, it's because I forgot to say the HTTP method is a PUT. And it just wastes time. So that's one of those things where we wanted a compile-time guarantee in Lucky. And so I don't want to go too in depth here, but basically what we did was we made every controller into a single class that handles the routing and also the response.

Jeremy: If I understand correctly, when you have a page and you want to link to a specific user on that page, then you would use this function, link to, and you would pass in the class that corresponds to showing a user, and then you would pass parameters into that function, like, for example, the ID of the user. And if you didn't do that, then you would have an error at compile time.

Paul: Correct.

Jeremy: You wouldn't need to start the website and then go to the page and have it basically explode, which I guess is typically what you would expect from most web frameworks.

Paul: Or what's worse, it wouldn't explode. It would just generate the wrong link, and you would have to remember to click that link or write an automated test that clicks that link. And so it's really easy for bugs to sneak in, and this just completely prevents that class of bug.
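One way to picture an action class that owns both its HTTP verb and its path is the Ruby sketch below. This is not Lucky's API: UsersDelete, route, and link_to are hypothetical names chosen for the example. Ruby also only reports the missing id when the line runs, whereas Paul describes Lucky rejecting it at compile time.

```ruby
# Hypothetical action class; it keeps the verb and the path together
# so a link can never use one without the other.
class UsersDelete
  HTTP_METHOD = "DELETE"

  # Required keyword argument: leaving out id raises an error
  # instead of silently producing a path like "/users/".
  def self.route(id:)
    { method: HTTP_METHOD, path: "/users/#{id}" }
  end
end

# The link is built from the action's route, so the caller never
# types the verb or the path by hand.
def link_to(text, route)
  %(<a href="#{route[:path]}" data-method="#{route[:method]}">#{text}</a>)
end

puts link_to("Delete", UsersDelete.route(id: 1))

begin
  UsersDelete.route # forgot the id
rescue ArgumentError => e
  puts "caught: #{e.class}"
end
```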
It also just makes life easier, because if you forget a parameter while you're developing, from the start, instead of just generating something with a nil ID, it's going to say, hey, you forgot this. It just saves a lot of debugging time, and I think it's also more intuitive. If you've ever used Rails helpers or Phoenix helpers, any of these, man, the conventions. Like, is it singular, is it plural? Does it have the namespace or not have the namespace? In Lucky, that's gone. You just call the action, the one that you created. You call that exactly as is.

Jeremy: It sounds like this is maybe a little more explicit, I guess?

Paul: Yeah, it's a little more explicit, but I hesitate. I've heard a couple of things in the programming community. One, that Rails started as convention over configuration, which was huge, because you had to learn the convention, but at least once you did, you knew how other Rails projects worked. And then another one I hear is explicit over implicit. I don't buy into either of those in particular, because sometimes implicit is better, sometimes explicit's better. I mean, as a quick example, I don't hear anyone arguing to bring back the old Objective-C, where you had to manually reference and dereference memory. That is technically more explicit, but does anyone want to do that? No. So I don't think it's explicit over implicit. You have to think about it. Everything needs to be judged in its own context. And what I think is even better than convention over configuration is intuitive over conventions, meaning you don't even think about it. There doesn't even need to be a convention, because you're literally just calling the thing that you created like anything else. There's nothing special about that.
It's a class just like any other class, and you call a method on it just like any other method. I think it's tricky, because I think it's also easy to say explicit over implicit and make your code super hard to follow. And it's like, yes, it's more explicit, but also I just wrote 20 lines of code instead of one. And those 20 lines could differ, because I do it differently than the other guy or girl.

Jeremy: Another thing about Lucky that's a little different is that for templating, instead of having somebody write HTML and embedding language code in it, you instead have people write Crystal code. So could you kind of explain why you made that decision and what the benefits are?

Paul: Yeah, sure. So a lot of things with Lucky, actually, I kind of did not want to do, or were definitely not how I started doing things, and it just kind of moved in that direction based on the goals. And I think that's part of what makes Lucky different, is that we don't say, here's how I want to do it. We say, here's what I want to do, and I want it to be easy, simple, and bug free. So what we started with was using templating languages, just like you'd use in almost anything, where you write your HTML and then you interpolate values in it. At the time I wrote Lucky, and this may have changed now, you could not use a method that accepted a function, or a block is what it would be called in Crystal, and have that output correctly in the template. I think it just blew up. I don't remember; this was two years ago, three years ago. The other problem I was having was, it's not just a template. Any bigger size framework also has partials or, you know, fragments or includes or whatever you want to call it. It also has layouts, where you can inject different HTML in different parts of your HTML layout, and those are all things that a person has to learn when they're learning your framework.
What are these methods called for generating a partial, for calling a partial, or for injecting stuff in different layers of the layout? And it's also more stuff that I have to write. And with Lucky, there was already a lot to write. We were building the ORM and the automated test drivers and the router and, like, everything. So I can't afford to just do stuff like everyone else does it if it's not pulling its weight. So eventually I started experimenting with building HTML using classes and regular Crystal methods. Some of the requirements for me when I was building it were that it had to match the structure of HTML and it had to be very easy to refactor, meaning I can pull something out into a new method and it just works. So, easy refactoring. And then I also needed to be able to do layouts with it. The reason for that is Elm also uses code to generate HTML. However, it is not approachable to a newcomer. If, for example, you have a designer and they pull it up in Elm and try and look at what that generates, no way. I mean, I'm a programmer, and I still don't know what it generates without really looking through the Elm. And that's partly because you are generating data objects, so arrays of arrays, or maps, or whatever else. So I didn't want that. It has to be approachable to people, and look and be structured like HTML. And so we were actually able to do that. I don't know if I need to go into huge detail, but basically you can say, hey, I want a div. Inside of that, I want an h1. Underneath that, I want another div. And you're not building arrays and maps and anything else. What that provides is actually a lot of things that I did not think of. One, super easy refactoring. If you have a link in a particular page and you don't want to copy that over and over and over, extract a method, and you call it like any other method. There's nothing to learn. It's just a method like anything else. It can accept arguments just like anything else.
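The structure Paul describes (a div, an h1 inside it, another div underneath, and helpers extracted as plain methods) can be sketched in Ruby, whose do/end blocks Crystal closely resembles. HtmlPage, WelcomePage, and greeting are hypothetical names, and this tiny builder is a toy written for illustration, not Lucky's rendering code.

```ruby
# Toy HTML builder: one generated method per tag, nested with blocks.
class HtmlPage
  TAGS = %i[div h1 h2 span].freeze

  def initialize
    @html = +""
  end

  TAGS.each do |tag|
    define_method(tag) do |text = nil, &block|
      @html << "<#{tag}>"
      @html << text if text
      block.call if block # nested tags are emitted inside the block
      @html << "</#{tag}>" # the closing tag can never be forgotten
    end
  end

  def render
    content
    @html
  end
end

class WelcomePage < HtmlPage
  def initialize(name)
    super()
    @name = name
  end

  def content
    div do
      h1 "Welcome"
      div { greeting }
    end
  end

  private

  # An extracted piece of the page is just another method.
  def greeting
    span "Hello, #{@name}"
  end
end

puts WelcomePage.new("Paul").render
# prints <div><h1>Welcome</h1><div><span>Hello, Paul</span></div></div>
```

Because the nesting is ordinary block syntax, a missing end is a syntax error rather than a page that quietly renders wrong.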
Your conditionals work. You can extract that into a component, which is basically another class, and it tells you explicitly, here's what I need to run, and it renders the thing. You always have the correct closing tag. I have been bitten so many times by shifting stuff around and forgetting a closing tag, and my whole page looks wonky, and I have to go through layers of indentation. That just doesn't happen. If you forget an end (you would have a do/end when you're creating these blocks), it blows up. It's like, hey, you're missing one. And the coolest part is you just add an end in there, you run the Crystal formatter, and it re-indents everything perfectly. And then on top of that, if that wasn't enough (I just loved how easy it was to refactor and use), you don't have to split up your code from your template. Like in Rails, you would have a helper. So you've got, here's your template, but then you might have a helper in a totally separate file. If you've got something that pertains to just that page, you can just extract a method. It's right there. But this also made it so we can do layouts without any special work. Your layout is basically a class. You would say, here's my class with the head. It renders the head, renders the HTML body, or whatever. And then it calls a content method or a sidebar method or whatever else. And your page, say you wanted to render a list of users, inherits from that class and implements a content method or a sidebar method. And so when that's rendered out, it just calls those methods. So we got all of that for free. If you look at our view rendering code, it's 50 lines, because basically we use a macro and give it a list of tags, like, you know, paragraph, h1, h2, whatever, and generate a bunch of methods. And that's basically it. So from an implementation perspective, it's extremely simple. Plus, you get all these niceties: refactoring is super easy.
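The layout idea, a parent class that renders the shell and calls content or sidebar methods that each page supplies, maps onto plain inheritance. Here is a minimal Ruby sketch under that assumption; MainLayout, UsersIndexPage, and page_title are invented names, and the string-building is deliberately simplistic.

```ruby
# Hypothetical layout: renders the outer shell and calls methods
# that individual pages override.
class MainLayout
  def render
    "<html><head><title>#{page_title}</title></head>" \
      "<body><aside>#{sidebar}</aside><main>#{content}</main></body></html>"
  end

  def page_title
    "My App"
  end

  # Optional region: pages may override it, the default is empty.
  def sidebar
    ""
  end

  # Required region: every page has to provide it.
  def content
    raise NotImplementedError, "#{self.class} must implement content"
  end
end

class UsersIndexPage < MainLayout
  # The constructor states what the page needs in order to render.
  def initialize(users)
    @users = users
  end

  def content
    items = @users.map { |name| "<li>#{name}</li>" }.join
    "<ul>#{items}</ul>"
  end

  def sidebar
    "#{@users.size} users"
  end
end

puts UsersIndexPage.new(%w[Ada Grace]).render
```

There is no separate layout mechanism to learn here: injecting content into a region is just overriding a method.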
It's super easy to tell what a page needs to render. At the top of the page, you just say, you know, I need a user, I need a paginator, I need a current user. So you know what that page needs. You don't get that with a template. And you get all the power of Crystal for rendering layouts however you want. That all basically came for free. So it was kind of a happenstance that templates weren't working, and this has worked out better. A lot of people, when they see this, they're like, what the heck is this? I hate it. And I always just say, just give it a try. Just give it a try for a little bit. So far, one person has said, okay, I don't like it, and you can use templates if you want. We've actually built that in. But everybody else is like, now that I've used it, I love it.

Jeremy: What it sounds like is, in a lot of JavaScript frameworks, for example, like React, there's this concept of components, right? And so you can create what looks like new HTML tags, but really has some other HTML in it. Like, let's say you have a list of businesses, and maybe you have a component that would have all the business details in it. It sounds like in the case of Lucky, you kind of can do the same thing. It's just that your component would be in the form of a Crystal class. And so there isn't any new syntax, and you're not mixing different languages. Like, you're not mixing HTML and JavaScript. Instead, everything is just using Crystal.

Paul: Exactly. You have two options. You can extract a private method, because sometimes it's just a small thing you want to extract, only used by one page. Just do a method. If not, extract a class. And the cool part about all of this is that you don't need to restructure anything.
Meaning you can start with everything in one method, in your content method, and then you can pull out just a little bit into a private method. And then if that's not enough, cool, pull that out into a class. So you're not forced into just pulling out classes all over the place if you don't need one. It really worked out kind of well, because it also makes testing easier. You can pull out a class component that just does one thing, and you can instantiate just that component and test just that HTML. And once again, this is very easy, because it's a class. You call it and run it like any other class. And so that's been a big goal of Lucky, and this also comes down to the whole convention over configuration thing: how do we just make it so there is no convention? It's just intuitive. Like, if you know how to extract and refactor a Crystal class, you know how to extract and refactor stuff for a page in Lucky automatically. And I mean, of course there's still some degree of learning and experimentation, but it's the same paradigms. If you want to include methods in multiple pages, use a module just like any other module. So that was very much a goal. And that's part of other parts of Lucky, for example, querying. In something like Rails, the model is for creating, updating, reading, everything. In Lucky, you do create a model, and we use macros to actually generate other classes for you, but you have a query object that is a class.

Jeremy: What am I passing into my query object? What does that look like?

Paul: Let's say you have a User. By default, it generates a User::BaseQuery. So basically you have this new object namespaced under the model. And by default, the generators generate another file, and basically what that does is it creates a new class called UserQuery that inherits from that User::BaseQuery class. What you would do in your controller action, or anywhere, is say UserQuery.new.
By default, that just gives you a new query that would query everything in the database, unless of course you overrode initialize and did something else. Then it would use that scope. So if you wanted to further filter down, for example, if you wanted the name to be Paul, it would be UserQuery.new.name("Paul"), with Paul as a string. Because Lucky generates methods for every column on the model, with compile-time guarantees. So if you typo that method, it's going to blow up. If you've renamed the column later, it's going to blow up. If you accidentally give it nil, it's going to blow up and tell you to use something else. But that's how you would do it. You say .name is Paul. Or we also have type-specific criteria methods. You can do things like .age.gt(30), gt for greater than. And so you have this very flexible query language that's all completely type safe. So in your scopes, if you wanted to do something like recently published for a post, inside that method you would do something like published_at.gt(1.week.ago). And you can chain that. So you could do PostQuery.new, dot recently published, dot authored by Paul, or whatever. So that's basically how it works. You just have these methods that are chained, that you can build upon in pretty much any way you want.

Jeremy: In a lot of applications now, people use JavaScript frameworks, whether it's React or Vue or Angular. What does integrating with JavaScript libraries and frameworks look like in Lucky?

Paul: I think easier than a lot, in the sense that you can generate a Lucky project with different modes. So when you initialize a project, you can use just the command line with some flags, or the default is to walk you through a wizard, which will say, do you want API only? In which case, you know, it won't even have HTML pages. Or the default, which is a full app. What that does is it generates Webpack config for you.
It sets up your public assets and images so that they can be copied and fingerprinted. And so out of the box it already has a basic Webpack setup for you that handles CSS. It handles most of the ES6 JavaScript-type stuff that people typically like. That's just handled out of the box. If you want to include React or Vue, you would include that just like in any other Webpack project in terms of building it. And it's actually a little simpler. We use Laravel Mix on top of Webpack, which is basically a thin JavaScript layer that calls Webpack underneath the hood. If you want a full single-page app, that's also totally supported. You would basically have just one HTML page that, you know, has the basic HTML and body tags, and within that, mount your app. So whatever that is for your framework: in Vue, it might be just a tag that's like main app. And then in your JS you would initialize that tag with your app. And we have fallback routing so that you can do client-side routing if you want. It's not particularly well documented, which is the biggest problem. Some people are helping with that, because a number of people have done React and Vue. And so hopefully those will be fleshed out a little bit more, but it's totally supported. In the long term, though, we've got plans to make it so you don't even need those types of frameworks quite as much. Since we already have class components and a bunch of other things, I'm working on a way to add type-safe interactivity to HTML. So you're not writing the JavaScript, you're writing Crystal for the most part, and it can interface with JavaScript, and you can use React and Vue inside of it.
But a lot of your simple open/close, anything like that, is going to be handled client side but written with Crystal. And server interactions will be sent over an Ajax request, but will also be type safe when you call the actions and do all the HTML rendering, similar to Livewire for Laravel or LiveView by Phoenix, but with some differences. That's not done yet, but it will be, and I think it's going to be really exciting. I've got a proof of concept up locally, and it's really awesome.

Jeremy: We had a previous episode on LiveView, and I think the possibilities of things like that are really interesting, of being able to not have this sort of separation between your JavaScript front end and your server back end, yet still be able to have the kind of interactivity people expect.

Paul: Yeah, I think it could be cool. And that's also where speed comes into play. When you're doing interactions like that, you don't want to wait. Even a hundred, even 50 milliseconds is noticeable for those types of interactions. And so Phoenix also has a fast, really fast, template language. Basically it gets compiled down to Elixir, and so that helps a lot. I do think there are some big flaws that I've seen in some other implementations. Well, I don't want to say flaws, that sounds a little overly harsh, but things that, personally, are just deal breakers for me. And one of those is, some client-side interactions have to be instantaneous. They just have to be. If I click on my avatar on the top right, I expect the menu that has settings and log out to be instant. If there's any kind of latency in the network and it takes 200 milliseconds, even, that's going to be a weird interaction, and it's going to feel like your app is broken. And of course that's exacerbated by people not in your country. This is another problem.
People are doing these things, deploying servers in their own country. Put a VPN in front of your computer in Australia or even the UK: 400 milliseconds. You just can't do that for a settings menu or for opening a modal. And so there needs to be some way to do those interactions instantaneously. Livewire for Laravel: the same guy that wrote it built Alpine.js, which looks a little bit like Vue, but it doesn't have a virtual DOM. It operates with the DOM that you generate. That's what it uses for client-side interactivity. So you can do the server-side stuff, which, I mean, if latency's there, if you're submitting a comment, look, there's no way around it. You've got to hit the server. But if you're opening and showing something, a menu, a tab, a modal, that's instantaneous and is handled by Alpine. So Lucky is actually going to use that, along with our own server-rendered stuff, to do client-side interactions instantaneously.

Jeremy: So Alpine, it's a JavaScript front-end framework, you said, similar to Vue, without the virtual DOM. And it sounds like what you're planning is to be able to write Crystal code and have that generate Alpine code. Is that right?

Paul: That's correct, because it's mostly inline. And it can't do everything, but most of what I want from client-side interactions are typically super simple things. I want to open and close something. I want to show tabs. And those are things that Alpine's incredibly good at, because you don't need a separate JavaScript file. We can just generate something that says (it uses x as its kind of modifier) x dash click, toggle the thing true or false, toggle open to true or false, and x-if or x-show, and then whether it's open or not.
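To make the generated markup concrete, here is a hedged Ruby sketch of a server-side helper that emits Alpine.js attributes. The helper name toggle_panel and its parameters are invented, and this is not how Lucky does or will do it. The attributes themselves (x-data, x-on:click, x-show) are Alpine's documented directives; note that the click directive is written x-on:click in Alpine rather than the shorthand Paul says aloud.

```ruby
# Hypothetical helper that returns a plain string of HTML with
# Alpine.js attributes; the toggle then runs entirely in the browser.
def toggle_panel(label, body, open: false)
  # A runtime boolean check stands in for the compile-time type
  # check described in the conversation.
  unless [true, false].include?(open)
    raise ArgumentError, "open must be true or false"
  end

  <<~HTML
    <div x-data="{ open: #{open} }">
      <button x-on:click="open = !open">#{label}</button>
      <div x-show="open">#{body}</div>
    </div>
  HTML
end

puts toggle_panel("Settings", "Log out")
```

No request reaches the server when the button is clicked, which is the instant open and close behaviour Paul wants for menus, tabs, and modals.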
Those are things that we can very easily generate on the back end and make type safe because we can say, you know, this has to be a boolean and here's the action.

And all those things are then type safe, but you can still do JavaScript if you want, so you can still use JavaScript functions in there with your Alpine if you need to.

Jeremy: Yeah. That just sounds like the distinction between that and something like LiveView or Livewire. My understanding is that with those solutions you're shipping over basically diffs in your HTML, and that's how it's determining what to change. Whereas you're saying, you may still have some of that, but there's certain interactions where you just want to run JavaScript locally on the person's client, and you should still be able to do that even if you are doing this sort of sending diffs over the wire for other things.

Paul: Yeah. Exactly. Alpine's made for that. The biggest key differentiator from Livewire and LiveView is the type safety. All those nice things that you get in Lucky, you're going to get also for your client-side interactions. So if you have an action and you have a typo or something, it's going to blow up.

It's going to tell you if you forget something, if you've used the wrong type. I mean, and this is something that's very hard in the front-end world because you either have to run an automated test to make sure you catch these or, worse, you have to open up the console, because like, why isn't this working?

I don't know. Now I have to dig into the console. It's not even where you typically want to see logs, and so being able to shift that to where you're used to seeing errors, and before you even have to open the browser, I think that's going to be a huge deal.

Jeremy: I think on the server side, testing is pretty well understood in terms of, you know, especially if you have API endpoints, or you have just regular server code, like people know how to test that. 
But on the client side, there's like so many different ways of doing it, it feels like, and a lot of them involve spinning up browsers and, um, it can get kind of complicated. And so, yeah, it'll be interesting to see if you can shift more of that to the server environment that a lot of people are used to.

Paul: Yeah, I think it will be cool. We'll see how it goes and, yeah, I do think there's definitely complexity that comes with moving it to JavaScript, especially if you have a single page app, cause then you need to spin up an API. You need the server and an API when you run your Cypress tests or whatever. Or a lot of people mock the API, which sometimes is fast, but can get out of sync, in which case you lose confidence in your tests.

So having it in one spot is, I think, really great. And we do have the capability to run browser tests, that's built into Lucky, because I think it is still good to have at least a couple of smoke tests for your critical paths, to test the happy path. Um, but I mean, if you can write fewer of those, that's great, cause they take forever to run.

Jeremy: For sure. Yeah. Um, in Lucky, there's a lot of features that in other frameworks would not usually be included. Like, for example, there's authentication. You have this setup check script to see if your app has all of its dependencies, things like that.

I wonder if you could explain how you decided what sorts of features should exist in the framework versus being something that you'd leave to the user to decide.

Paul: I think, for one thing, if there's no downside, only upside, and almost everyone would benefit from it, I want to include it. So that's, for example, the system check script. Um, we also have a setup script and that's what we tell people to use, instead of saying like, first install Yarn and then run your migrations and blah, blah, blah.

No, our documentation doesn't even mention that. It's like, run script/setup. 
Um, and the idea there is, it serves as kind of a best practice. It kind of pushes you into things to say like, hey, put stuff that you need in here. Then we layer on the system check, which also runs before setup, and also every time you boot the development environment, um, where it'll check: hey, do you have a process manager?

Which you need. It'll check whether Postgres is installed and running, because that's required. So if you go back to kind of that criteria, it's useful to pretty much everyone. Meaning, if Postgres isn't running and the app's not going to work, everyone would need to know that. Um, and it doesn't really have a downside.

If you don't want it for whatever reason, you just delete it, or stop running it. That's not a huge downside. That's like, you know, one click. So that's part of why that's included. I don't like spending time on things that aren't delivering actual real value.

So I don't like spending time figuring out why my local environment is not working, or why it was working and now suddenly isn't. And something like a system check makes teams happier in the sense that, let's say all of a sudden somebody adds a new search capability and it requires Elasticsearch, and I do git pull from master, do my feature. As soon as I boot the app, if they've added something to system check that says, hey, you need Elasticsearch, it's going to tell me. It's not going to just blow up.

It's going to be like, hey, you need Elasticsearch now. Install that and run it. These are the types of things that I really think are gonna save a lot of time. In terms of auth, that's another one of those where it's like, so many people want it and it should be easy and simple and not like five different ways to do it. But not everyone wants it, which is why we made it optional.

You choose in the wizard. Like, if you don't want auth, fine. I'd guess that most people generate it with auth. I know I do cause I need it. 
And the thing is, we also changed how auth works in the sense that it's mostly generated code. It's not just a bunch of calls to some third party library. So what that means is it is easy to modify.

So if you want to add email confirmations or invitations or anything else like that, you're not mucking around in some third party library. It's code generated in your app that you can see and modify. So it doesn't lock you into anything. It's very flexible and it helps you get off the ground running.

And that's why that was included. Uh, and I'm sure we're going to have other stuff that may be included, or at least have an option of being included, in the future.

Jeremy: Yeah. I think one of the conversations that people are having now is, particularly in the JavaScript ecosystem, you end up pulling in a lot of different dependencies. You end up having to make a lot of different decisions. And so it's interesting to see Lucky kind of move back in the direction of, say, a Rails, of trying to include the things that you think most people building an app are probably going to need.

Paul: Yeah, it's a little more in that direction. I think on the flip side, Rails is starting to include so much that people are starting to get mad almost, and it's like so much that you're like, what is this? What is happening? So we want to strike a balance there. And so part of that is being very careful about what is included.

I think some of the things that are included in Rails could just as easily be added after the fact, meaning, 20 minutes of work and you can add it. Those are the types of things I probably would not include in Lucky. If it's 10, 20, 30 minutes to, you know, add it and modify your app. 
And only 50% of people even want it.

We're probably going to just say, here's a guide on how to do it, and make it easy, but not do that as a generator, if that makes sense.

Jeremy: What's an example of something like that, that would be pretty easy to add in after the fact and doesn't necessarily need to be included?

Paul: Um, well, in Rails 6, it's coming up. They have this Action Mailbox thing that handles inbound emails. I'm pretty sure by default that is included. I could be wrong, so don't quote me on that. But I've been seeing a lot of Twitter stuff lately of people being super pissed about it, so I think it's there.

Um, that's something I definitely wouldn't include because I think I've written one app ever that uses inbound emails. I mean, GitHub does too, but I have not written that, and a lot of apps just don't have that. So it's odd to include it, especially given the fact that it's not particularly hard to set up yourself, I think, based on what I've seen. Or Action Text is another one, where it has ways of making rich text editing easier. That might be something, too, that could be added on later. That one, I think, at least has a little bit more merit, because I think it's fairly common at some point to be like, we need a rich text editor.

Um, but those are the kinds of things that I would probably push off. And it's not a best practice either. Meaning, I think it's smart that it has Active Record by default and chooses a database for you. Um, because it's best practice to just use Active Record, right? And you're gonna have the best time using Active Record.

Cause that's what everyone uses. So including that makes sense. But yeah, something like Action Mailbox, it's like, what's the benefit in including it?

Jeremy: Yeah. Just because the majority of people who are writing applications, they'll never need that inbound email feature. 
Uh, as opposed to your example of authentication. Like, probably the majority of applications people are building will have authentication in them.

Paul: Exactly. Yeah. And it's something that's hard to add, meaning, um, it touches so many parts of your application. And because we are generating stuff, it's not easy to add after the fact. But stuff that is easy to add and easy to remove... that's another criterion: how easy is it to remove it? So we include a few default CSS styles, but they're super easy to remove.

It's basically like, you go to your application.css, it's like, delete everything below this line. You delete it and you're done. But it's nice because it makes it look decent and not like a horrific, ugly thing when you start your app, but it's easy to remove. And so that's something, for example, that we also include by default.

Jeremy: That's also, I think, the distinction between something that's generated code or configuration that the user can see. Um, I mean, I think your setup script and your system check script, uh, one of the things that makes those more straightforward is the fact that they are in your code base and they're bash scripts, right?

So, if you want to modify it or you want to remove it, they're kind of right there. Whereas something like Action Text or Action Mailbox is probably in, like, the Rails gem, right? It's in the library, so you don't even see it in your code base. I guess that would be the distinction there, maybe.

Paul: Yeah. Or you might, but you don't know why it's there or what it does. Or, yeah, another concern is how many things does it hook into? So for example, one of the big things is, like I said, the default styles. How many places does that hook into things? Just one. You go to your main CSS file and delete it. But there's a way to do that that I don't particularly like.

I've seen some people, for example, use Bootstrap or any framework. It doesn't matter what it is. 
The problem with those is it also modifies the generated HTML and the scaffold. Cause by default it's adding classes like column three, medium button, blah, blah, blah, blah, blah. If you don't want to use Bootstrap, you have to remove Bootstrap and manually go through all of the generated HTML files to remove the Bootstrap classes.

And so that's like a key difference too: how easy is it to remove? And we really want to only add things that are easy to remove or really hard to add.

Jeremy: What does the, uh, adoption of Lucky look like? Do you know of people using it in production currently?

Paul: Yeah. I don't have exact numbers. Um, which I think is good because it reduces anxiety a lot, not knowing, like, is it going up? Is it going down? But people are using it in production, a lot from the very early days of Crystal. One of our core team members, um, Jeremy, he's been using it at work for two and a half, three years.

And they've had great success with it. They replaced some of their Rails apps and microservices with Lucky, uh, originally for the performance boost. And I think this is common: they stay for all the nice type safety and the reliability they get. Um, it's hard to explain with just words, but then you use it and you see an error, and we try and make them nice.

Not all of them are. But we try and make it nice and people go, oh, this is nice. Or people are annoyed that they see this compiler error and then realize, oh, wait, it actually did catch a bug. So, they're having great success. Um, big performance boost. Something like, they reduced their number of servers by like 70%, and their response times got cut down 60 or 70%.

So yeah, they're having great success, and then a few other people are building client projects using Lucky. I don't know what they are. Some people just can't say to the public, unfortunately. 
But yeah, people are using it in production, which is really exciting.

Jeremy: Looking at the Crystal community, what does that look like? You know, is it pretty active? What are your thoughts on the community?

Paul: Yeah, it's quite active. Um, they've got quite a few corporate sponsors, so they're making decent money to help fund development. They're aiming for 1.0. I don't know exactly when, but they did a blog post saying it's going to be soon, and I've talked to them in person about it, but I don't know how much, you know, I'm supposed to say. But soon.

Um, which is fantastic because then you're not going to have to deal with the breaking changes, which have definitely been happening the last two years. And I think it's good because the language is improving and changing things. But once 1.0 hits, people are going to be able to jump in and they're not going to have to update their apps every 3 months or whatever.

Um, but yeah, a lot of participation, and the sponsorship money goes a long way. A lot of the development is based in Argentina and the dollar is super strong over there. So, meaning, if you've got corporate sponsorship in dollars over here, that goes a really long way towards the development. Um, and they're all super nice.

I've talked to a lot of them in person. Um, super nice, super smart guys. The community itself in terms of forums and chats, that's where I'm a little hesitant. It's active, but I think not particularly welcoming for newcomers. Just really strong personalities. Very smart, but very strong personalities, and I would say it may be better to come to the Lucky chat rooms.

We're very strict about our code of conduct, and not about nitpicky things, but just in general that, you know, you talk to people with respect and empathize. And we're not the type of people where you come with a question and we're like, well, did you Google it?

We're going to try and help you. And so I think it's a very welcoming community. 
And even if you're not using Lucky, feel free to hop on our chat room. Um, if you go to the Lucky website, there's a link.

And, uh, yeah, we're pretty nice over there. So things are moving forward. We're trying to get to 1.0 around the same time as Crystal. Um, maybe a little after, but I think that'll be a big milestone.

Jeremy: It's interesting talking about the community, because I think when you think about Ruby, one of the big parts that attracts people is not just the language or the framework, but it's, you know, having an inclusive community, having people that are really friendly.

So it's good to hear that Lucky is striving to do that. Like, why is there that divide?

Paul: Uh, I'm not entirely sure. I mean, part of it is I am a sensitive person, and so I am kind of trying to create the community that I want, which may be actually way more upbeat and positive, just because I want new people to feel comfortable. And I think maybe part of it is, with Crystal, they don't have that much time. And so it's easier to brush stuff off. Some of it could be just that they don't care about the same things that I personally do.

There's nothing actively bad going on. It's just that, rather than things being just okay or average, I want it to be exceptional. And a place where it's just like, don't worry, you can say something, even if you feel it's dumb. We're not going to, like, pile on.

We're going to be like, hey, it's fine, and here's maybe an alternative. So yeah, I mean, go to the Crystal rooms. I still do. I still get help. There's a lot of really smart people. Um, you just gotta put on a little thicker skin and be prepared for like, why do you want to do this? Have you tried this other thing?

Have you done this other thing? 
In a way it's a good thing because they're making sure that you've tried your different options and you're not just asking to do something that's a horrible idea, but it can make people, I think, feel like their idea's getting attacked or whatever.

Um, so that's what I mean by part of it is just, if you're sensitive, that's gonna come off as probably harsher than it was intended. Um, but you can still get a lot of help.

Jeremy: Yeah, I guess it's just trying to find the right level of, yeah, I don't know what the word would be, but, yeah, making people feel comfortable.

Paul: Yeah, I do have a really high bar for that because, like, I am sensitive, and when I learned to program it was all online, with books and with forums. And I remember how hard it was as a new developer that didn't know best practice, and people would be like, why are you even trying to do that?

It's so stupid. And it's like, dude, I've been programming for like six months, calm down. And I think it's common. I mean, that happened in the Ruby forums, happened in the Rails forums. It's a common thing, I think, across the internet and various communities. So it may not even be that Crystal is particularly bad.

It's probably a lot like most communities, but we just want ours to be exceptional. Um, in terms of making people feel welcome and, you know, if someone has a bad idea, and "bad" in air quotes because maybe it's a great idea and we just don't have the context, but if it is a bad idea, we're not going to say, why are you doing that?

Blah, blah, blah. First, let's help you solve your problem and then talk about how this might be better. Maybe there's a better way to do it. And it just feels a lot better. People are more accepting of your feedback when you're not just immediately jumping on them and saying, why are you even trying to do that?

Um, and so I think that's important. Uh, yeah.

Jeremy: Yeah. I mean, I think that probably applies to really all projects, right? 
Like, they could all kind of stand to learn from some of that, and kind of see it from the perspective of the other person, who doesn't have all the same knowledge that, you know, you've been building up. And maybe they can bring you a new perspective as well, that you didn't even think about.

Yeah.

Paul: Yeah, totally. I mean, we've changed stuff in Lucky, a lot of stuff that I was pretty sure about, and they asked if it could be done differently, shared their use case, and it's like, oh yeah, I made a mistake. And so it's good for everyone. Like, if you show a little bit of vulnerability and openness, you're much more likely to learn.

And you're much more likely to learn new and novel things, because the people with the strongest opinions are often the ones that have that opinion based on some principle they read about, or a talk, or something else. It's the quiet people that are like, hey, can we try doing this a little differently?

And you're like, whoa, I've never thought of this. Because no one else has, but you're new. You came up with this great new, innovative idea and you felt comfortable sharing because we're not just shooting people down constantly. And so, yeah, I wish more communities did that in general because it's mutually beneficial.

Jeremy: That's kind of a good place to start wrapping up, but where should people go if they want to learn more about Lucky?

Paul: First place, luckyframework.org. That's the main website. It has guides. It has blog posts that you can follow or subscribe to with new announcements. Uh, and it has a link to our chat room as well as the GitHub. So that's where I'd go. Feel free to hop on the chat room anytime. Um, we're all really helpful, and try and be nice. And so, like, people shouldn't hesitate to run in there if they have problems. Um, if there's stuff that's confusing, um, feel free to open an issue on Lucky. We have a tag that's like, improve error experience. 
So we have dedicated stuff just to do that. Yeah. In fact, if you start a Lucky project and you get a compile-time error when you first start or are fresh on a project, it says, hey, if you're still stuck, go to our chat room and ask for help. Everyone should feel free to do that.

Jeremy: Very cool. And how can people follow you and see what you're up to?

Paul: @PaulCSmith on Twitter. Probably the best way to do it right now. Maybe one day I'll have a blog or something, but right now it's Twitter.

Jeremy: Cool. Well, uh, Paul, thank you so much for coming on the show.

Paul: Yeah. Thanks for having me. I really enjoyed it.

Aug 26, 2020 • 1h 14min

Life after JPEG with Henri Helvetica

Henri is a frequent conference speaker and organizer of the Toronto Web Performance and JAMStack meetups.

We discuss:
- Managing images with features like lazy loading and the picture tag
- Handling varying network conditions on mobile
- Making designers a part of the performance conversation
- The WebP image format that could replace JPEG and PNG
- Ways the GIF can be an MP4 in disguise
- How lighthouse has given websites a visible target for performance
- What we can learn from "lite" news sites

Conference Talks
- A Decade of Disciplined Delivery
- Shape Of The Web
- Moving Pictures: A Snapshot At the Future of Web Media

Related Links
- @HenriHelvetica
- Cloudinary
- Picture tag
- Network Information API
- WebP
- Responsive images done right: a guide to picture and srcset
- Use srcset to automatically choose the right image
- Mozilla: Improving JPEG Image Encoding (Mozilla explains why they want to stick with JPEG in 2014)
- The Great JPEG 2000 Debate: Analyzing the Pros and Cons to Widespread Adoption
- How JPEG XL compares to other image codecs
- Serve images in next-gen formats
- JPEG XR
- High Efficiency Image File Format
- Using HEIF or HEVC media on Apple devices
- AVIF for Next-Generation Image Coding
- AV1 & Media Codecs
- WebP is now supported on Safari 10 (WebP support was added in a Safari beta but then removed)
- Safari 14 Beta Release Notes (4 years later, WebP is officially added to Safari)
- HTTP Archive Almanac: Image format usage (Shows the relatively small footprint of WebP)
- Native image lazy-loading for the web
- How To Defer, Lazy-Load And Act With IntersectionObserver
- How Medium does progressive image loading
- Above the fold in web design
- An update on mobile CPUs and the Performance Inequality Gap
- Web Page Test
- Lighthouse
- CNN lite
- NPR text only
- User Timing API - Measuring User Experience Performance

People mentioned during this episode
- @burkeholland
- @kornelski
- @andydavies
- @patmeenan
- @slightlylate
- @souders

Music by Crystal Cola: 12:30 AM / Orion

Transcript

You can help edit this transcript on GitHub.
Jeremy: [00:00:00] Hey, this is Jeremy Jung and you are listening to Software Sessions. This episode, I'm talking to Henri Helvetica. He's a freelance developer with a focus on performance engineering. He's also involved in the Toronto web performance and JAMstack meetup groups. We discuss why images and performance are so tightly tied together. We also went deep into what life after JPEG might look like with the introduction of formats like WebP. And we talk about tools that can help you during your web performance journey. Henri is a big runner, so I asked him if he started his day with a run.

Henri: [00:00:36] Thank you for the introduction. And good morning. And with regards to a run, I wanted to, first thing in the morning, and as we were talking about getting up early just moments ago, I have my alarm set for 6:30. I tend to sort of open my eyes up around quarter to six, and figure out how this run is going to go, but it was raining this morning. So I was a little upset, went and looked outside. The rain stopped by 7:00 AM. I was thinking, okay, maybe I should head out now. And as I was getting ready to head out, the rain started again and I thought to myself, okay, it's not going to happen, simply because I knew I was doing this podcast. So I want to be back in time and fresh. And afterwards, I think I'm going to watch this Microsoft event they have online, MS create. So yeah, the run does not look good today. And it's funny. I was speaking to Burke Holland from Microsoft. He said that he sent the clouds my way, knowing that it would force me to stay in.

Jeremy: [00:01:34] They are a big cloud company, right?

Henri: [00:01:36] That's a good one. I like that. That was good. I'm going to have to keep that one.

Jeremy: [00:01:42] You're pretty deep into the web performance space. What are some of the biggest mistakes you see people making on the front end?

Henri: [00:01:52] I mean, web performance, I consider it a bit of a dark art. 
There's lots involved, and much of it may not seem very clear to the average developer at times. But with any auditing that takes place, whether it be web performance or accessibility or UX overall, you're always going to have some low hanging fruit, and one of those fruits is image management. I think that you tend to find a lot of people disregarding the importance of making sure that images are set properly, as a resource loading on your page. And it's important for a number of reasons. Most notable is the fact that it's always, absolutely going to be the heaviest resource on your page. Okay, barring video. And you know, video in the last, say, couple of years, especially this year, has become a lot more prominent. So I mean, that's a bit of a different conversation because, you know, you could quite often find pages with no videos. So I didn't want to go too deeply into that topic, but, you know, you will find images 99.9% of the time, and images are challenging. Image management has become a lot more complicated for a number of reasons. Retina screens brought in a particular challenge with regards to how to select the right image. And then you also have, more than ever, people really paying attention to connectivity, understanding that connectivity may vary along, like, a five minute period. What was 4G at the start of your walk might suddenly downgrade to very poor 4G or even moderate 3G. Then you might go into your home and back out. And so you have varying connectivity that, ultimately, the site doesn't care about. It's like, hey, just load this image. You have these things to take into consideration, and you luckily have some very brilliant engineers out there that are trying to make these accommodations. So I would certainly say images have been, are, and potentially will continue to be one of the bigger challenges in terms of low hanging fruit.
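To make the varying-connectivity point concrete, here is a minimal TypeScript sketch of choosing an image quality tier from connection hints. It assumes hints shaped like the `effectiveType` and `saveData` properties of the Network Information API (`navigator.connection`), which is not available in every browser; the names `pickImageVariant`, `ConnectionHint`, and the three tiers are hypothetical.

```typescript
// Hypothetical names throughout; only effectiveType and saveData mirror
// properties of the Network Information API (navigator.connection).
type EffectiveType = "slow-2g" | "2g" | "3g" | "4g";
type ImageVariant = "low" | "medium" | "high";

interface ConnectionHint {
  effectiveType?: EffectiveType; // browser's estimate of connection quality
  saveData?: boolean;            // user asked for reduced data usage
}

function pickImageVariant(hint?: ConnectionHint): ImageVariant {
  // No information (for example, the API is unsupported): use a middle tier.
  if (!hint) return "medium";
  // Respect an explicit data-saver preference before anything else.
  if (hint.saveData) return "low";
  switch (hint.effectiveType) {
    case "slow-2g":
    case "2g":
      return "low";
    case "4g":
      return "high";
    default:
      // "3g" or an unknown value
      return "medium";
  }
}

console.log(pickImageVariant({ effectiveType: "4g" }));
console.log(pickImageVariant({ effectiveType: "4g", saveData: true }));
console.log(pickImageVariant(undefined));
```

In a browser, the hint could come from `navigator.connection` where it exists. Since connectivity can change mid-session, as Henri notes, a real page would likely need to re-evaluate the choice rather than decide once at load.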
Jeremy: [00:04:15] I want to go a little more into images. Take the most basic case. Let's say you're not building a single page application. You're building a traditional, just a document website. What are some of the ways that you should be treating images? You mentioned retina. How do we ensure we're only sending retina assets to people with retina devices? Should we be loading images in a lazy fashion? Like, what are some of the best practices there?

Henri: [00:04:47] The ultimate best practice at this point, and it's a bit of a cop out, would be to outsource the work. And I say that simply because I think it's become enough of a challenge that there are some companies out there that are solely set up to do that work for you. Obviously people like Cloudinary, and there's a bunch of others as well, that have been very upfront and outspoken in letting people know that this is the kind of work that they do. And that's their specialty now. Barring that, there are a number of ways you can look at managing images. I'd say one of the earliest revelations that came along the way when dealing with images, dealing with retina, non-retina, and obviously formats as well, was the picture tag and srcset, and the ability to pinpoint what you wanted to send under what conditions. And that was fantastic at the time. Everyone felt like, okay, we've come up with a great solution. But along the way as well, what ended up happening is that you had these ever growing blocks of code. I believe it was Brad Frost who once posted, I think, a screenshot of a code block just to handle retina, non-retina, et cetera. And it was so huge. He just sat there and was like, I'm not going to do this, there's no way I need this massive block of code just to serve a couple of images under the right conditions. 
Things like that came about, and obviously, as much as they did work, there was a sense that things need to be a bit more simplified. I mean, people are still working on that, but what that also didn't do is take into consideration things like network conditions. And so you can have this amazing, beautifully scripted block of code for images, but that still didn't take into consideration whether or not you were getting proper bandwidth and good round trip times and whatnot. And that's where things like the Network Information API came around, and whether or not you want to serve particular images under particular conditions. And that's where it starts to get pretty complicated.

Jeremy: [00:07:23] And this code you were referring to, this is all JavaScript?

Henri: [00:07:27] Oh, no no. This is all just classic HTML. I mean, we've not even gone into the JavaScript element yet. But no, this is all just straightforward HTML that was there for you to manage images as best as possible. And like I said, the code block could just grow very quickly if you want to have all your options covered, right? Or conditions, should I say. But again, none of that really delved into the idea that, oh, we have varying network conditions too. And that threw an additional curve ball into what seemed like very simple, rudimentary work, you know, loading up an image of a cat. But it's not the case anymore.

Jeremy: [00:08:15] In CSS, for example, there are things like media queries where we could say, the device's screen is this size, so I'm going to send them this type of image. Are these the types of things you're talking about when you're talking about serving different images to different browsers and devices?

Henri: [00:08:34] With regards to different browsers specifically, the picture tag was actually a bit of a revelation there because we had situations where at one point. 
There was kind of a fractured landscape of image support. You may remember the time when WebP was starting to make its way into the conversation. Even though WebP is like 11 years old, I feel like, you know, even to this day, a lot of people are like, WebP, what's that? I'm not sure. And I remember a couple of years ago, when I went on this sort of image format management discussion at conferences, there were people who had no idea what WebP was. To go back in history, when WebP was really starting to be introduced by Google and supported in browsers with the Blink engine, there was a moment in time where Mozilla felt that we had not extracted all we could out of JPEG. So their sentiment was that a sudden introduction of WebP might've been premature. And in fact, there was a blog post that described their decision to maintain their support for, and do additional research into, getting better compression out of JPEG. A blog post that has since vanished, but I think if you go to the Wayback Machine, you might be able to find it. And then on top of that you had the idea that JPEG and JPEG 2000 and JPEG XR were still out there floating around for people who want to experiment and really dive in a little more, because at the time you had CDN companies like, say, Akamai that were working with big retailers, and they had obviously a lot invested in making sure that they could squeeze all the data they could out of certain assets like images. So you could have, say, a website like, I remember in one of my talks, I gave this example, Forever 21. I could talk about a company that's gone bankrupt. So it's not like I have stock in that company.

Jeremy: [00:10:44] Yeah. They're not going to come get you now.

Henri: [00:10:46] Exactly, exactly. Right. It's like, here's a couple of pennies, man. Give me some, give me some stock. 
but, I, I, remember in my talk, I showed that, in devtools in different browsers you saw, you know, the JPEG 2000 being served and you saw the JPEG XR being served. You saw obviously, in Mozilla's case, a JPEG being served. Now I believe in Chrome. you were getting the webp. So the picture tag definitely helped with that, you know, with people who really want to be, very focused and, and trying to serve the best, format and most compressed. option possible, you know, very, disciplined delivery of assets because that's what it is at the end of the day. Trying to be as disciplined as possible and trying to find the absolute best possible, solution to, to sort of lessen the load. Jeremy: [00:11:38] Is WebP kind of the equivalent of a JPEG. It's another lossy, image format, but is perhaps more efficient Henri: [00:11:46] So the WebP is a very interesting format. So, history. WebP came from, a video format. So the WebP is actually a product of the WebM. Some of the more interesting and more, data efficient, image formats are all actually stemming from video, which is really interesting. So, the HEIC, which came from the HEVC, and, and then very future conversation, AV1 birthed the AVIF. AVIF and again, that's video to still image, but let's get back to the WebP it was made for the web, essentially. Visual fidelity it may not be the best format, but in terms of what is best for the web resources being transmitted down the wire, the WebP makes a great case. and I'll list a couple of features real quickly, obviously a very aggressive codec compression is 10, 20, 30%, better than PNGs. And better than JPEGs. Their chroma subsampling, is locked at 420. So for those who may or may not know chroma subsampling, basically it has to do with, it's sort of like the removal of certain colors that, may not be super perceivable to the eye. 
And so the fidelity remains to an extent, and essentially they removed some data that the average person wouldn't really catch. It also had transparency, which made it a lot more attractive, because obviously the JPEG didn't have that. And at one point, and I've mentioned this a few times, I was lucky enough to have a conversation with a Chrome PM, just about four years ago. He mentioned to me that with WebP they had specifically the PNG format in their sights, as they felt that, feature for feature, the two aligned well enough that they could replace every PNG on the net with a WebP. I also say that because WebP came in two flavors, lossless and lossy. Obviously the lossy one is the most attractive, but there is also a lossless option. So for those who really want to hang on to that fidelity and refuse to let it go, there's a lossless format as well. So WebP on paper was an attractive format. But early on, some of the challenges were encoding and support. For people who were just so used to PNGs and JPEGs and, God forbid, GIFs, the majority of the software out there had just endless support for those three, even SVGs, but for WebP that was not the case. There was some work involved in trying to get WebP into the ecosystem, but it wasn't going to happen without some of the more established software outfits supporting it. Take Photoshop, for example: there's a potentially outdated plugin for Photoshop that's been around, and it's now not even supported by the original company that put it out. And you can go through the litany of other popular software outfits that may or may not be supporting it to this day. Jeremy: [00:15:15] In terms of the best practice for images, you said there's a picture tag that will let you use a different format depending on the user's browser. 
So if you were using Chrome now, I suppose that it could send the user a WebP, but if they were using Safari, maybe it would have to send them a JPEG or PNG. Henri: [00:15:39] Yeah, it's funny you should mention Safari. I can go back and finish up my WebP story. WebP was slowly gaining, and I do mean slowly gaining, some recognition. I don't want to call it popularity, but some recognition. Mozilla had doubled down on the JPEG. You had WebP and the JPEG, for sure, and then whatever else you wanted to use that was on the fringes of popularity, like JPEG 2000 and JPEG XR, JPEG 2000 being supported by Apple and JPEG XR being supported by Microsoft. Then, in a significant moment in format history, Mozilla had a moment of clarity and reopened the bug to provide support for WebP, which was a bit of a, I don't want to say a shocker, but they had sort of decided that, hey, you know what, we probably need to do this. So they reopened that bug. Also significant in that history: for about a week, WebKit, specifically Safari, supported WebP. It was a very bizarre moment, and it got pulled ASAP. It was like grand opening, grand closing, literally. I could send you the link; there was an article about it that was like, you know, no one knew what was going on. But at one point, we'll say about a year or two ago, I think, first of all you had Chrome and the Blink engine supporting WebP. You had Mozilla, who had finally announced that it was in Nightly. Also, somewhere in between, Edge had moved over to Chromium, and they sort of quietly announced that they had WebP support. So at the top of 2020 you had three of the four majors all supporting WebP, and then hell froze over about a month and a half ago at WWDC at Apple headquarters, and they announced WebP support for Safari 14. So basically, in a couple of years, you went from one to four of the majors supporting WebP. 
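To make the picture tag fallback discussed here concrete, the following is a minimal sketch and not anything shown in the conversation. It builds the markup as a string: a `source` element offers the WebP version, and the plain `img` is what a browser without WebP support falls back to. The helper names `escapeAttr` and `pictureMarkup`, and the assumption that both files share a base name, are hypothetical and chosen only for illustration.

```typescript
// Hypothetical helper: escapes a value for use inside a double-quoted attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: assumes "<baseName>.webp" and "<baseName>.jpg" both exist.
function pictureMarkup(baseName: string, alt: string): string {
  const base = escapeAttr(baseName);
  return [
    "<picture>",
    // A browser that can decode WebP picks this source.
    `  <source srcset="${base}.webp" type="image/webp">`,
    // Any other browser ignores the source above and uses the img.
    `  <img src="${base}.jpg" alt="${escapeAttr(alt)}">`,
    "</picture>",
  ].join("\n");
}
```

Calling `pictureMarkup("cat", "A cat on a sofa")` yields a `picture` element whose WebP source comes first and whose JPEG `img` comes last. The order matters because the browser uses the first source whose type it supports.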
And it is significant in the sense that a well-known image researcher, developer, and engineer, Kornel... I can never pronounce his last name, so I won't, but I'm a big fan of his work. He's actually the author of ImageOptim, the image optimization tool. He put out this tweet saying that he felt that at some point next year WebP could essentially be the only format you need. And it actually does make some sense, because if you're going to have the four majors on board, and all the other browsers that are running on the Blink engine, we could see the WebP format climb significantly in presence on the web. Because as of right now, if you go to the HTTP Archive, WebP is still in relatively trace amounts on the web, and again, for a number of reasons, from tooling to developer knowledge. It's going to be pretty interesting to see what happens. Jeremy: [00:19:09] It sounds like WebP is going to be able to replace JPEGs and PNGs because it has that lossless option. How about GIFs? Are we going to be able to have animated pictures in WebP? Henri: [00:19:22] The big WebP proponents will tell you yes. I'm going to go back to the statement you made with regard to why WebP is going to be replacing the JPEG and the PNG. Part of the reason, and I think specifically why they feel it can replace the PNG, is that people will still reach for the PNG as a lossless format for that transparency element, and no other image format out of the classic four had transparency as a lossy option. So now, back to what you were saying with respect to animated GIFs. As much as I'm not a huge fan of the animated GIF, we have to live with the fact that people love them. Jeremy: [00:20:13] And they definitely love them. Henri: [00:20:15] They absolutely do. Like, imagine Twitter with no animated GIFs. 
it's almost like it'd be like empty but it's, it is interesting because, For the most part, the animated GIF has been replaced by the MP4. Quite often when you go into dev tools and, and look at, The entrails of this GIF, you believe you sent off a, it's actually an MP4 now, that was done for a number of reasons, specifically, storage because, MP4 versus GIF in terms of storage, huge difference in terms of size and, and some of these GIF farms realized that very early and, you know, you can have, storage costs, balloon out of control, just cause, you know, you want to carry a GIF. That was part of it. I've actually never, I mean, I shouldn't say never, I've seen an animated WebP once. Like I'm assuming it probably loops as well. So on paper, yes. There would be probably an argument for that to take place, but also for that to take place, the services that are out there, giving you these GIFs will have to on their own, do the encoding. So you can sort of like drag this, this animated, image, and hopefully it will be a WebP that'll be the one, early challenge. And, and again, I get back to the idea that we as just everyday persons have the, the tools and the encoding capabilities either on our phones or our computers to just say, Oh yeah, I want WebP. So there is a bit of a developer education that's going to have to take place, let alone consumer education right? You know, the average person knows what a GIF is. Devs don't know what WebPs are. So I don't, I don't imagine, individuals will. So there's going to be that I think hurdle along the way. But on paper yeah, it would be, it would be, probably an apt replacement. Jeremy: [00:22:25] That is interesting about how you mentioned a lot of the times where we would usually use a GIF. We now use an MP4, because that makes me wonder when. Someone is in a discord room, for example, and there's animated images everywhere. even if somebody thinks they're using GIFs, those may actually all be MP4s. Is that, is that right? 
Henri: [00:22:47] Absolutely. I'll tell you a quick story. I remember that in one of my earliest talks on images I'd mentioned that, and the next day I ran into a developer, a speaker actually, who was at my talk. They mentioned that they didn't believe me, so they went into dev tools, and it was like... So, for everyone who didn't see that, I just made this sort of mind-blown face. They told me, "Hey, I was suspicious of that comment, and I went into dev tools, and I had no idea." Jeremy: [00:23:27] Yeah. Henri: [00:23:29] I actually felt good for a minute, you know. But yeah, that's what's happening. And again, we're talking about the ability to still have the animation but save on storage. Even though a GIF may have that classic choppy look and feel, and you're like, "Oh, that's gotta be a GIF," it's just the choppy look and feel delivered as an MP4. It's being done because they do have the capabilities. Now, getting back to that WebP conversation: whether or not they'll be able to get all that encoding done, so that suddenly their terabytes or petabytes of GIFs are turning into WebP, I'm not sure, but we'll see. Jeremy: [00:24:15] Yeah, but it sounds like if they're being turned into MP4 files, to the end user it really doesn't matter. Henri: [00:24:24] Ultimately that is the challenge, right? As developers, we're making sure that we are as disciplined as possible, but the end user doesn't care. Is it looping? Does it work? Can I post it on my page? That's it. Very early, in fact, there was a situation with Facebook, who have been very aggressive in exploring performance opportunities and how to save data. Quite often they're very early adopters. There was a point where they were starting to serve WebP, and they found out people were often just dragging stuff to the desktop to share, either with friends or somehow. 
And they realized that WebP was being supported in the browser, but nowhere else. And so there were some complaints. Then at one point, I think they were trying to do something in Chrome where, when you dragged the WebP out of a window, it would be encoded into a PNG by the time it got to your desktop. Little things like that. But what that showed is that the user experience had to be absolutely seamless. People do not care. They just want to know that the images went from their window to the desktop, or that they could share them with a friend in iMessage or whatever it is, and that's it. That's always part of the challenge, right? Making sure that the users can have a very seamless experience in sharing on social media. Jeremy: [00:25:54] Yeah, that's interesting, because it reminds me of the iPhone. In a lot of cases, if you take a photo, it's not a JPEG, it's in the HEIC format, and you send that to someone who can't open it, and you're kind of like, what the heck is going on? Henri: [00:26:10] Absolutely. It's funny you should mention that, because I remember when HEIF and that whole ecosystem was being introduced at this one WWDC. It might've been two years ago, and you just saw Twitter, not explode, but just going through this: "What's HEIF? What's HEIF? What's going on?" And I remember I gave a talk about HEIF within like two weeks, and nothing happened. Even within the Apple ecosystem, that format wasn't even supported by Safari. I think it may have been supported by maybe Image Capture, but it had limited support outside of the iOS ecosystem. Now, I don't want to get into the iOS business, but if you take your phone and you take, say, a timed shot, three seconds, 10 seconds, whatever, it comes up as a JPEG, which is weird. And I believe the front-facing camera doesn't do a HEIC shot either. 
If you do the, normal sort of like, backside camera shot, that's not burst. I believe it comes up as a HEIC. it's like super weird. and it's very bizarre because now you're talking about, them adopting HEIC or HEIF being the only ones. And now. Providing support for webP, which is super interesting, but that may also have, to do with the fact that, there are patents around, HEIF, and HEIC, and that's something that I've, I've come to sort of, discover. And, and, and why the support for, open source formats like WebP, and a few of the others, like, AVIF that I talked about, are, are significant, because I think the opportunities are there to sort of bypass those royalty payments. Jeremy: [00:28:06] The current encodings that we use now are any of those, patent encumbered, like are the browser vendors paying royalties for those? Henri: [00:28:16] With respect to the WebP certainly not, None of the browsers are supporting HEIC. So there's probably no payments there that's part of the reasons why, I believe, some of the future, formats that are patented maybe challenged, they'll still make some money, but I don't think that they'll see the sort of windfalls that they have in the past. Just cause there's so much support behind, open source, formats, you know, I'll give you a quick example. Let me know if I'm going off course here because I have all this stuff racing through my head. so AV1 it's an open source video format and AVIF is the sort of, the image format that's, born from AV1. AV1 is being supported by two to three dozen companies, all companies that have a very vested interest in video. And, all the browser vendors are in because actually Chrome, Mozilla and Cisco were three of the founding partners in this. And Apple and Safari joined. Does that mean that Apple and Safari specifically is going to support AVIF? Not necessarily, but at least we see the early interest. 
And I don't see why Safari won't keep a very close eye on that. Now, there are some people out there who would make the argument that they won't. Okay, I get it. But the fact that they joined the consortium very early is, I think, hopefully telling of their interest. So that being said, there will be support, I think, long term for open source formats. Although there's one particular format sort of lurking in the background right now, which is called JPEG XL, and this is another open source format. A couple of years ago, when the JPEG was celebrating an anniversary, I think it was the 25th anniversary actually, JPEG.org, the organization, put out a call for papers to see what was out there, and to see if people were interested in improving the JPEG as it stood. Because again, at its 25th anniversary, the JPEG was a little long in the tooth, and it's like, what else can we do? So a couple of companies came together. Seven submissions were made, actually. Two were picked. And the two companies that were selected were Cloudinary and Google. Cloudinary had played around with this one format called FUIF, the Free Universal Image Format, and Google had been toying around with this one format called PIK. PIK was in the background working because they had also believed that the JPEG was getting a little old and could use some updates. Long story short, the two projects came together into one, and it became JPEG XL, and it's been moving along. You can actually play with it right now. But again, not to get into the entrails, it's not been adopted by a single browser yet, but they're certainly working on it. And given the fact that you have two image powerhouses coming together, I think there may be something bubbling on that end. And JPEG XL is being touted as the one format for all your needs. 
So imagine, potentially, the WebP with the added idea and support that would also be able to replace something like an SVG, which is very interesting because the SVG as a vector format has particular features that a raster format can never have. But the JPEG XL feels like boom. You know, they have that covered. And a few other features that, you know, I don't want to get into the entrails too, too, too much, but, it's, it's kind of fun out there right now, you know? Jeremy: [00:32:16] It's interesting because we've had the same formats in the browser for such a long time. We've had the JPEG the GIF the PNG, SVG, seems after all these decades, we're finally getting to the point where we might start seeing, new, new formats take over. Henri: [00:32:36] Yep. And you know, people have to realize that, there's, you know, there are so many things that, that go into, having, so many updates say in the last three or four years or five, part of it's like computing power, there's so much, that goes into, being able to, have these formats readily available. Early days of, of the web P one of the challenges was the fact that it was CPU intensive. But you know, again, you're talking about the WebP being 11 years old in, you know, in six years, CPU power can change quite a lot from a handheld device to, to your laptop. Right. So who knows what you know is taking place on the enterprise side. but you know, it's funny, you should mention the fact that the formats are all old. In, in my talks, I mentioned this quite often and just want to remind people that, you know, like you said, the GIF, the SVG, the PNG and JPEG. You're looking at easily, like a hundred years, which is crazy, you know, like I, I joke around in my talks that it's like older than the Rolling Stones, but, that's very important. We've had HTTP1 and 1.1 For like almost 30 years almost. And, in the last three or four, we've gone H2 and now we're talking H3. 
If you're looking at the early days of web, you know, no one knew that the web was going to be consumed, in greatest amounts on handheld devices, and handheld devices with moderate power. I think we're lucky enough that, we have like some iPhones and high end Androids and whatnot, but the average individual is looking for a deal and the deals happen with moderate devices and they have particular, CPU hurdles and, you know, we've needed to make some changes along the way. Formats being one of them and, protocols being the other, but that's a separate story. Jeremy: [00:34:38] I would think that currently a lot of the devices have dedicated hardware to decode specific image formats, like for example, H264 and, and maybe now that things like web P are gonna be in the browser, there would be dedicated hardware to, to decode that as well. Henri: [00:34:56] And, and these are things that will probably come along. but you know, you still have the fact that you have an absolute trusty in what I like to call the workhorse in, in the JPEG that's always going to be there. I mean the JPEG again, by far, the, the, the best support out there, it's the one that's being delivered, by digital cameras by, you know, Our phones. So I mean, there's less of a concern, but eventually yes, you do want some hardware decoding. Like, for example, when I brought up the, the AV1 I remember, a speaker from, a talk from an engineer at YouTube, talking about, you know, them experimenting with the AV1 and then realizing that something about like 10 to 12% of their users run very old devices and they had no idea. And so these are things that you have to take in consideration. And so, if they're finding out that they're on old devices through, you know, trying to serve, you know, next generation video, They're still going to be on these old devices being served next generation, still images. 
These are, again, part of these curve balls that engineers are being, pitched, quite often Jeremy: [00:36:18] Yeah, so it seems like this, I guess, dream that we had of. Hey, everything is going to be moving to web P or maybe everything will be moving to AV1 is probably not going to become true for actually quite some time, because like you said, there are still going to be a significant percentage of people who they need to use the old formats because of their old devices or because they're, they're buying low cost devices. So it doesn't seem like we're going to be able to get away from. The image tag that goes here, we'll give you the JPEG. We'll give you the webp and so on. Yeah. Henri: [00:36:56] Yeah, that, that, that's definitely going to be, you know, it it's like that rough transitional period right where, everything's being supported by everything you needed. And then now you have to get into that transition where it's like, okay, well, you know, maybe it's the hardware that needs to change. Okay. Now we have to make sure that people know that they should request a web P and things like that. So along the way to nirvana. we have to get a bunch of things sorted out so that, from a user standpoint to a developer standpoint, to like, a hardware, so yeah, I mean, it it goes past browser support. and obviously that's one of the hurdles that's without a doubt. but, once you have the browser support, that you need, you know, you still have this sort of like the few legs to, go down and, and make sure that, you can have that sort of like perfect situation where it's like, okay, boom. Now we have like the support that we want. We have the browsers, and then we have the education and that, that, that needs to take place. Jeremy: [00:38:07] We've been talking a lot about things that are, that are coming and I want to bring us back to some of the things that that developers can do now to improve performance or at least improve, perceived performance. 
And one of the things we had mentioned earlier was the concept of image lazy loading. Should we be loading all the images on page load? Or should it be as the user scrolls? I wonder, from your perspective, how should web developers approach that, and what are the tools they should use for it? Henri: [00:38:43] I would say you actually should be lazy loading. In an ideal world, you don't request resources that you're not going to see. A while back, and I don't know if I could dig this up, there was a study indicating that something like two thirds of resources were below the fold on average, and on top of that, only about one out of two users went to the bottom of the page. So you ended up having a bunch of resources below the fold that quite possibly weren't going to be needed. Lazy loading came about from wanting to make sure that you're not hampered by a page having to load, say, 10 megs of resources just so that you can look at the first page and a half of information. So that being said, lazy loading became a bit of a priority. And as you may know, Chrome has added lazy loading natively, as of right now I believe in stable, and if not, for sure in Canary. Obviously there are some libraries out there that would help you out with that process as well. And again, I get back to the idea of being disciplined as a developer. It's like, hey, did you want a snack? Because I can give you a snack, or I can take you to all-you-can-eat. And the all-you-can-eat is what we don't need, since you only want a snack. If that analogy made sense. But yeah, it's super important. For example, I think I tweeted this a couple of days ago: someone who's at an agency sent me a site that they had just pushed, and it had gone live, and I'm like, okay, let me take a quick look. 
It was like 11 megs on first load. There were like 99 images, 89 of them were lossless, and everything loaded in one shot. And it was a fairly long page. I mentioned this to them in a quick communication, like, hey, there's a bunch of other issues, but you guys should be using some lazy loading. Because again, it's a question of whether or not an individual is going to go right down to the bottom of the page, and on average that's not the case. Lazy loading is going to help you manage those assets for the best user experience possible. Jeremy: [00:41:08] So with that particular page as an example, it sounds like one of the things they should do is conditionally load different qualities of images based on the device you're using. And the other thing would be to do some form of lazy loading, so that you don't load every image on the page, but instead load them as the user gets close to them. And this is all being done just through HTML tags? Like, there's no JavaScript? Henri: [00:41:39] So, barring the native implementation, there are a couple of ways that you can set that up. Prior to the native implementation of lazy loading, we were using the Intersection Observer API, and what that was, essentially, I describe it as kind of fake lazy loading. Through Intersection Observer, you could actually set how far from the viewport you want particular assets to load, and that became native to the browsers before lazy loading did. Now, it had uses beyond images, but people were starting to use it for images specifically. That was certainly available. Now, outside of using JavaScript, I don't believe you were able to set that up. However, you had mentioned potentially using images in a low quality, I think. Did you mention that at some point? Jeremy: [00:42:48] Yeah. 
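The two lazy loading approaches mentioned here can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not code from the conversation. The first function emits the standard `loading="lazy"` attribute for images below an assumed fold position. The second is a simplified, pure model of the "how far from the viewport" check that, in a browser, the Intersection Observer `rootMargin` option performs for you. The `PageImage` type, both function names, and the idea that each image's vertical position is known ahead of time are all hypothetical.

```typescript
// Hypothetical description of an image on a page.
interface PageImage {
  src: string; // assumed to be already safe for use in an attribute
  alt: string; // assumed to be already safe for use in an attribute
  top: number; // distance in px from the top of the page (assumed known)
}

// Native approach: images above the fold load eagerly, the rest are deferred
// by the browser through the loading="lazy" attribute.
function imgTag(image: PageImage, foldPx: number): string {
  const loading = image.top < foldPx ? "eager" : "lazy";
  return `<img src="${image.src}" alt="${image.alt}" loading="${loading}">`;
}

// Scripted approach, modelled as a pure function: an image should start
// loading once its top edge is within marginPx of the bottom of the viewport.
function isNearViewport(
  image: PageImage,
  scrollY: number,
  viewportHeight: number,
  marginPx: number
): boolean {
  return image.top <= scrollY + viewportHeight + marginPx;
}
```

With an 800px viewport and a 200px margin, an image 5000px down the page is left alone at the top of the page and only qualifies once the user has scrolled to roughly 4000px. The margin is a tuning choice: a larger value starts the request earlier and makes it less likely the user sees an empty slot.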
So let's say you're on a phone and the device size is small. Maybe you don't need that full 4K resolution image. It sounded like there was some way, using just HTML, that you would be able to select different images for different devices. Henri: [00:43:11] Oh, so we're talking about maybe using the Network Information API, potentially. We're getting into JavaScript, I guess, not so much HTML, but you could actually start to conditionally provide particular image qualities depending on the network conditions. What that could be is, I don't know, let's talk about baseball. You may have a front page where it's like "Major League Baseball in crisis," and then you have a bunch of players on the page in an image, but that might be under ideal conditions: say 4G, powerful phone, whatever. But say they're under less than ideal network conditions. Instead of having the image of the players, you might throw up the MLB logo. Three colors, made as an SVG. It might be two kilobytes instead of the 43 that it was for the image of the players. So the Network Information API is going to allow you to do stuff like that, certainly. There was a point as well, and this was again a JavaScript implementation, and people sort of know this from Medium, where they give you that blurred image, and then it does this little swap to give you something a bit more high quality. People have their say about that. Some love it, some don't, because it's actually an extra request, et cetera, et cetera. Facebook also played aggressively with that, where I think their blurred image was like one or two kilobytes or something, and then they would swap in the proper image. That was more of a user experience situation, because they wanted to let people know that, hey, there's an image coming up here. 
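The baseball example, where a heavy photo is swapped for a small logo on a poor connection, might look something like the sketch below. It is only an illustration under assumptions: the asset paths and the `pickHeroImage` name are hypothetical, and the choice to treat "slow-2g" and "2g" as the slow cases is an arbitrary threshold. The `effectiveType` and `saveData` fields are the ones the Network Information API exposes on `navigator.connection` in browsers that implement it, and the function falls back to the full photo when that object is missing.

```typescript
// Subset of the connection fields read below.
interface ConnectionInfo {
  effectiveType?: string; // "slow-2g", "2g", "3g" or "4g"
  saveData?: boolean; // true when the user has asked for reduced data usage
}

// Hypothetical asset paths, for illustration only.
const PLAYERS_PHOTO = "/img/players.jpg";
const LEAGUE_LOGO = "/img/logo.svg";

function pickHeroImage(connection?: ConnectionInfo): string {
  // Without API support nothing is known about the network: serve the default.
  if (!connection) return PLAYERS_PHOTO;
  // Respect an explicit request to save data regardless of connection speed.
  if (connection.saveData) return LEAGUE_LOGO;
  const slow =
    connection.effectiveType === "slow-2g" || connection.effectiveType === "2g";
  return slow ? LEAGUE_LOGO : PLAYERS_PHOTO;
}

// In a browser this could be called as: pickHeroImage((navigator as any).connection)
```

Keeping the decision in a pure function that takes the connection object as a parameter means the same logic works unchanged in browsers that do not expose the API at all.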
You can stick around, but you know, that sort of blurred image gave the user just the information they needed to know. Hey, there's an image, I can wait the extra, like half second for it to load up. Instead of giving you like this blank page. Jeremy: [00:45:19] This has all been specifically with images. Are there any other, common mistakes, either on that particular page or on other websites that you often see? Henri: [00:45:31] I mean, in terms of, additional sort of challenges, like some of the challenges you'll see, it's sort of like, a bigger user experience, issue. So in terms of, prioritizing what needs to load, on the first load. So above the fold, and these are things that you're pretty much going to see, Once the page loads, but at a particular level so that's why the film strips are very important. So you can see sort of like a frame by frame level of what's loading on the page, and then you can make some adjustments. I mean, in terms of, another low hanging fruit, I'd probably say, the next one might be making sure that you keep your requests down. And again, that partly has to do a bit with some lazy loading, because at that point, you can make sure that, you're not loading, like say, you know, making these 300 requests in one shot when really you can just keep it at like a hundred, 150, I dunno. But you'll know only when you see the page load yourself. I think that's certainly important. And I'm talking about low hanging fruits here. I don't want to get into the deep, entrails that, that, you know the good people like Andy Davies and Patrick Meenan, get into. but yeah, I mean, these, I think are some of the, immediate decisions that you can make. Just making sure that, you know, your, your first load is as quick as possible. 
And that's by keeping some of the requests down and make sure that you're not, you're not stuffing, that first load with like a bunch of, you know, I shouldn't say needless, resources, but some resources are probably, on the fence on whether or not they should be right there. you could sort of, you know, have them load below the fold, and still deliver the information that you, you, planned on, on providing. Jeremy: [00:47:20] In the case of the first load. Would that usually be because when you first load the page there's scripts that are running, or there's images that are loading, that aren't really content, I guess, that aren't really text somebody needs to read and things it's like that. And those are the things that you would load, later, somehow I guess this is what you're saying? Henri: [00:47:43] So, it's a great question because this is the kind of conversation I've had with, with, designers, and which is why I always believe that designers should certainly be, aware of the performance conversation, because you know, they'll sit there and make this, have this page mock up and they're like, boom, boom, boom, boom, boom, boom. This is going to look amazing. But really and truly, I think part of that push pull is what is the most important here? what. what kind of asset, what information can we have come in below the fold? You know, what, what can come in a little later in a load? because again, all that research, is around the snap of the finger, making sure that page loads up right away, the user can kind of scan to see what's going on. And then at that point start to like make the way down the page. You do not want that page to load and have this sort of like blank space. And then that turns into that blank stare. And then, who knows what happens after that? you really want that information to be like, boom, cattle prod. It's right there. It's like, okay, they're scanning the headline, looking at a couple of photos and then they start to load up the rest. 
And that little bit of time, as seemingly insignificant as it is, is sometimes a world to a page when it comes to loading resources. Jeremy: [00:49:09] So that's almost bringing it to being a part of the design: can you make a page where you don't have to load all the sort of surrounding images and surrounding chrome just to see the content? And even before you talk about performance, it's more of a, when somebody first gets to this page, how do we make sure they see what they need to see? Henri: [00:49:30] Absolutely. Absolutely. And, you know, you bring this up just a week or two after I've listened to, or watched, an amazing talk. It was actually with regards to cnn.com during 9/11, you know, and it was phenomenal how the sort of tech lead talked about how they stripped the page bare in order to just keep the details that they needed, because they actually had tried one version of the page, what they felt was sort of like the essentials. And then they kept stripping it, kept stripping it, kept stripping it, and eventually it just became one image, a logo and text. Now, granted, you know, that's also because you're under so much sort of stress from all the traffic, but the idea remains: what do you need to deliver in terms of information? What is going to load as quickly as possible? That's what we're going to go with. In essence, almost 20 years later, you know, we're still dealing with that, because even though we don't get that sort of 9/11 level stress to a site, you still have the fact that you have devices and varying networks and, you know, it'll never translate again from that kind of traffic to, you know, a network being that throttled. But the fact remains that we can predict that they're going to be on their device. It's going to have varying network connectivity, varying power as a handheld. And so you try and do the best to manage that element.
And the fact that you do have like a team of developers and designers who have, sometimes, dueling opinions. Jeremy: [00:51:27] As a user, and maybe this is more common amongst developers, we sometimes like to see the lite pages, you gave the example of CNN or NPR, where they just give you the content. And those are often in separate versions from the normal site, but you're sort of saying, can we take some of the lessons that we're applying to those lite sites and just apply it to our design in general? Henri: [00:51:53] Absolutely. And someone had heard me talk about the idea that lite sites existed, and I brought up the fact that lite.cnn.io is up right now. And you can go check it out and it'll have the same information as cnn.com, but CNN.com will have all the media, images and videos and X, Y, and Z and ads. Whereas the lite site is (pssh) text, not a single image. And apparently, this became a bit of a standard after 9/11 and that site is up all the time. And again, I keep mentioning 9/11 because obviously that was like a very unique situation, but the fact remains that in times of hurricanes, any kind of like meteorological crisis, or anything like that, storms, whatever you want, these sites are still going to be very important because that infrastructure is going to be down and you won't be able to access sites as readily as possible. And the fact that you have a site with just bare bones info is fantastic. Jeremy: [00:53:00] Another thing I want to talk a little bit about is, Chrome being slow or being a memory hog is kind of this running joke. And I'm wondering, in your perspective, is that the fault of Chrome or is it the types of applications and websites that developers are putting on it? Henri: [00:53:20] I mean, this is mildly above my pay grade. You know, I keep a number of windows open as well. There's probably a little bit of both going on. And applications have never been more demanding, that's without a doubt.
And in turn that's a testimony to the fact that browsers have never been more capable, with a bunch of features, you know? I've been a big proponent and fan of browser engineers because, I should pull this up, but someone on Twitter said that the browser is by far the greatest piece of software out there. I think there's a strong case for that to be absolutely true. I've gone out and said that, on your phone, through your browser, you could probably just take care of everything you need to do. Banking, you can save money, you can go out and make money. I mentioned that you could find a date, buy clothes for that date, rent the car for the date, order food for the date, all through a browser, and if you had told us that like 10, 15 years ago, people would be like, yeah, whatever. But in 2020, today, July 19th, it's happening. And browsers have to make these sort of accommodations, you know, so whether or not Chrome is to blame in terms of like being a memory hog, I'll let, you know, the engineers debate that, but I think you must tell the story that the browser is basically a bit of like a Swiss army knife right now. We have made demands from the browser that have generally been met. And I think we've benefited from that tenfold, and again, if you didn't have a laptop or anything like that, the browser on your device would be able to save you. And, you know, we could have had this conversation on the browser, on the phone, let alone, you know, the fact that we actually are having it on a desktop right now, you know, so, you know, kudos to the engineers out there. I'm not slamming you guys. Jeremy: [00:55:31] Yeah, it's pretty incredible just what you can do in the browser and the fact that it has become so extensive that now we have Electron, right? Where people are making websites basically designed to be desktop applications. Henri: [00:55:47] Yeah.
I mean, I don't know if you saw, someone yesterday or the day before, I think, recreated either MacOS 8 or some Mac application in Electron. It was pretty fascinating, you know, so, but yeah, absolutely. Jeremy: [00:56:02] And what makes a lot of this possible is JavaScript, right? And there's a lot of different JavaScript frameworks that people use, like React and Vue and so on. And from your perspective, what are the additional things you should be thinking about from a performance perspective when you're working in heavy JavaScript code bases? Henri: [00:56:25] You know, on demand. That's it. You mean the world without JavaScript, although it can exist, you know, it would be challenging. And I think we have to accept that it's definitely here to stay and it's here to, I mean, I shouldn't say it's here to stay. I don't think it was ever not invited. I just think that there was sort of like this liberal use of JS, you know, without really understanding how powerful it was and how caustic it could be to the experience, the user experience. Thank the Lord for people like Alex Russell, who are out there to remind us of the fact. But that's really it. And I think that, as more and more frameworks come around, the idea is to not really shun their availability, because again, you know, people are spending some time to research and create a library or a framework that they feel that they need. It's when it's being used and deployed: can we use it as efficiently as possible? That's it at the end of the day. Name the resource, someone's out there using it liberally. Can you make sure that, in employing this resource, you could just send it down the wire, or make use of it in the most disciplined fashion? And that's what it comes down to in the end. And that is the research that needs to take place.
when using these resources, especially JavaScript, like I said, and like you mentioned, between its availability and everything that it can do for us. Jeremy: [00:58:05] You talk about being in dev tools all the time. Are there specific parts of dev tools, or tools outside of dev tools, that you use to try and identify areas where there's performance problems? Henri: [00:58:20] So, dev tools. I mean, auditing a site can seem a little boring at times because you are essentially looking for very specific details. I personally love dev tools because, as scary as it can look at times, and there's still many parts of it that, you know, I tend to forget, or I get lost in, it will provide a lot of the information that you need on a page's health and what's taking place on the page. Now, with respect to tools in particular that I like, again, I'll say it, I love dev tools. Obviously, it would be impossible to do a proper performance audit without having something like WebPageTest handy, webpagetest.org, a product of the great mind of Patrick Meenan, a tool that's been called the Cadillac of performance. And again, not the prettiest to the naked eye, but a treasure trove of detail and information. And Patrick Meenan has done incredible work in making that available. So yeah, webpagetest.org is certainly something you need to keep around. I'm going to throw in Lighthouse. And I say that specifically because, if you take Lighthouse version one, two, right through to version six, there's been a very considerable amount of work and research that has gone into what we are seeing right now in Lighthouse, in terms of the information that you're provided, the recommendations, and the links they provide with the recommendations. They'll tell you, it's like, Hey, we see.
In these sort of like 10 resources that you might be able to save X amount of kilobytes, or megabytes even, if you do this, and that has helped in what I believe is, yeah, a maturation in understanding performance. And what you're also seeing right now, through Lighthouse, is people working to reach the mythical score of a hundred. You know, you have people sharing their performance scores and all the other audits that it takes in consideration, sharing them on Twitter, you know, saying that, you know, I have 89 right now. I'm totally working on the rest. This was not happening prior to Lighthouse. You know, having a performance conversation was kind of like speaking to a wall of bricks, but now people are just openly and willingly sharing, you know, their scores, which means that they are looking into ways of improving their site's performance. Whether or not they get into the deep entrails, separate story. But Google has been able to create this platform where, not all the low hanging fruits, but low to mid will sort of give you a good idea of what it's like to look after performance. So, I would say certainly dev tools, certainly WebPageTest. And again, I think Lighthouse has been very important in sort of raising the bar of performance. Jeremy: [01:01:48] Yeah, I think Lighthouse is interesting because it almost gamifies it. You get this score, like you said, you can post it in a tweet and say like, Hey, look at the score I got. There's also a target for people who make frameworks and things like that. Where, if you have a static site generator, you could say, Oh, the default template that you get from my generator, it's going to get you this score in Lighthouse. So I think it is very powerful to have that target or that goal that everybody can see. Yeah. Henri: [01:02:23] Yup. Absolutely. And it has been gamified, and I know that term is used quite often, and I definitely agree.
I mean, I don't want to make it sound like it's, like, a game game game, but it is something that people have been able to sort of, you know, look at and say, okay, this is where I want to be. You know, like this is the 10 second hundred meter. That's how I qualify. I've been running 10 fives. We're going to get it down to 10 soon. So I'm seeing these conversations come out of individuals I least expected to sort of talk about it. When that kind of adoption takes place, I think it needs to be sort of discussed and, to an extent, applauded. Jeremy: [01:03:07] I think in your work, it seems like you have a pulse on a lot of the different APIs that are available in the browser. Are there any that you think are being underutilized, things that people could be using that they aren't? Henri: [01:03:24] The thing about some of these APIs, quite a few are likely being underutilized, but what is underutilized ultimately? I think the bigger picture is, can you look at your personal goals with this particular site? Are they being met, and if they're not, what are the avenues to you meeting this goal that you have in mind? And then you'll step back and look at all your options. And, you know, for every particular API that's available there's probably a bit of a downside. Some of it might be sort of like complexity of use, you know, cause some are just not used quite often because it requires a bit more work and sometimes people are like, Oh, I don't feel like spending an afternoon having to do this. You know, I get back to the idea around animations, right. It could be the animated GIF. It could be potentially an MP4. It could actually be animated CSS, but that CSS, the SVG animation, is going to take a bit more work and you're like, eh, or are you going to just jump in and just say, Oh, F it, I'm just going to use this GIF for this animation, even though by the naked eye, you could tell that it could have been done in CSS, right?
So the same thing is happening with particular APIs where it's like, eh, using it on a page, it's probably going to take a little bit of work, which means if there's a mistake, it's like, I don't want to go in there and have to do all this, like, debugging X, Y, and Z. You start to look at some of these other options that might be a little easier. And then that might even mean that you might rewrite a particular part of the page cause it's like, eh, you've suddenly discovered that, you know, you could kind of strip this out and make it a bit more bare and make it easier overall to sort of manage. So that's part of the challenge with APIs. In fact, I remember, I think it was Steve Souders who said that, the User Timing API. So, yeah, it was the User Timing API, a great tool. Developer adoption, not so great, because it just meant that there was a bit more work taking place. And a lot of developers, they found, didn't want to get into that. And so that's why you start to have some of these other tools like SpeedCurve that'll sort of show you what needs to be improved and where you can sort of make some of these improvements, and hopefully without the sort of challenges of having to refactor some code when most developers don't want to. Jeremy: [01:05:58] Right. I think it's sort of a general belief that you only do the things that you need to, and if you have a simpler way of doing it, where you import some package from NPM, let some JavaScript library do it for you, then you'll do it until it becomes a problem. Henri: [01:06:15] Yep. Absolutely. And that's where, you know, the whole idea of, I'm sure you remember this sort of sudden push for vanilla JavaScript, because people were just jumping in and grabbing all these libraries to do some of the simplest work, and people importing a particular library of, you know, like a hundred, 200K, when it could have been like 30K of just vanilla JavaScript.
You know, so, as you said, people just don't like to do the heavy lifting. Jeremy: [01:06:42] Yeah, I mean, I think we can all identify with, if we don't have to do the work, then we don't want to do the work. Henri: [01:06:49] And that's it. I mean, and I never fault developers for that. That's where we're at, you know, trying to make sure that people can sort of do a bit of the work and get some of the reward without going heavily into the entrails. Jeremy: [01:07:02] For sure. I think that's a good place to start wrapping up, but are there any other things you think I should have asked or anything you thought we should have talked about? Henri: [01:07:13] I mean, not really. It's just, you know, again, it's just a fun conversation and, hopefully not to the audience's detriment, sometimes, you know, my mind starts to run in these different directions the minute I make one statement and I'm like, Oh, oh, oh. I kind of want to mention this too. I mean, we did spend a bit of time on the images side, but it is something that I've found fascinating over the last few years. And, you know, I've followed quite a few developers and engineers who have been deep into that research. So anyone who wasn't into images, I certainly do want to apologize. But if you are, I hope you enjoyed it and enjoyed that little triggering from my Bluetooth. That's telling me that, dude, it's kinda like, Oh man, what's his name? The comedian, he had the wrap it up box. Jeremy: [01:08:03] Oh, really? No, I hadn't heard of that. Henri: [01:08:06] Oh man. Why am I forgetting his name? Oh, anyways, whatever. Jeremy: [01:08:10] Yeah, it's trying to play you off the stage. Henri: [01:08:12] Exactly, exactly. It's like the big hook keeps missing my neck. But you know, again, there's so many things around performance that can be discussed.
You know, I recently listened to a podcast which featured an engineer from Facebook, and they were talking for an hour about HTTP3 and QUIC, and I thought that was fascinating. And that's some of the heavy duty lifting that's taking place, but for very obvious reasons. And it's certainly going to play well into the future, especially this future that we have right now, that's going to be heavily laden with media, you know, images and certainly video. For obvious reasons: people are working from home now, students are going to be learning from home. So there's going to be a lot of streaming taking place. And that's the next hurdle in dealing with media management and dealing with performance. Jeremy: [01:09:02] Yeah, I think my hope as a developer is that these new things, whether they be HTTP3, or they be the new video codecs, new formats, I'm hoping that we get APIs or get supporting libraries that make it almost transparent to us. That make it so that I just say, I want to post this video, and somebody else, like probably the people who are on that podcast, have done the hard work of making sure that goes over the right transport, gets encoded in the right video format, split up if it needs to get split up, that sort of thing. That's my hope anyways. Henri: [01:09:43] Yeah. I mean, it's certainly everything that we'd like to see, because as it becomes seamless and, like you said, effortless for us, ideally that can be sort of also transmitted down to the user. I mean, because ultimately they're the ones having to, you know, follow along the online class. They're the ones who have to do the live, remote, you know, yoga workshop or whatever it is. I think there are a lot of discoveries that are going to take place in the next little while, because of the fact that our world is going to be about us consuming a lot more video content. This was predicted to be happening two to three years from now.
Obviously this pandemic, basically, cut that into like a third, we are more than ever dependent on the transmission of video across the net. And I think it's incredible that even our conversation is taking place right now in such a clear and seamless fashion, back to school, is about like a month from now, we're about to see the net just they're going to be pulling carts, you know? And, it's very interesting to see how that's going to take place because you know, we're talking about the pandemic hitting like mid-March and there was still confusion as to how this was going to play off of the rest of the year. Now it's like, we're getting ready for what might be an entire year of remote everything. let's see what happens. Jeremy: [01:11:16] For sure. This is another big test for the web. Henri: [01:11:21] Yep. Totally. You know, the web is just there, like wiping its forehead, like, Oh my God, like what just happened? And how am I going to make it, make my way through this? So, no man, it's, it's super interesting and I'm sure we're going to, you know, maybe a year from now have another discovery, and some engineering feat that's hopefully going to make things better for everyone involved. Jeremy: [01:11:46] For sure. If people want to see what you're doing next, I know you do a lot of conference talks and things like that. Where should they check you out? Henri: [01:11:55] I would say exclusively, Twitter, you know, I'm a big fan of the platform, so, Henri Helvetica. So H E N R I and Helvetica, as you know it to be spelled. I'm planning to have this 1.0, 2.0, release of work that I've wanted to do probably in September. So, I've shared this privately with a few people but, a blog is coming. I'm probably going to do a series of short YouTube videos talking about little things from browsers to a short podcast on performance, but a light conversation again, as we just did, but probably a little shorter. 
But yeah, Twitter is certainly the place to find me, whether it be, in public or in my DMs as well. Jeremy: [01:12:44] Awesome. I'm looking forward to seeing the YouTube channel and the podcast. Henri: [01:12:49] Absolument and Jeremy, man, I definitely want to thank you for A) Your patience. Because like I said, my AV hurdles, weren't the most fun. So this is a conversation that's long in the making and thanks for having me on the show. Jeremy: [01:13:04] Yeah, thanks for coming on. It's been a real pleasure Henri. Henri: [01:13:07] Merci Merci. And when I get my thing going, I'll be sure to ping you to have you as a guest. Cause I'll tell you about this other little project I have as well. Jeremy: [01:13:17] Nice, I'm looking forward to it. Henri: [01:13:19] Yeah, man, it's going to be fun. Jeremy: [01:13:20] That's going to do it for this episode. If you're interested in learning more about web performance, we have all the tools and APIs we discussed in this episode, in the show notes. And if you have questions, I'm sure that Henri would love to hear from you. The music in this episode was by Crystal Cola. Thanks again for listening and I'll see you next time.

Aug 12, 2020 • 1h 19min

The battle for your privacy on the web with Pete Snyder

Pete is the senior security researcher at Brave Software and the co-chair of the W3C Privacy Interest Group.

We discuss:

- The differences between academic research and product development
- How websites track us with cookies, fingerprinting, and other techniques
- The surprising amount of data your browser gives up without a permission prompt
- What features should go in the browser vs being native only
- The confusion behind what incognito or private modes do
- The role of EasyList and the underappreciated people behind it
- Building tools like PageGraph to identify and rewrite tracking code
- Replacing resources at page load to preserve privacy without breaking websites
- Developing web standards at the W3C while preserving privacy
- Brave's plan to fund websites with ads instead of paywalls while preserving user privacy
- Getting involved in privacy and web standards

Related Links

- @pes10k
- Personal Site
- Privacy Interest Group (PING)
- W3C Privacy Community Group
- Research at Brave
- SpeedReader: Fast and Private Reader Mode for the Web
- Detecting Filter List Evasion With Event-Loop-Turn Granularity JavaScript Signatures
- EasyList
- uBlock Origin
- Panopticlick
- Protecting Against HSTS Abuse (WebKit)
- What’s the Difference Between First-Party and Third-Party Cookies?
- Understanding Redirection-Based Tracking
- MediaDevices.enumerateDevices() (Specifications give server all devices and labels without asking for permission)
- CSS difficulty with adoption (CSS2 spec was written without an implementation causing problems)
- What Are CSS Vendor or Browser Prefixes?
- FPRandom: Randomizing core browser objects to break advanced device fingerprinting techniques
- Brave Fingerprinting Protections v2: Farbling for Greater Good
- PageGraph
- Puppeteer
- Target: You Can’t Hide That Baby Bump From Us
- Brave Ads FAQ
- LUMAscape

Music by Crystal Cola: 12:30 AM / Orion

Transcript

Jeremy: [00:00:00] In this episode of Software Sessions, I'm talking to Pete Snyder about the many ways websites track us.
How ad blockers like uBlock Origin work. And the process of developing web standards with privacy in mind. We start by discussing his role as a senior privacy researcher at Brave Software.

Pete: [00:00:18] Brave is kind of interesting or unique as a startup in that we have a proper research lab. I think our research team is seven or eight people right now. Those are people who do research in the form of published publications but also doing research that ties back into product in some way.

My research responsibilities are to figure out new ways that you can improve browser privacy, address tracking on the web, and solve the kinds of problems that Brave is interested in solving. I have one foot in engineering world and one foot in publishing world.

Jeremy: [00:00:48] Why is academic research important in this space?

Pete: [00:00:52] My gut feeling is that what's useful about academic research is that it changes the incentives and it gives you a chance to do things that are more novel and particularly things that are less tied to a short term ROI cycle. That is particularly useful for things that have watchdog functions over industry, things that are more difficult to monetize but more useful to average web users.

That's not to say there aren't people who try to build businesses around privacy or responsible computing but the incentives don't always work that way. What's really neat about doing a research focused computing career is you can do things that don't have to make somebody money in the short term. You can pick more oddball projects. The things that might not come to fruition right away.

Jeremy: [00:01:36] And is there a key difference in how you approach a problem when you're doing it in an academic context versus as a product for a company?

Pete: [00:01:46] Sure. So they go both ways. If I'm working for something at Brave the emphasis is on correctness and certainty.
And knowing that when we ship it to 10 million people or whatever that it's not going to break and it's going to do what it says on the tin and that it's going to be a material improvement over the state of things before we ship that feature.

And that's really different than if you're trying to come up with a research project where.. sometimes good, sometimes bad, but the emphasis is not necessarily on a hundred percent correctness but is on novelty and doing something or figuring out some way to solve a problem in a way that it hasn't been tackled before.

And so you'll read research papers that say it works 95% of the time and that'll be sufficient or compelling for a research paper. But you wouldn't want to ship something that breaks 1 out of 20 websites if you're actually making a product. The goals are different, but also the success criteria are different.

Jeremy: [00:02:39] So it sounds like you can tackle things where it wouldn't be good enough for a product yet. But it's something that if you were working on it within the context of a company, they might say: Oh, we're not going to do that because it just doesn't seem like it's going to work.

Pete: [00:02:54] Yeah, exactly. So, maybe because certainty of success isn't there, or there isn't a one or two step obvious path to being a product. Maybe it conflicts with the current business goal or whatever else. But yeah, you have much more latitude in terms of products you can choose and kind of problems you want to tackle.

If you're writing research papers and not that I'm some incredible researcher or anything, but if you try to do successful research it doesn't reward you to solve that final 5% of the problem.

There's no benefit to getting no, not none, but there's a small benefit of going from 95% to 99% success or accuracy.
On product you have to grind out as close to a hundred as you can get.

Jeremy: [00:03:37] And do you have examples of things where you worked on it in a research context and it actually became a part of a product?

Pete: [00:03:46] Sure. Yeah. So a couple of things. One is that.. so there's a research paper that we wrote at Brave called Speed Reader. Speed Reader is a different way of doing a reader mode in a browser. Right now, if you use any of the reader modes in popular browsers, you download the page, you render some subset of it.

You throw some JavaScript at it and it extracts sections that it thinks are useful, then it presents a new page to you. That's not a hundred percent correct. Chrome's DOM distiller does something slightly different, but to approximation you render the page and then you extract stuff out of it. Brave Speed Reader does something different.

It intercepts it at a network layer. It examines the text HTML. Does the analysis there and then feeds that back to the rendering engine. And so there's a bunch of nice benefits there. There's a privacy improvement in that you're executing less code, talking with less third parties. There's a performance improvement as well in that you don't have to do the initial displaying and tear all that stuff down and build it back up. So that was the research paper that we published at WWW 2018, 2019, I don't remember, but either a year or two ago. And it's now in beta in Brave. That's maybe the oldest one. The most recent one was a project that I did last summer with a student from North Carolina, Quan Chen, on figuring out ways that we can do blocking better.

Right now, if you're using a privacy tool in a browser in most cases you're downloading a big list of things that should get blocked. They look kind of like regular expressions. It says: Yes, block this. No, don't block that, and it's a useful thing.

But it has the trade off of it's very easy to circumvent that.
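To make the list-based approach concrete, here is a minimal Python sketch of matching URLs against block rules and exception rules. Everything in it is an assumption for illustration: the rule patterns, the `should_block` helper, and the example URLs are hypothetical, and real lists such as EasyList use their own filter syntax rather than plain regular expressions.

```python
import re

# Hypothetical rules for illustration only; real filter lists (e.g. EasyList)
# use their own syntax rather than raw regular expressions.
BLOCK_RULES = [
    re.compile(r"^https?://tracker\.example/"),
    re.compile(r"/ads/"),
]
ALLOW_RULES = [
    re.compile(r"^https?://news\.example/ads/policy\.html$"),
]


def should_block(url: str) -> bool:
    """Return True if a block rule matches and no allow (exception) rule does."""
    if any(rule.search(url) for rule in ALLOW_RULES):
        return False
    return any(rule.search(url) for rule in BLOCK_RULES)


print(should_block("https://tracker.example/pixel.js"))      # matches a block rule
print(should_block("https://news.example/ads/policy.html"))  # exception rule wins
print(should_block("https://cdn.example/pixel.js"))          # same file name, new host
```

The last call is not blocked even if the host serves the very same script, which is the weakness of matching on URLs alone.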
Somebody can just change the URL, move it to a different domain, inline it in the page, whatever else. And so the approach that we took from this paper is.. Let's not focus on the URL. Let's build signatures of the execution paths of these scripts and we can use that as the oracle to identify this is known bad or known good. That machinery ended up being very complicated and it isn't something we want to ship to all of our users because of the performance hit. It's something that we use for generating filter lists that we should download to users regularly.

Jeremy: [00:05:40] Existing projects you were saying are just looking at a list of URLs. And you said using something like regular expressions to figure out if the URL it's pulling is on that list. The part I wasn't clear about is how the new way that you were describing worked.

Pete: [00:05:57] Yeah. The alternative approach that we came up with is to instead.. Not care about where the code came from or even how the code is structured. So if it's obfuscated in some way, but instead to look at the DOM and JavaScript operations that the code executes and sequence those and use that as the identifying signature of the code.

There's some cleverness in there that makes that particularly difficult to do in JavaScript versus other languages. But at a high level, it was saying let's identify things based on their behavior, not on their source.

Jeremy: [00:06:28] And so would that be where the browser would have to load the script.. see how it would affect the DOM. And then based on that, you would determine whether or not this was something that was probably showing you an ad or trying to track you, that sort of thing?

Pete: [00:06:43] Yes. The way the project works toe to tip is..
There's these long, long, long lists of things that people previously have identified as being tracking related or ad related. Those are things like EasyList and EasyPrivacy and the uBlock Origin lists and all this kind of stuff.

And so you can throw those at the web and you get some kind of nice labeled data set of these things are tracking and ad related, these things are benign. So you can run those, execute those files or load those pages and get signatures of how that code operates in the page. And so now you have your ground truth signatures of this is what known bad code does. And this is what known good code does. Then you can run that stuff against a bunch of things that you don't know the labels of and you can rebuild those labels on top of this code that people have examined before. And so you can do a couple of things with that. You can either use that to build even more lists in an automated way. You can use it to do code rewriting since some parts are good and some parts are bad. You can use it for online blocking, things like that.

Jeremy: [00:07:38] You're basically looking at things that people have identified as being bad behavior or tracking behavior. And you can load things that you haven't seen before and use that to instead of having a human curate the list you could have your code load things that it hasn't seen before and figure out.. Oh, this looks like this thing that I've seen before that somebody said was bad. And so I'm going to make a new list based on that.

Pete: [00:08:08] Yeah, exactly. For the Alexa 1000 or Alexa 10,000, like the most popular sites on the web, those have a lot of eyes on them. And so things that are tracking related get picked up pretty fast on those.
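Returning to the signature idea described in this exchange, the sketch below shows the general shape of labeling code by behavior rather than by URL. It is a deliberately simplified illustration and not the method from the paper, which Pete notes is considerably more complicated: the operation names, the `LABELED_SIGNATURES` table, and the exact-match lookup are all assumptions.

```python
# Hypothetical ground truth: ordered DOM/JavaScript operations observed while
# running scripts that existing filter lists already label as tracking or benign.
LABELED_SIGNATURES = {
    ("read document.cookie", "read navigator.userAgent", "send network request"): "tracking",
    ("query DOM node", "set textContent"): "benign",
}


def signature(operations):
    """Reduce a script to the ordered sequence of operations it performed.

    The script's URL and source text are deliberately ignored, so moving or
    obfuscating the code does not change the signature.
    """
    return tuple(operations)


def label_unknown(operations):
    """Label an unexamined script by comparing its behavior to known signatures."""
    return LABELED_SIGNATURES.get(signature(operations), "unknown")


# A script from an unfamiliar domain that behaves like a known tracker.
observed = ["read document.cookie", "read navigator.userAgent", "send network request"]
print(label_unknown(observed))
```

Exact tuple matching is the simplest possible comparison; anything that reorders or pads its operations would slip past it, which hints at why the real machinery is more involved.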
But for the long, long, long tail of the web, that stuff is barely examined or at least has a lot fewer eyes on it. And so this is a way that you can use the code that people have looked at to identify code that fewer people have looked at. Jeremy: [00:08:32] In a broad sense, how deeply are people being tracked and do you think people are aware of just how deeply they're being tracked? Pete: [00:08:41] So in the first case, unimaginably. The amount of web surveillance and offline surveillance that people undergo.. unimaginable. Large amount. And then the second case, very little. You'll find these tools like Brave or the new version of Safari or Adblock Plus or any of these.. uBlock Origin. Good tools by people who are sincerely trying to reduce this stuff. And they'll put a little number in the URL bar and it'll say 10 trackers on this page or whatever. And you'll go to some new site and it'll have 95 or whatever. And that's just the known bad stuff. I think people have very little understanding of how atrocious the situation is. Jeremy: [00:09:24] And what are some of the ways that people are being tracked by their browser? Pete: [00:09:29] Oh. Well. In most cases, the tracking isn't being done by the browser. That's not always the case. Chrome absolutely does observe things about what you're doing and sends it to the Google mothership. But in general the tracking isn't happening because of the browser itself. But rather by things the browser is loading because the web pages tell it to. So there's a whole long tail, from extremely boring, everybody-understands kind of things that have been around for 20 years to more weirdo stuff. By far still the most common method people are tracked is just I drop a cookie. I drop a cookie on one site. I fetch the same image on another site. And so the cookie gets resent and that's my way of learning the same person visited site A and site B. 
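The cookie flow Pete describes can be sketched in a few lines of Python. This is a toy model for illustration only, not anything from the conversation: the `Tracker` and `Browser` classes, the `tracker.example` domain, and the site names are all hypothetical, and real browsers apply far more nuanced cookie policies than a single on/off switch.

```python
import uuid
from collections import defaultdict


class Tracker:
    """Hypothetical third-party server that hosts one image embedded on many sites."""

    def __init__(self):
        # cookie id -> list of sites that embedded the image for that id
        self.visits = defaultdict(list)

    def serve_image(self, cookie, embedding_site):
        # Reuse the id the browser sent back, or mint a new one.
        user_id = cookie or uuid.uuid4().hex
        self.visits[user_id].append(embedding_site)
        return user_id  # the value the tracker would put in Set-Cookie


class Browser:
    """Toy browser with a cookie jar keyed by the domain that set the cookie."""

    def __init__(self, send_third_party_cookies=True):
        self.cookie_jar = {}
        self.send_third_party_cookies = send_third_party_cookies

    def visit(self, site, tracker, tracker_domain="tracker.example"):
        # Loading `site` triggers a sub-request for the tracker's image.
        cookie = None
        if self.send_third_party_cookies:
            cookie = self.cookie_jar.get(tracker_domain)
        new_cookie = tracker.serve_image(cookie, site)
        if self.send_third_party_cookies:
            self.cookie_jar[tracker_domain] = new_cookie


# A browser that sends third-party cookies lets the tracker link both visits.
linking_tracker = Tracker()
permissive = Browser(send_third_party_cookies=True)
permissive.visit("site-a.example", linking_tracker)
permissive.visit("site-b.example", linking_tracker)

# A browser that blocks them looks like two unrelated visitors.
blind_tracker = Tracker()
blocking = Browser(send_third_party_cookies=False)
blocking.visit("site-a.example", blind_tracker)
blocking.visit("site-b.example", blind_tracker)
```

In the first case the tracker ends up with one identifier that has seen both sites. In the second it holds two identifiers with one site each, which is the effect of blocking third-party cookies.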
Web browsers like Safari and Brave will never send third party cookies, with a very small number of exceptions. Firefox and Edge have a kind of complicated system for determining when they send third party cookies but do a good job of not sending them to the worst offenders. Then things get slightly more sophisticated. Instead of dropping cookies maybe what you'll do is throw storage into other places where people don't usually look for it. Right now there's at least four or five different APIs, six different APIs you can use to have persistent storage on the browser through JavaScript. Then there's a whole long tail of ways that things can get cached that also turn into persistent identifiers. That's maybe the second weirdest or second most understood. So then there's a whole bunch of places where the browser is implicitly keeping universal global state that you wouldn't necessarily think about as being a tracking vector, but anytime you have global state, you have the mechanism you need for tracking. And the most frustrating example is something called HSTS tracking or HSTS cookies. It's an abbreviation for a header that a website can send you that says, always automatically upgrade this request to an encrypted version, to an HTTPS version, even if it's requested over HTTP. Just in general what would happen is.. I make a request to some website I like, and it's going to be HTTPS. HSTS instructions are not respected over HTTP generally. But I make a request to a website I like, and it sends back this HSTS instruction that says good. Now that we have a secure conversation, I want you to never ever not communicate with me over a secure channel. So we got this one secure communication. We're going to use this as the kernel of trust to build the rest of our communication over. And so, that instructs me every time I visit the site again, automatically the browser will know to add an S to the HTTP. 
And the same thing is true for any sub requests or in general, until people started coming up with counter measures, the same thing is true of any sub requests as well.Jeremy: [00:11:59] If I understand correctly, you make a request to a URL and it tells your browser, in the future you try to go to this URL and you don't put in HTTPS to automatically go to HTTPS instead. And the part that I don't quite follow is how is that used to uniquely identify you?Pete: [00:12:19] Oh, okay. Step one is I make a request to your website it's a secure connection. Then on your website you have 26 different images. You know, an a b c d an e and so my browser will make new requests for each of those images. And those images in this configuration are each hosted on different sub domains on your website.A image off a.you.com that kind of thing. So now, if those are all requested over a secure channel, your server can decide I'm going to send the HSTS instruction or not for each request. So I'll get back to those images. And, I'll have, more or less 13 new HSTS instructions.And those will be different for me than they will be for you for anybody else. Just flipping a coin enough times that's like the setting, the identifier step. And then now a week later or whatever you want to identify me again. I clear my cookies and everything, so I think I'm not identifiable but I come back to your site and now you'll have me request the same 26 images, but over HTTP, not a secure channel. And now my browser will upgrade more or less 13 of those images. Your server will look to see which 13 images got upgraded. And now that'll be unique to me versus everybody else in the world.Jeremy: [00:13:28] I see. Wow. So it's a feature that had good intent, but the way that people are actually using it is they're building like a fingerprint right? 
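The 26-image scheme Pete walks through can be modeled as writing and reading a 26-bit number. The Python sketch below is a simulation under stated assumptions, not real networking code: the `Browser` class, `tag_browser`, and `read_tag` are hypothetical names, the subdomains follow the a.you.com pattern from the conversation, and the random subset stands in for the server "flipping a coin" per image.

```python
import secrets
import string

# One subdomain per letter: a.you.com ... z.you.com
SUBDOMAINS = [f"{letter}.you.com" for letter in string.ascii_lowercase]


class Browser:
    """Toy browser that keeps cookies and HSTS state in separate stores."""

    def __init__(self):
        self.cookies = {}
        self.hsts_hosts = set()

    def clear_cookies(self):
        # Clearing cookies leaves the HSTS store untouched in this model.
        self.cookies.clear()

    def scheme_for(self, host):
        # A plain-HTTP request is upgraded if the host previously sent HSTS.
        return "https" if host in self.hsts_hosts else "http"


def tag_browser(browser):
    """First visit over HTTPS: send the HSTS header on a random subset of hosts."""
    bits = [secrets.randbelow(2) for _ in SUBDOMAINS]
    for host, bit in zip(SUBDOMAINS, bits):
        if bit:
            browser.hsts_hosts.add(host)
    return bits


def read_tag(browser):
    """Later visit: request every image over HTTP and note which were upgraded."""
    return [1 if browser.scheme_for(host) == "https" else 0 for host in SUBDOMAINS]


visitor = Browser()
stored_bits = tag_browser(visitor)
visitor.clear_cookies()
recovered_bits = read_tag(visitor)
```

With 26 hosts the server can hand out 2**26 (about 67 million) distinct patterns, and adding more subdomains grows that space. The pattern survives `clear_cookies()` because it never lived in the cookie jar.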
They know for each URL which one they told you to upgrade to HTTPS and which ones not.And like you said, so even if you clear your cookies or whatever the URLs that should be upgraded based on HSTS they're still stored in your browser.Pete: [00:13:56] Yep. And so there's a long tail of these kinds of things where they were added to the web platform or to browsers or to the internet infrastructure for largely completely benign or really desirable reasons. But because of the way they've been implemented or because of the way clever people have misused them they become tracking vectors.HSTS is not at all the only one. It's just the one that is kind of the most galling because it's supposed to be helping people's security and ends up hurting their privacy or can hurt their privacy.Jeremy: [00:14:24] Right. So in the past you were talking about classic tracking that was making use of cookies. and that's something that gets stored in the user's browser. And my understanding is that for cookies in the past, you would go to a site and in order to track somebody while you're on that site you're making a request to another domain, right? To a tracking domain. And as you go from site to site, those other sites use a cookie from that same tracking domain. So that's what you consider a third party cookie. Is that correct?Pete: [00:15:00] Yeah. So I would go to your site. Your site is benign.com your site includes an image from tracker.com. I get the request back from tracker.com. It tells me to save a cookie from tracker.com. I go to some new website, it also requests tracker.com and now tracker.com can link those, those views.Jeremy: [00:15:16] And so now you were saying a lot of browsers like Brave and Safari and Firefox are starting to block third party cookies. What are some of the ways that sites are working around that?Pete: [00:15:30] So there's a long tail. Some people have been moving to these more esoteric tracking things. 
So HSTS tracking is one thing that people in the wild were doing particularly against Safari when Safari started blocking third party cookies. They've been moving to different types of identifiers in the browser.So maybe I don't store something in a third party cookie, but I set a cookie on the first party cookie jar. And then I just append that to my request. So things like Google analytics do that. That's called riding on the first party cookie jar because even though the code is executed by say Google analytics or any number of other tracking scripts. It's actually living in your cookie jar, the cookie jar is associated with your origin. So those are two and then it just kind of gets more oddball from there. So there's, browser fingerprinting, if you're familiar with this, which is ways of finding like things that are semi identifying or semi unique in your browser, then build up enough of them. Very quickly, I can identify a large number of people uniquely. It's like, guess who, and you just split the population in half enough times, so that's done extremely commonly on the web. It's very, very common. And then there's a new kind of thing called bounce tracking. Not, it's not that new but increasingly common kind of thing called bounce tracking where, Different browsers will only let you set third-party state if you visited them in the first party context. And so websites will play these games where they'll just forward you through a long number of first parties before you get to where you want to go.And now all these things can set third party cookies on in iFrames and things like that. I could go on and on. There's an endless number of ways these things are done, But getting rid of third party cookies is definitely an extremely helpful thing, but it's not the end all be all of web privacy.Jeremy: [00:17:05] One of the things you mentioned was browser fingerprinting what are some of the ways that, people's browsers get fingerprinted?Pete: [00:17:12] Sure. 
At a high level browser fingerprinting is looking for a bunch of things that are going to be different between people's browsers. They don't have to be unique to a single person, but there'll be minor configuration differences or subtleties that are different between my browser and your browser and somebody else's browser.For example, English is a very common language that's spoken on the web, and so if I know that's someone's language, maybe that identifies me. 1 out of 20 people speak English in the web or something like that. Yeah. Some number that's less than one or less than a hundred, I mean, a hundred percent.And so then now I've cut out a large portion of the web and now I look at and see, is this person running a Mac? That's going to also cut the search space down. And I look at this person, how many devices do they have plugged into their machine? That'll shrink it down further. What are the kinds of peculiarities of their graphics card when their different drawing operations does it draw a line in a slightly different way than on a different graphics card.That'll shrink it down further. Does this system have odd fonts installed. It'll shrink it down further, et cetera, et cetera, et cetera. And if you have enough of these kinds of things, you can pretty quickly put a lot of people in a bucket of size one.Jeremy: [00:18:20] And one of the things you mentioned is that you can identify what devices are plugged into your computer and maybe what your graphics card is. What are some things you think people would find surprising about how much your browser is actually sending to the server that you're visiting?Pete: [00:18:39] So it depends on the browser that you're using. But that's a good question. 
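The "shrink it down further" reasoning can be put into rough numbers. The Python sketch below assumes each attribute is independent of the others, which real fingerprinting data does not guarantee, and every fraction in it is a made-up illustrative value rather than a measured one. The function names are hypothetical.

```python
import math


def surprisal_bits(fraction):
    """Bits of identifying information gained from an attribute shared by `fraction` of users."""
    return -math.log2(fraction)


def remaining_population(population, fractions):
    """Expected bucket size after filtering on each attribute, assuming independence."""
    for fraction in fractions:
        population *= fraction
    return population


# Illustrative, invented fractions of users sharing each observed attribute.
observed = {
    "language": 0.25,
    "operating system": 0.10,
    "number of attached devices": 0.05,
    "graphics card rendering quirk": 0.01,
    "unusual installed font": 0.001,
}

total_bits = sum(surprisal_bits(f) for f in observed.values())
bucket = remaining_population(8_000_000_000, observed.values())
```

Roughly 33 bits are enough to single out one person among 8 billion, since 2**33 is about 8.6 billion. Each attribute contributes its own share of bits, which is the "Guess Who" halving Pete mentions applied with uneven splits.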
I expect people to be surprised that any webpage that you don't trust this is without a permission, websites can in some browsers enumerate all the devices you have installed, like the labels, the type, this kind of thing.They can learn about what kind of network connection you're on, whether you're plugged in. If you're in Chrome, the most popular browser you can learn about, the kinds of network errors the person is observing, which can be very identifying if you're moving between networks and they have different kinds of DNS configurations. If somebody has a two factor authentication device, like a hardware key, you can learn some things about the hardware key, like the strength of it.Even if you wouldn't expect a website can access that automatically. Those are just things the browser is intending for websites to be able to access. That's not like cleverness. That's just like, there is an API that will tell me specifically this. And then there's a long list of other things with a moderate amount of cleverness you can figure it out as well.Jeremy: [00:19:36] I think a lot of people are familiar with the fact that when they go to websites, it'll ask for permission to use a microphone or to use GPS, things like that. For some of these other things that you're referring to, that it's able to figure out such as your, your devices that are connected, things like that.Is that something where the person is giving permission or is that something that just just happens?Pete: [00:19:58] Yeah, so that, that just happens. So I represent Brave on the W3C and I co-chair, one of the privacy groups on the W3C, the horizontal review group that reviews specs for privacy. And that's something that we're working with the working group that authors that spec, the media and capture media streams capture group to improve that API.But that is just the way the API works in the standard right now. The website, says.. tell me about the devices the machine supports. 
It gets back a list of every device that the machine knows about that has like an AV component. And then the website now says, okay, I would like to access this one. And then you get the permission dialogue. But without any permission you can learn all the devices and all the labels and the categories and this sort of thing. I should say that the spec looks like it's getting much better. That working group has been really good to work with and it's been really receptive to those concerns, but that's the way it ships right now in Chrome and Edge. Brave makes some modifications to it. I don't know about the other ones, but that's the way the spec standard is written. Jeremy: [00:20:59] And that's interesting because you're saying the standard is currently being written but a lot of these different browsers, they already have an implementation of it. Pete: [00:21:09] Yeah. So this is another one of those good intentions that turned out to have unintended consequences kind of situations. So it used to be the case that you're writing a web standard. A bunch of people would get together and work on the standard. And then when it was done, people were supposed to implement it. And then, this is a rough history. I'm sure I'm going to get some of the details wrong but to a rough approximation something like CSS2 happened where you ended up with a standard that was basically unimplementable and it had all these kinds of subtleties that hadn't actually been worked out because nobody had implemented it yet. There's other cases too and CSS2 might not have been the tipping point but it was definitely a famous example. Then there was a CSS 2.1 standard that came out. When people started implementing it, it had to get revised in certain ways to make it actually work in the world. Hand-waving simplification, but people thought this is not great. We need to like, actually build these things as we're talking about them, to make sure that they work in the world. 
And so then you got into this kind of prefix situation. If you're a web developer, you'll be familiar with this. Until pretty recently, you had all these prefixed extensions like rounded corners, but WebKit rounded corners, and Microsoft rounded corners and Mozilla. Yeah. And you had similar things in the DOM. And there's still some hangover where a bunch of specs in most browsers are implemented twice. Once WebKit-prefixed, things like that. And so understandably people thought this is not great. Now I have to write my code four different times. And so, right now, if you're trying to get a standard finished in the W3C, I'm less familiar with other standards organizations, but in the W3C, you need to have two working implementations, two independent implementations. They don't necessarily need to be shipping unflagged. The way it's supposed to work in the best cases is they're running in the browser and there's some flag that you flip or it's only enabled for some set of websites, but that's not always the case. Things getting shipped as they're being designed in the standards body absolutely is a little bit begging the question, right? In that if you find out there's a problem during review with the spec, well, now a whole bunch of websites have already started depending on that certain functionality and have it baked in. That is a real pickle and something that we fight with a lot during these reviews. But, yeah, that's the less than ideal situation that things have come to at this point. I think it's getting better, but that's generally how things are done right now. Jeremy: [00:23:20] That's kind of interesting because it sounds like you have the W3C and you're planning on things that will go into the browser, but in order for them to become a standard, they need to already be in the browser. Which also means, like you said, that people are already using them. 
I wonder what is the negotiation or the back and forth in terms of, let's say Chrome is already using a certain feature and you say we'd like you to change this feature for this reason.They'll say well, we already have thousands of sites that are using this. how are we going to change this? What does that, that back and forth look like?Pete: [00:23:59] Yeah. So, the first thing to keep in mind is that the standards body is not a legal organization. The standards body can't make anybody do anything, they can say something's out of standard. They can remove it, but people listen to a standards body if they want to listen to a standards body and they don't, if they don't.So in that sense, a standards body works by trying to make it mutually beneficial for people to go along with it and providing resources that maybe an organization wouldn't have if they weren't in the standards body and some benefit in terms of interoperability, that sort of thing to strengthen the platform in general.So that being said, there's not an easy answer. Sometimes you can find clever games you can play with the way the existing APIs work that will reduce the amount of information exposed but without breaking the function signatures or the expected flow of the program. There's several different tiers of review in the W3C and some happen earlier than others.What we've been trying to do is push review earlier in the process to try to catch these things while they're still in prototype stage, instead of, web scale stage. But yeah, there's no way that the standards body can go to mr and mrs. Google and say, you must do a thing or, Mr. and Mrs. Apple or whatever else. Unfortunately that's just the case.Jeremy: [00:25:13] And you were saying, how you're the co-chair of the privacy interest group on the W3C. How much power would you say that that group has? 
Do you have examples of times where somebody has tried to push a feature through and you've rejected it on the basis of privacy and there's actually been changes made or the feature has been dropped altogether? Pete: [00:25:39] I should say in terms of power we're a subset of the W3C and the W3C at the end of the day has, the organization has maybe some moral authority or some, you know, people respect it. And so there's some "soft", in quotation marks, power there. And there's a lot of expertise that people respect in the W3C. And there's a lot of mutual interest between the browser vendors to have a web that is friendly for developers and friendly for users. So, there's no authority, but web browsers are interested in sometimes what the W3C has because of these other reasons. So power is a funny word to use there, though I take your point. To point out very specific changes that have been made, I'll talk about the WebRTC one, the enumerate devices API, because that's the one we just mentioned before. So right now, partially because of interest among people in the working group, partially because of reviews that PING has done. PING is the privacy interest group. There's both immediate changes that look like they're going to go into the API. None of this is a hundred percent. This is all in the GitHub issues right now. But this is the direction it looks like things are going, best guess. That there'll be a number of ways of shrinking the amount of information that websites can access by default. So it's things like, without a permission, maybe you don't see this person has 18 different microphones. They just see there's at least one microphone. And then when the website asks for permission then they can learn about the number, the details, things like that. So that's a wonderful thing. 
I think the working group would agree that it's not the dream outcome, but it's dramatically better than what was there earlier.And then we're also working with the working group. They're doing great work in terms of figuring out like where can we go to that's even better? And so that looks like it'll be the website doesn't see anything by default. If the website wants to see devices, they call a thing, the browser prompts you with a list of devices.And then if you want to, that information gets passed. that's a example that comes to my mind. There's a long list of if it's of interest to your listeners too. I could also point you to the endless list of privacy issues that we raised and the back and forth that happens on them there.But sometimes they're very large things like that. Sometimes they're very small things like you're leaking fingerprinting bits here and let's figure out a way to sort through that.Jeremy: [00:27:44] One of the things I find interesting about your position is you work for brave, which is a browser vendor, and you have Microsoft with Edge. You've got Google with Chrome, Apple with Safari, and Mozilla with Firefox. And I would imagine that all of these different companies, they all have their own goals.They all have their own things that they want. I wonder from your perspective, what are the kinds of roles that each of these companies play and where do they butt heads and where are they on the same page?Pete: [00:28:19] I want to say things that I'm only very confident about. All these organizations and particularly the people that are sitting in these committees and these working groups that represent these organizations have an interest in the web.They see there's something unique about the web that that is appealing and desirable and positive. That's not on other platforms. 
They may not a hundred percent agree on what those positive things are, but there's something that appeals to us about the web that doesn't exist on other platforms.And so there's mutual interest there. I also think that all these people care about privacy, they care about making sure things are accessible to people with different needs on the web. They care about making sure APIs work well for people who speak different languages and come from different backgrounds and these sorts of things.At the end of the day, people who choose to spend their time in these long meetings working with each other.. We have very similar interests and we're all pushing the same way. Where they differ is the prioritization of those interests. Brave is absolutely like, we think there's something super duper wrong, like kind of fundamentally wrong.Yeah. Maybe that's too strong, but the web has really gone sideways and the privacy violations are endemic and really horrible. And like intolerable. I think other people would say, yes, privacy violations are bad, but also we want to make sure that we don't break the ecosystem that exists to fund the web as it exists today.And so that's like privacy is just one among many different interests, including making sure advertising dollars self fund websites and things like that. And then I think there's other people who exist in other parts of that, on that spectrum and have different interests.So I think we're all pushing the web in the same direction, are interested in making sure it flourishes, but what flourishing means probably differs between different people in different organizations.Jeremy: [00:29:58] Something that sometimes comes up and maybe it's a little more front of mind because Apple's worldwide developer conference happened recently is that people have a perception of Safari not implementing a lot of features that other browser vendors either implement or want to implement. 
I think a lot of times they say that they're doing it in the name of privacy. And on the other hand, you have developers who are saying, Oh, we want all of these different features because we want to be able to build progressive web applications. We want to be able to build websites that are similar to apps. And I wonder from your perspective, how do you balance these two goals? Pete: [00:30:40] So I think that's a really interesting example you brought up for a couple of reasons. I bet we're thinking about the same tweet that went around and the same people blowing off steam, and I can totally understand their frustrations. But I should say two things first before going into the guts of your question. One is that most of the things are not standards, they are proposals. And so, as much as the web community, we like to treat them as standards because they're implemented in the popular browser. They are not standards, nobody's agreed to them, blah, blah, blah. They are proposals. Second thing is that, I think there was 16 or 17 or 18 different things on that list. I don't remember the full thing, but I remember looking through it and thinking Brave takes the additional step of removing these things from Chromium before we ship Brave. I am completely sympathetic to the idea in the vast majority of those cases, maybe all those cases, I just don't remember the full list, that those are really privacy risking features. And the permissions models around them are not well-defined. They haven't been well reviewed and the risk is really significant. Look, Apple's got more money than anybody knows what to do with. Apple's not failing to implement them because they're lazy. They may be pursuing a different strategy. But I also know that the people in those committees have sincere, strong, heartfelt interest in privacy. 
So I understand the frustration of the web community, but I find the privacy story there compelling.Jeremy: [00:31:58] And I think it's also maybe important to think about the fact that as soon as you put those into the browsers, it's going to be extremely difficult to remove them.Pete: [00:32:08] Yeah. I mean the web congeals around any features that gets there. And the moment you put something in it becomes extremely difficult to pull it out. Something that we deal with at Brave a lot, because we think that the way a lot of APIs work is inappropriate and intolerable and we have to be very clever in the kinds of ways we can modify behavior that websites already expect to exist in a certain way.Jeremy: [00:32:32] I think I know about the tweet you're referring to, and I don't remember all the specific features but, I wonder from your perspective, are these features that you think shouldn't exist or is it more that the way that people want to implement them now wouldn't be done in a privacy conscious way.Pete: [00:32:50] Hmm, that's a good question. So I also don't remember the full list, but I can pull off some examples. I think there's kind of three tiers. Some things just seemed like bad ideas that we should just not do, or at least not do without pretty fundamentally rethinking how they exist.Some of these things are things because they make more sense as operating system features or native app features than they do websites. And some of these things, are things that, yeah maybe those would actually be very useful on the web. If we could figure out how to do them responsibly.A lot of this stuff has its roots not in things that typical websites need to do, but like the union of a bunch of weird things that happened. One is like Firefox OS happened for awhile. And so a bunch of things got pushed into the web platform some of which got yanked out later, ChromeOS is another one, PWAs, things like this. 
And a lot of these things are really different from what we think about as websites. It's worth thinking about where are those lines and should they be firm or that sort of thing. A while ago, the example that sticks out of my head is there was a standard that got shipped in Gecko in Firefox and Chrome that allows websites to read the amount of ambient light in the room. The website could read it's very bright, it's very low, I don't remember the granularity, but any step in between. And of course the very first place this stuff got used was in tracking scripts to fingerprint people. Same with the battery API, there was an API that allowed websites to say, it's a full battery, low battery, that sort of thing.You can imagine why that'd be a nice feature in an app. But you can also imagine it gets sucked into the fingerprinting scripts immediately and starts harming and targeting people. And so yeah there's definitely a part of the web that says let's just permission prompt everything, or use a number of different kinds of proposals that concerned me to restrict this stuff or allow it on the web in a responsible way. The web as it is without adding more functionality has so many deep privacy issues. I feel very nervous about pushing for new functionality unless privacy is really treated as a first class citizen in those standards.Jeremy: [00:34:56] Yeah. And it sounds like where we're at now.. There's already so many different ways that you can be fingerprinted. And every time a new feature is added to the browser, it just gets more and more easy to track someone. Pete: [00:35:11] Yup. I think that's exactly right. And there's also cases where adding a new feature undoes a privacy protection somebody else has added in somewhere else. 
It's good to be very cautious before throwing new powerful features into the platform. Jeremy: [00:35:23] Another thing that you had mentioned when we first talked about doing this interview was you had said that Brave is based on Chromium. And you said that you had a somewhat semi adversarial relationship with upstream Chromium. I wonder if you could elaborate on that? Pete: [00:35:44] Sure, that was kind of a silly, goofball way to put it. That states things too strongly. The Chromium developers have been very receptive to questions that we have. And when we've tried to upstream stuff, we've found it to be a positive experience. But the vast majority of Chrome developers are Google employees. And of course Chromium is shipped in a lot of ways with Chrome in mind. I don't think it's a malicious thing, but it is the case and so there's a whole lot of stuff in the Chromium code base that assumes Google. Which servers get talked to and account information stuff, and safe browsing things, and an incredibly long list of stuff that is just in the Chromium code base but assumes Google. And maybe this is what I had in mind when I said adversarial. Poor choice of words. There's a couple of features that Chrome ships that allow you to basically enable a feature only on certain origins and they call it field trials, this kind of thing. So if the Chromium folks want to test out any feature they can say only these three or four or five or whatever partner websites can use it. Sometimes they'll ship a feature ungated, not flagged. And then they'll use this field trial mechanism to turn it off. So they'll ship some new experimental feature and then they'll say, but we're not gonna allow it on any sites. The field trials is zero or is empty. And so that's their way of making sure that sites don't get it. 
Well, if you're building a browser that wants to put firm lines between itself and Google data collection servers, you don't get that information. And so now all of a sudden, the weirdo experimental feature is enabled globally in Chrome or in your version of Chromium. A long list of things like that. There are also other choices in the platform that make certain things that we would like to do difficult. I could go into those examples if it's of interest. I don't think that's adversarial; that was a silly choice of words. But it does mean that there's different interests being pursued in the code base that are not always Brave's. It's not always as privacy focused as Brave would like. Jeremy: [00:37:40] I'm not sure if you would have an answer to this, but when Brave was deciding what rendering engine to use, whether that's Chromium's Blink or WebKit or something else, why make the decision to use Chromium as a base? Pete: [00:37:57] So this predates me at the company, so I can only think through some of these things. I don't want to say something I'm not sure about. The early Brave folks considered a bunch of different engines, and Brave started as an Electron app. So basically when there was an extremely small number of developers at the company and it was extremely early days, everything was done on top of stock Chromium. It allowed the company to iterate really quickly and try a bunch of new things and do some of the kinds of things that it knew it wanted to do that were easier to do at that level than trying to maintain a large patch set and this kind of stuff against Chromium. And there's probably some path dependency on that. We're no longer an Electron app. We're a proper Chromium project. That's part of it. I don't know the particulars of why Electron was selected and not a Gecko option or not a WebKit option. 
I couldn't say exactly what tipped the scale on one versus the other. Jeremy: [00:38:46] Something you mentioned was that private mode or incognito might be something interesting to talk about, so could you elaborate on what you were thinking there? Pete: [00:38:55] The battle of what private browsing mode or incognito mode is and what it's supposed to do... I think nobody has a single story for what it actually is supposed to be. In some browsers that basically means your storage doesn't persist after you close the browser. And that's all it means. The browser operates exactly the same way. Local storage operates the same way, et cetera, et cetera, except you have a separate cookie jar and a separate set of state that goes away when you close all your private browsing windows. That was for a long time the textbook definition, or whatever was agreed on. But you can see over time in standards bodies and in implementations, I think there's been some recognition that users have a different understanding, or at least some users have a different expectation of what private means. And it can connote something beyond just the state going away. And so there's been a slow drip of new privacy features into private browsing windows in major browsers. So in Firefox, by default, if you open a private browsing window you're in strict rather than default mode for Enhanced Tracking Protection, and it does slightly different things. Chrome changes the operation of some APIs that allow you to query your quota on storage, to prevent sites from detecting whether you're in private browsing mode, et cetera, et cetera, things like this. 
But I think it's interesting because it seems like a recognition that users want more privacy in a machine and are desperate for whatever buttons are in front of them, even if what guarantees are being made by those buttons aren't totally clear. Jeremy: [00:40:26] Yeah, that's a good point, because when I think of private mode or incognito mode, I think of your first example where it just means that it's going to clear whatever was stored on the computer, like cookies or your history, things like that. And what you're saying is that now the opinions have shifted to where maybe private mode should be blocking trackers or maybe it should be... I think the example you gave was preventing sites from finding out certain things about your computer or your browser. That's a perspective that I didn't realize people had, but that makes a lot of sense. Pete: [00:41:05] And maybe this is a positive thing. It's become a little bit, eh, I'm not sure that's true. My impression, and I wouldn't go to war over it, is that it's a little bit of a testing ground for people to say, we know fewer people use private browsing mode than the typical mode. So we can be slightly more experimental in the kinds of privacy-related features we test out in private browsing mode. If that's the case, then it means more and more stuff gets turned on by default over the medium term. I think it's probably a good thing for the web. Jeremy: [00:41:34] One of the things that you had touched on earlier was that when you're trying to preserve privacy, when you have features that are blocking certain things that could be used to track you or blocking certain features in the browser, one of the side effects is that it can break websites. What are some common examples of where that can happen, and how are Brave, and browser vendors in general, trying to work around that? Pete: [00:42:04] Sure. 
So the most common or goofball way that can happen is, say you're using some ad blocker, you're pulling in some filter list, and it says you should delete everything that says "ad" in the URL or whatever, right, /ad/, something like that. Some website for whatever reason has something that's not an ad at a URL like that. Now you're blocking something you don't intend. And there might be a script the page depends on for its execution. And given the size of these filter lists, given that you could easily be considering hundreds of thousands, maybe even 200,000 rules if you're using a tool like Brave or uBlock Origin or something like that, the possibility for false positives is very high. So that's the simplest case that can happen. But then it gets more complicated. Brave blocks third party storage by default; there's an extremely small number of exceptions that we make to unbreak websites. But by default we just block all third party storage. So if you're in an iframe, you don't get to store stuff; you don't get cookies if you're a third party request, stuff like that. And in the vast majority of cases that works just fine. People don't usually care about the stuff that's going on in iframes on a page, and when they do, it doesn't usually need to touch storage, but you can imagine some places that'll break. Someone embeds a video and that video wants to store its state or something like that. That requires some cleverness in dealing with it. And then just a third example: when COVID started becoming a popular concern, people wanted to look at maps and see where COVID was spreading. And so these sites would usually render these maps either via SVG or via canvas operations, and Brave by default (no longer, but at the time) was blocking certain canvas operations and SVG operations because we knew they were being used by fingerprinters. 
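The first breakage case above, a broad filter rule catching a script the page depends on, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Brave or uBlock Origin match rules: the two-entry rule list, the plain substring matching, and the example.com URLs are all made up for the example, and real lists such as EasyList use a much richer syntax.

```python
from urllib.parse import urlparse

# Hypothetical, heavily simplified rules: plain substrings matched against
# the URL path. This is an assumption for illustration only.
BLOCK_SUBSTRINGS = ["/ad/", "/ads/"]


def is_blocked(url: str) -> bool:
    """Return True if any rule substring occurs in the URL's path."""
    path = urlparse(url).path
    return any(rule in path for rule in BLOCK_SUBSTRINGS)


requests = [
    "https://cdn.example.com/ad/banner.js",     # an actual ad script
    "https://shop.example.com/ad/checkout.js",  # not an ad, but the path matches
    "https://shop.example.com/static/app.js",   # unrelated script
]

for url in requests:
    print(("BLOCK " if is_blocked(url) else "ALLOW ") + url)
```

The second URL is the false positive: nothing about it is advertising, but the rule cannot tell, and a page that needs that script breaks.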
And all three of those cases have privacy protections that ended up breaking things that, at least in these cases, are not privacy harming. Probably even more of my job than doing privacy stuff at Brave is figuring out how to do that privacy stuff in a web compatible way, or how to break fewer websites so people can use Brave without having to drop shields and drop those protections. And so each of those different things warrants a different response. So one has been to adopt a strategy that the uBlock Origin project takes. The uBlock Origin project is fantastic and all credit to those folks. That project is really fantastic work. Instead of just guessing, yes, I allow the resource, or no, I block it, they'll also sometimes say, replace it with some different thing that maintains the API signatures but actually nulls out the tracking behavior. And so that's been a really useful approach for unbreaking websites, if we can figure out what they expect, like the functions they expect to be in place, but replace them with less painful stuff. And I can talk afterwards, if it's of interest, about a research project we're doing over the summer to leverage this with Michael Smith, a student who's visiting from UCSD. Jeremy: [00:44:56] Are you replacing something in the JavaScript code that's running, or are you replacing some browser API that it is trying to get access to? Pete: [00:45:06] Sometimes both. So in the simplest cases, something like Google Analytics provides some functions or triggers some events on load. And if you block Google Analytics, it means some things will never load. And so instead of blocking Google Analytics, you say, here's the request for Google Analytics. Instead, I'm going to return this thing that does nothing but trigger the load event, and it doesn't actually touch the network or anything like that. And so you're replacing the resource instead of requesting it. 
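The three-way decision described here (allow, block, or replace with a harmless stand-in) can be sketched as follows. This is a minimal sketch, not Brave's or uBlock Origin's code: the `decide` function, the two lookup tables, the example hostnames, and the stub's contents are all hypothetical names invented for the illustration.

```python
from enum import Enum
from typing import Optional, Tuple


class Action(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    REPLACE = "replace"


# Hypothetical stand-in script: it keeps the function name a page might
# call, but the function does nothing and never touches the network.
ANALYTICS_STUB = "window.ga = function () {};"

# Hypothetical tables, keyed by host + path (scheme stripped).
REPLACEMENTS = {"analytics.example.com/analytics.js": ANALYTICS_STUB}
BLOCKED = {"tracker.example.net/pixel.js"}


def decide(url: str) -> Tuple[Action, Optional[str]]:
    """Return the action for a request, plus a replacement body if any."""
    key = url.split("://", 1)[-1]
    if key in REPLACEMENTS:
        return Action.REPLACE, REPLACEMENTS[key]
    if key in BLOCKED:
        return Action.BLOCK, None
    return Action.ALLOW, None


action, body = decide("https://analytics.example.com/analytics.js")
print(action.value, "->", body)
```

Checking the replacement table before the block list is a design choice in this sketch: a resource that has a known safe stand-in is served locally rather than dropped, so code that expects the script's functions to exist keeps working.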
But you might also do things like, I see some code that does something nasty and it's inline, so I don't get a chance to modify the request. I see it's inlined, so I want to somehow modify its behavior. And so I'm going to... I mean, sometimes this stuff gets really gross, but I'm going to, say, overwrite some structure the page expects to be there. I'm going to throw a stack trace. I'm going to look up and see if I'm in the inline code. If I am, I'm going to take path A, and otherwise I'm going to take path B, all these kinds of gross things. The web is a messy place and there's a whole bunch of tricks like that that have to get pulled. So we pull a bunch of that stuff from the uBlock Origin project, and we generate some of our own for fingerprinting stuff. And this is something we've been able to pull from research that I'm really proud of us shipping, or I'm really glad about: in the same sort of way that uBlock Origin said it shouldn't just be yes or no, we should have some middle road that allows us to be more clever, we've taken the same approach to fingerprinting protection. So instead of just saying yes, it's allowed, or no, the API goes away, we now do something we call farbling, where we break the assumption that websites have that the features are going to operate in a fixed way across browsers, by adding a little bit of noise to the API response. So if you're doing some canvas operations, we'll with very low probability modify a pixel here or there, or flip a bit, like the lowest bit in the color channel for a pixel, that kind of thing. So instead of just blocking the API to protect people, we can instead have this more web compatible way where all the APIs still work, but we remove their identifiability by having them always do something different between sites, between sessions. There's also something that we're working on right now. And we actually are working with a student from North Carolina, who's prototyping this for us over the summer. 
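The canvas farbling idea described above, flipping the lowest bit of a color channel with low probability so that results differ between sites and between sessions, can be sketched like this. It is a rough illustration, not Brave's implementation: the `farble_pixels` name, the HMAC-based seeding, the 1% default probability, and the pixel format are all assumptions made for the example.

```python
import hashlib
import hmac
import random


def farble_pixels(pixels, session_key: bytes, site: str,
                  flip_probability: float = 0.01):
    """Return a copy of `pixels` (a list of (r, g, b, a) tuples) in which
    the lowest bit of a few color channels has been flipped.

    The noise is seeded from the session key and the site, so it is stable
    for one site within one session and different otherwise. That seeding
    scheme is an assumption of this sketch.
    """
    digest = hmac.new(session_key, site.encode("utf-8"), hashlib.sha256).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    farbled = []
    for r, g, b, a in pixels:
        channels = [r, g, b]
        for i in range(3):
            if rng.random() < flip_probability:
                channels[i] ^= 1  # flip only the least significant bit
        farbled.append((channels[0], channels[1], channels[2], a))
    return farbled


image = [(10, 20, 30, 255)] * 1000
first = farble_pixels(image, b"session-1", "example.com", 0.05)
again = farble_pixels(image, b"session-1", "example.com", 0.05)
print("stable within site and session:", first == again)
```

Because each change is a single low-order bit, the image looks the same to a person, while a script that hashes the pixel data gets a different value on another site or in another session.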
This is another research intern named Jordan Stock who's doing great stuff. We're looking into a third option for local storage in remote frames. So instead of a frame either, yes, getting storage or, no, not getting storage, we want this middle option where we can say the frame gets what looks like normal storage for the execution of the page, but by the time the top frame is closed, that storage goes away. A lot of this stuff, like the web compatibility game, is figuring out a bunch of ways of breaking the binary choice and figuring out ways of sneaking more cleverness into the platform. Jeremy: [00:47:43] So when you're referring to a frame and the local storage going away, could you kind of elaborate what you mean by that? Pete: [00:47:50] Oh, sure. So I'm on a website, like a typical website. You have your one frame, which is just this document object. And there's a bunch of DOM structure that hangs off of that. But one of those things hanging off it might be an iframe, which is itself its own contained document structure. And that can be recursed; it can happen infinitely deep. And so this is usually referred to as the first party and the third party, or the local frame and the remote frame. There's some overloading of terms, because in some browsers remote frames are also remote processes, in the way that an operating system understands. But typically, yeah, the local frame is a frame that has the same eTLD+1, which means effective top level domain plus one, which is the level of domain that you can register if you go to Hover or whatever. And so all the frames that have the same eTLD+1 as the top frame are local frames; anything else is a remote frame or a third party frame. And so some browsers will use this as a heuristic for saying local frames the user trusts. 
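The local versus remote frame test described above, comparing each frame's eTLD+1 with the top frame's, can be sketched as below. This is a simplified illustration: real browsers consult the full Public Suffix List, whereas the five-entry `PUBLIC_SUFFIXES` set here is a stand-in, and the function names and example URLs are assumptions made for the example.

```python
from urllib.parse import urlparse

# Tiny hypothetical stand-in for the Public Suffix List.
PUBLIC_SUFFIXES = {"com", "org", "net", "uk", "co.uk"}


def etld_plus_one(hostname: str) -> str:
    """Return the registrable domain: the public suffix plus one label."""
    labels = hostname.lower().rstrip(".").split(".")
    # Try the longest candidate suffix first so "co.uk" wins over "uk".
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            if i == 0:
                raise ValueError("hostname is itself a public suffix")
            return ".".join(labels[i - 1:])
    raise ValueError("no known public suffix in " + hostname)


def is_remote_frame(top_url: str, frame_url: str) -> bool:
    """A frame is remote (third party) if its eTLD+1 differs from the top frame's."""
    top_host = urlparse(top_url).hostname or ""
    frame_host = urlparse(frame_url).hostname or ""
    return etld_plus_one(top_host) != etld_plus_one(frame_host)


top = "https://www.example.com/article"
print(is_remote_frame(top, "https://cdn.example.com/widget"))       # same eTLD+1
print(is_remote_frame(top, "https://player.video-host.net/embed"))  # different eTLD+1
```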
And so I'm gonna allow it to store cookies and local storage and this kind of thing. Remote frames deserve less trust. And so I'm going to block storage or I'm going to partition storage or I'm going to do something possibly clever with storage. Not all browsers do that, but it's increasingly common. Jeremy: [00:49:05] I see, and I think you were explaining how you could have, let's say, an embedded iframe and it could use browser local storage, but maybe as soon as you click to another page, then that local storage goes away. Is that kind of what you were... Pete: [00:49:22] Yeah. So that's the approach Brave is taking. So there's another privacy group in the W3C called the Privacy Community Group, which is kind of like the sibling group to the group I co-chair. So I co-chair the review group that reads everybody else's specs and tries to improve the privacy of what other organizations or what other working groups are working on. Privacy CG is where browser vendors go to introduce new features. And so Brave is involved in both. There's a lot of overlap between the two. Jeremy: [00:49:50] Earlier you were talking about how people could be fingerprinted. They could be identified by seeing how things render, whether that's on a canvas or SVG, and the way that you were dealing with it, which I found interesting, is it sounded like you were adding additional information. So your video card might render something a certain way, but then you would add additional things that would make it render differently than the video card normally would. And that's how you would remove that as an identifying factor. I wonder also, you were mentioning how you had a research project at UCSD and I didn't quite catch what exactly that was. Pete: [00:50:35] Yeah. So in that order, the first one, the adding information part, this approach came out of two research papers a while back. 
One is a paper called PriVaricator led by my current boss Ben Livshits, who's a professor at Imperial in London. And the second paper was a paper called FPRandom, fingerprint random. Both of those introduced this technique or played with it. Brave is the first one who's productized it or included it in a popular shipping browser. But yeah, the approach is to break the assumption that there's something unique about this browser that I can identify across sites. And so we randomize some of these features, or we add an extremely subtle amount of noise that'll confuse fingerprinters but be indistinguishable to users. We do it in a way that's random, but deterministic under each first party, under each session. So you close the browser, you get a new fingerprint, and if you go to a new site, you get a new fingerprint. And so that prevents things from being linked. So it's been a nice way of taking academic research and figuring out a way to use it for a shipping privacy protection. Jeremy: [00:51:36] Cool. So that was something that, in your role at Brave, or I guess Brave as a company, decided was something to look into from a research perspective. And then because the research went well, you're able to move that over to the product side. Pete: [00:51:50] Oh, well, I wish that was the case. I mean, these are papers that pre-existed Brave. It was a situation where we knew we had a problem. Most sites were breaking because of our fingerprinting protections. We didn't want to leave people less protected. And so research was one place we could start digging for a solution. So you asked about the project at UCSD this summer. There's a student who's visiting, Michael Smith; he's a fantastic student and a fantastic hacker. And so his project is, I mentioned before the way uBO, uBlock Origin, does these resource replacements. And so as you might imagine, these things are very difficult to generate. 
They take a lot of stepping into the debugger and manually figuring out how these large JavaScript blobs operate, particularly what subset of the functionality you need to maintain to unbreak the pages. It's extremely tedious and it doesn't scale well. And so the approach that Michael and I are working on (Michael's doing the hard work) is to see if we can automatically generate these things through a combination of browser instrumentation, a system we call page graph, which allows you to deterministically, offline, see the interaction of different elements of a page; AST analysis, where AST is the abstract syntax tree, a parsing step in executing JavaScript or parsing any language; and then code rewriting, to identify the parts that are privacy harming and rewrite just those parts. And then we can programmatically generate these privacy preserving resource replacements in a way that can be automated instead of requiring the heroic amount of manual intervention that they currently do. Jeremy: [00:53:23] So if I understand correctly, currently when you use something like uBlock Origin and you go to a website, and let's say that website loads a script that has privacy implications, has some issues with tracking, but the behavior is still needed for that website to work, uBlock Origin will replace parts of the JavaScript source code so that the site still works, but it blocks whatever kind of tracking behavior it was going to have. Is that correct? Pete: [00:53:53] Yeah, though it's not that it fetches the resource and then does some rewriting on the fly. It just preloads, like, this is the privacy preserving version of the Google Analytics script, this kind of thing. Brave does the same thing. By the way, we had someone who worked with us as an intern last summer, Anton, who's now a full time employee at Brave and is phenomenal. But yeah, Brave does the exact same thing out of the box. 
So we preload all the same resource replacements and are generating our own, and do this in the same way. Jeremy: [00:54:23] And then in the research project that you're currently working on, the goal is for the browser to be able to load these third party scripts and on the fly figure out if there's something that should be blocked or changed in the script. Did I get that right? Pete: [00:54:41] That's mostly it, though it's slightly different than that. And the reason it's different is that JavaScript, because it's so dynamic, is difficult to statically analyze. You have to execute it and see what it actually does in a lot of cases to deal with all sorts of corner cases or all sorts of aspects of the language, because things can get aliased and functions get bound and there's dynamic code execution through evals and stuff like that. And so the difficulty there is, you hand me some JavaScript, and I can't reason about it in a fundamental way and say these are the seven places where it's going to write a cookie or do a network request or touch your local storage or whatever. So that's one problem there. And the way we solve that is we have this heavily modified version of Brave that we call page graph, or that includes a feature we call page graph, that allows us to, among other things, say, okay, these are the 18 parts of the JavaScript code that actually ended up touching local storage or doing a network request or whatever else. And so we use that for de-aliasing the values in the JavaScript. Then offline, once we have those, we can programmatically rewrite the code by analyzing where those places are and replacing those lines of code or those chunks of the file with privacy preserving alternatives. And then at that point, we have our resource replacement automatically. 
So the process is offline, in that we crawl the web and we generate a whole bunch of these things beforehand, so that we can preload them in the Brave browser or share them with the uBlock Origin project or anybody else. But the appealing thing is, if we do all this work over the summer and this research project is successful, which I think it will be, we have a way of automatically doing the stuff that before would take an extreme, a pretty awesome amount of manual labor to do. Jeremy: [00:56:28] And so it sounds like you have this special version of the Brave browser and you could automate it to visit a bunch of websites, pull all the scripts, and see what they do to the page. And then it basically gives you a list of, hey, these are all the scripts that we think have issues, and we saw what they did, and this is the part we need to remove or change. And then you can ship that to users, either in uBlock Origin or in the Brave browser itself. Pete: [00:56:57] Yeah. That's exactly right. And so most of that stuff already exists through fantastic tools that people like Google have made. Puppeteer is a really fantastic system that Google has made that allows you to automate browsers and interact with sites and understand what browsers are doing. I mean, it's phenomenal, but it also doesn't answer all your questions. It's very difficult using Puppeteer, or using any system, to understand that this script modified this file, and that file then requested this image, and that image, whatever, you know, these complicated chains of interaction. That's extremely difficult to understand online in Puppeteer, or particularly after the fact, just looking at the end result of the page. 
And so page graph is this system that allows us to, with extremely high fidelity, trace every single one of these operations in the page and then stitch them together in a graph, in the sense of edges and nodes, not a chart in a PDF. Jeremy: [00:57:49] Yeah, I think that's really interesting because I know in one of your other papers or presentations you have talked about EasyList, which is the list of trackers and ads that uBlock Origin and a lot of other systems use to decide what to block. And that sounds like a very time intensive process, where you have all these different people that are visiting sites themselves and figuring out, oh, these are the things that should be blocked. Whereas with your research now, it would be more like we could have just the computer go and browse the internet for us, figure out what needs to be blocked, and save a lot of time in the future in terms of figuring out what we need to block. Pete: [00:58:33] Yeah, I think that's true. Although two complications there. And first I want to say that EasyList is a fantastic project, and there's a bunch of related child projects. So there's EasyList. There's EasyPrivacy. There's a bunch of region specific EasyLists, this sort of thing. And one of the core maintainers of EasyList, who goes by the online handle Fanboy, is part of our team at Brave. He's fantastic. He's a full time Brave employee and his job at Brave is to maintain filter lists, both for Brave but also to benefit the larger community. And so things like EasyList are, on one hand, phenomenal. I think people just completely underappreciate that there are four core maintainers of EasyList. And without these four people doing the things they do, the web would be an infinitely more miserable place. 
And the fact that the web hangs on the evening, after hours jobs that these people have, at least until recently when they started being supported commercially, is totally underappreciated and fantastic. Those lists are also deeply imperfect. They're full of heuristics, like the /ad/ example. There's lots of heuristics, there's lots of stuff that gets broken. And there's lots of, in quotation marks, dead weight, or rules that were useful five years ago, but now it's very difficult to know if they're still useful given the size of the web. And so it tends to just amass rules over time. None of that is a criticism of the maintainers, who are fantastic, or the community around them that contributes to the lists; it's just the nature of the beast. Brave's approach, and some other researchers' approach, has been to ask: can we use these labels that these people have generated as high confidence things to start reasoning about the rest of the web? So it wouldn't be a replacement for EasyList. You still need some human in the cycle somewhere to make some of these assessments, but can we force multiply what that person is able to do through automation or machinery or machine learning or, you know, different types of tooling? Jeremy: [01:00:24] Yeah, that makes a lot of sense. I thought it was really surprising how few people were maintaining such a gigantic list. I think you had said there were something like 300,000 entries, or I don't remember how many entries were on EasyList, but it... Pete: [01:00:41] I think it's around 75, or I haven't looked recently. I know that Ryan's been doing some cleanup, but close to a hundred thousand in just EasyList. I think it's 70 something. And then there's EasyPrivacy and there's, you know, a long number of other lists too. 
So yeah, I couldn't tell you the concatenated size, but large, very, very large. Jeremy: [01:00:58] So you've been centered on the privacy side and the tracking side. And I wonder in your work if you had any visibility into the people who kind of want all these things to happen, like the advertisers that want to be able to do the tracking. Has this sort of tracking actually been really effective for them? And on the flip side, I wonder how many of these ads are even being seen by real people? Could there be ad fraud going on, in terms of computers just looking at these ads, and we're not the ones looking at these ads? Pete: [01:01:37] Yeah. We're now stepping somewhat out of my area of, in quadruple ironic quotation marks, expertise. But I can only share what I know, or my impression from doing what I do. Which is that yeah, absolutely, fraud is completely endemic, to the degree that people have no idea how much there is, but numbers that I've seen for online ad fraud are anywhere from 10 to 50%. These are not numbers you should hold me to, but they're simply to convey the magnitude of the problem. It's enormous, and the number of middle players that are in these markets makes it extremely difficult for any one party to understand what's going on. There's a phrase that gets thrown around called the Lumascape. And after this call, I can try to find you an image, but it's this kind of 18 step deep flow chart of how advertising markets worked. And that was five or six or seven years ago when that image was made. But yeah, they're extremely dense. And the vast majority of players in the markets, you wouldn't recognize their name. Nobody would recognize their name unless you were an employee of that company. So, yeah, ad fraud is an enormous problem. And it doesn't seem like it's going to get better anytime soon. Like, this system seems like it's definitely on its way out and kind of getting worse. 
One thing that's really neat about Brave is that a number of the people who work at Brave have histories in ad tech markets and playfully have said they're repenting... their work at Brave is their apology for what they did earlier. One person who works in BizDev, Luke, is just a phenomenal dude and incredible at what he does, but he used to work in doing this kind of stuff, in terms of helping to build tracking systems and understanding how they work. And now, yeah, Luke is fantastic. Johnny Ryan is somebody who does policy work at Brave, too. He used to work at PageFair. He talks a lot to enforcers, people on the political side who do CCPA and GDPR kinds of things, to make sure that regulators are actually enforcing these things. And his sense is that the amount of fraud and the amount of tracking is just unimaginable. And so, yeah, the problem is well established. In terms of whether it's actually profitable, I'm sure that's very deeply debated. So I know that Google has some numbers that have said that if you remove the behavioral component from tracking, you need to do just contextual ads, so ads that know where they appear, but not who's looking at them. Their numbers suggest that the profits drop by something like 50%. I don't remember the exact numbers, but something on that magnitude. I know some people are extremely skeptical of those numbers. 
And of course Google is not an unbiased actor, but those are the numbers that they've shared. And I know that, on the other hand, there are numbers that get pointed at that say the amount gained by marketers and people who are placing ads is negligible to negative when you remove the behavioral component, because there's so much fraud in the market that behavioral tracking actually ends up having a negative return. So all that is to say, I deeply don't know. I know that the system relies on things that seem abhorrent to me, but there's a diversity of opinions on whether it's actually useful, or useful for what it claims to do. Jeremy: [01:04:45] From site to site, the effectiveness of the ads you see, how relevant they are, can vary really wildly, right? And we're never really sure why certain ads are being shown to us. You know, the example that a lot of people will give is Amazon, right? Where you buy something and then all of the suggested items are for the thing you bought, and people kind of joke, oh, you know, this targeting isn't very good. But on the other hand, you have platforms like Instagram, where I've heard that the advertisements are actually very effective. They tend to show people things that they actually might be interested in buying, and they actually go through and click. But it's interesting because, like I was saying, I don't know why some things seem effective and why some things don't. It could be that they have tons of tracking information and they still do a bad job of what they show to you. Pete: [01:05:44] I have the same uncertainty about this stuff. I imagine that, I mean, I shouldn't hazard a guess. I honestly don't know the usefulness of these things. I'm really dubious, or I'm really uncertain about it. 
I doubt it, but I couldn't say confidently that it's definitely not the case. And I've heard the same kind of success stories and the same kind of, you know, catastrophe stories too. Two things here. One is, and this might be of interest, there's this kind of famous story of not the success of tracking, but the harm of it. I can send you a link to the story if it's of interest, but there's a famous case of a family getting advertisements from Target. These are paper advertisements from Target. So the family starts getting advertisements sent to them for prenatal kinds of stuff, cribs, this kind of thing. And the father doesn't understand why this is happening, and the parents don't understand what's happening. It turns out that the daughter is pregnant, and has been looking up information about how to care for the expected child. And advertisers knew it before the rest of the family knew it. Anyway. So I guess that's a story where maybe it was effective, but also morally reprehensible. And then the other thing I wanted to say is, this is maybe a chance to describe how Brave does this differently than everybody else does. One thing that I think is neat about Brave is that Brave does two things differently. One is that there is no tracking: no information about you or your browsing ever leaves the device. And so this has two benefits. One is that your device is gonna have a lot more information about you than any third party does, because it sees every website you visit. So it can do a better job of understanding what might actually be useful to you. And second is that Brave flips the incentive structure. So right now on the web, the vast majority of ads are not going to be of interest to you. And all the ads come with all these horrible side effects of hurting your performance, violating your privacy, carrying the risk of malware, et cetera, et cetera, et cetera. 
And so nobody wants to look at it; that's why ad blockers are popular. Brave's approach is different. We'll pay you to look at ads. Brave incentivizes you to look at ads, which gives you a reason to look at ads. It gives marketers a reason to prioritize your attention. It breaks the privacy and performance harm and security risk. And arguably it can provide much better ads than some tracking-based third party does. So I think there's something clever about what Brendan and Brian Bondy came up with in terms of the way that Brave goes about these things compared to how other marketers have.

Jeremy: [01:08:05] We've been talking about how advertisers are using tracking to hopefully show you something that they think you'll be interested in, right? A lot of the research you're doing is to try and prevent a lot of that tracking. So if you do that and you show someone banner ads, how are you going to be able to ensure that those ads are relevant to the person when you can't track them?

Pete: [01:08:29] Two things. One is that Brave never puts an ad in the page. Whenever you see an ad through Brave, it's very clearly not related to the page. It's in a notification, to make sure that we are not putting ads against publishers who don't want them, and for a whole bunch of other reasons, to prevent that kind of brand confusion and all the other kinds of nasty side effects. So Brave doesn't track you, in the sense that your information never leaves your device. If we wanted to, we couldn't learn anything about our users in any capacity like that. No bits hit our servers that describe your browsing behavior. But on the device, the device is constantly learning and saying, oh, it looks like you're looking at shoes. Looks like you're looking at cryptocurrency. 
Looks like you're looking at, you know, airline flights or whatever it may be. And so the device has a great deal of information that might be able to say, maybe you would like to see an ad about shoes, or maybe you would like to see an ad about, you know, vacations or whatever it may be. And so it's not tracking in the sense that nobody's looking at you. It's your own device, seeing what your device already sees. But it does have the kinds of information that seem like they might be able to actually show you stuff that you might want to see. The other thing is you mentioned users understanding why they're getting the ads they're getting, and being able to control it. I mean, I think this is a totally underappreciated concern in almost all of machine learning, where you have these extremely complex, deeply nested structures that arrive at decisions that are completely opaque even to machine learning experts, let alone to typical internet users. People's lives are being guided by these unauditable black boxes. Brave's commitment is that we are committed to allowing people to understand and see the model, to edit the model, to partition the model, to add or remove certain interests. We're not at a threshold yet where it makes sense to do that, but it is a commitment that Brave has made. It is absolutely in the plans and, yeah, I mean, black boxes like that terrify me and Brave is not going to become one of them.

Jeremy: [01:10:29] And you had mentioned how Brave the browser is not going to add ads to a site that doesn't have them. Does that mean that for sites that will have ads, they would have some relationship with Brave where they say, we want to show ads in Brave, and that's what has to happen in order for advertising to show?

Pete: [01:10:50] Right now there's a couple of different types of ads that Brave sends. The main one is notification ads. So, by default you see zero ads. 
You don't see anything, but if you say, yes, I want to start getting paid to look at ads, you can say, show me between one and five ads an hour, and, you know, one to five times an hour you'll get a notification that looks like the same notification you get if you received an email or whatever, and it'll say maybe you're interested in shoes or, you know, whatever it might say. That's the predominant way that you see ads in Brave products. You also sometimes see advertising and get compensated for ads elsewhere. If you open up the new tab page and you haven't disabled it, you may see an ad there, and you similarly get compensated for that. I should say that for Brave ads, the user gets 70% and Brave gets 30%. So it's like the inverse of the Apple App Store, right? Then there's a third tier of ad that Brave has considered but does not ship, and is working through the details on. If we do ship it, it's called publisher ads, and that's when a website could affirmatively say, yes, Brave, please add ads in these locations on my website. We don't do that now. If we ever did do it, it would be only with the affirmative consent of the website. But there's a bunch of difficulties there that have kept us from shipping. It's mostly privacy concerns: we don't want the site hosting the ad to be able to learn about the user based on the ad that Brave places in the site. That would be a way of just really enabling a lot of the same tracking that's happening right now. So we do not do publisher ads right now. We are thinking through ways that we might be able to do it in the future in a privacy-preserving way. But right now the only ads that get shown are notifications and the new tab page.

Jeremy: [01:12:28] I see. So, the new tab page would be something very similar to when you create a new tab in Firefox and they have like a list of suggested sites. Something like that.

Pete: [01:12:39] Yeah. 
So right now, like, Chipotle advertised with Brave for a while. And so I think it was like one out of three times you opened up the new tab page, it would have, you know, an image of a delicious burrito or whatever, and in the lower right hand corner it would say Chipotle or whatever. Like, attractive images. It's not executing code. It's not doing animations. It looks attractive, that kind of thing. But it's just an image. And if you don't like it, you can turn it off. But if you like it, then Brave will pay you to look at it.

Jeremy: [01:13:01] Interesting. Yeah, it sort of reminds me a little bit of, back in the past, there were desktop applications that people could install and I think they paid you. I don't remember if it was to click on the ads or just to see the ads. And this sort of sounds like a bit of a modern version of that.

Pete: [01:13:20] I think that's true. Although, I mean, a bunch of things distinguish it. One is that the Bonzi Buddies of the world ended up becoming malware vectors. Two is that they didn't have the kind of information that would be useful to actually send you the kinds of ads you were interested in. They just pulled from a stock catalog. They were extremely obtrusive. I'm not aware of any that paid you actual money, or a significant amount of money. I mean, they might've existed. I couldn't say that they don't. One of the things that I want to say though, that is exciting to me about the Brave model, is that there is a sincere, honest question about how content on the web gets funded. And advertising is not the only way, but it's a significant part of how content on the web is funded currently. Brave's approach is different. Right now, if you enable ads, by default that money goes to the websites you visit. I have mine configured to show me five ads an hour. 
And at the end of every month, Brave keeps track, on the browser, not on the server, of the distribution of your viewing time across the sites that you visit. And by default, Brave will just send your ad earnings to those sites. The sites that are involved, that are verified sites, get revenue very similar to, if not greater than, the revenue that they would get for you looking at an ad that was an iframe on their page, but without the privacy harm, without all the nasty side effects. So I think this can be a really powerful way of funding the open web, but without all the horrible stuff that comes with it currently.

Jeremy: [01:14:49] Yeah. I mean, I think that what you're seeing with a lot of news publications, and even just people doing blogs and things like that, is a lot of people are moving towards a subscription model, right? Where you pay me five bucks a month and you can see my articles. And I think what's tricky is that, you know, the web is so broad, right? You visit so many different sites a day. And so it's hard to imagine paying a monthly fee for every single site you visit. And yes, I'll be interested to see how that kind of model works out in the future.

Pete: [01:15:26] Yeah. And I should say too that it's all in exploratory stages, but an idea that Brave is considering and may prototype at some point is, can we have some sort of system where, if someone opts into the Brave system, then Brave can be the way that you just automatically pass through those paywalls. Using the cryptocurrency, you could pay in Brave as a way of saying, I don't want to have a subscription to a million different sites. If I'm in Brave, then I just automatically do these invisible microtransactions to fund the sites that I'm viewing. I think there's something compelling about that.

Jeremy: [01:16:02] Yeah, for sure. 
Everybody loves complaining about paywalls.

Pete: [01:16:07] Yeah, no joke.

Jeremy: [01:16:09] Cool. Well, I think that's a good place to start wrapping up. Is there anything else you think I should have asked or you wanted to mention?

Pete: [01:16:16] Nothing else comes to mind. I think this has been really enjoyable. Well, actually, I can say two things. One is that, if any of your listeners are interested in privacy and web standards, it's a forum that could absolutely use more voices and more people, a greater diversity of opinion than people who work at browser vendors or ad tech companies. And so if any of your listeners are interested in those sorts of things, I would encourage them to get involved. They can send me a message or they can just go to the issues themselves, but it would be fantastic to have more people involved there. And the second one is, I imagine that a large portion of your listeners are people who write software for a living or who are considering careers in writing software for a living. And, a little bit soapboxy, but that's a powerful thing and a privileged position for many people. And it's worth thinking really, really well through the morality of the kinds of causes you're spending your nine to five supporting.

Jeremy: [01:17:06] And if people want to see what you're working on currently, where can they check out what you're doing?

Pete: [01:17:12] Ah, so I have a website called peteresnyder.com where I have my publications and my research interests. A lot of the publications I work on at Brave get published at brave/research. I write pretty regularly for the Brave blog about new privacy features that are coming out in Brave. And I will be writing an additional set of articles on the Brave blog about standards work and the direction of privacy in web standards. Also, I'm on Twitter at PES10k.

Jeremy: [01:17:41] Cool. 
I think you gave everyone a lot to think about in terms of privacy and in terms of what's going on in their browsers. So thank you so much for talking to me today.

Pete: [01:17:50] Thank you very much, Jeremy. This has been super fun. I appreciate it.

Jeremy: [01:17:53] Thank you for listening to my chat with Pete. You can get show notes and a transcript for this episode at softwaresessions.com. The music in this episode was by Crystal Cola. If you enjoyed the show, make sure to tell someone else about it. All right, I'll see you next time.

Jul 29, 2020 • 1h 8min

Functional Programming in Enterprise Applications with Vladimir Khorikov

Vlad is a Pluralsight course creator and the author of Unit Testing: Principles, Practices, and Patterns.

We discuss:
- Immutability
- Error handling
- Avoiding null
- Preventing invalid state
- Updating existing applications

This episode originally aired on Software Engineering Radio.

Related Links
- @vkhorikov
- Enterprise Craftsmanship
- Is Entity the same as Value Object?
- Combining ASP.NET Core validation attributes with Value Objects
- Error handling: Exception or Result?
- Applying Functional Principles in C#
- F# for Fun and Profit

Transcript

Jeremy: [00:00:05] Hey, this is Jeremy Jung for Software Engineering Radio. Today I'm talking with Vladimir Khorikov. Vladimir is the author of the book Unit Testing: Principles, Practices, and Patterns. He's a Microsoft MVP, and he's the author of many Pluralsight courses, including Applying Functional Principles in C#. Today we're going to be talking about functional programming in enterprise applications. Vladimir, welcome to Software Engineering Radio.

Vladimir: [00:00:28] Thank you for having me.

Jeremy: [00:00:29] The first thing I want to talk about is what functional programming means to you, because it means different things to different people. To you, what are the core principles of functional programming?

Vladimir: [00:00:41] If I were to describe functional programming in just a couple of words, I would say that functional programming is programming without hidden inputs and outputs. And that's basically it. As for what I mean by hidden inputs and outputs, there are several examples of those. So the most prevalent example of a hidden output is mutability. So let's say that you have a method that takes in some integer and then increments that integer by one. What it can do is return that incremented integer back as the return value, but it can also mutate some global state with that integer. 
And by the way, by hidden, I mean that this information is not present in the method's signature. So it's not present in the method's arguments, and it is not present in the method's return value. So in this example, when you have this increment method, if it returns a value, then it communicates what it does pretty clearly. So it is honest about what it does. But if it instead mutates global state, then this would be an example of a hidden output, because that output is not present in the method signature. And to understand what this method does, you have to drill down into that method and see what's actually going on, because this information is not present in the signature itself. So that would be an example of a hidden output. The hidden input is similar to that. So instead of taking that integer as a parameter, as an argument, this method can also refer to some global state. So, for example, some static property or field, or it can refer to some external service to request that integer, and then increment it and then put it into some global state. So that would be an example of a hidden input. Also, reaching out to external systems such as the database or APIs would be that, and the simple DateTime.Now would also be an example of a hidden input, because that input always changes. It basically refers to the system's low-level API, for example the Windows API or Linux API, to get this input value, so that's another example of a hidden input. Another example of a hidden output would be something like exceptions. So exceptions are a hidden output because, when you throw an exception and that exception is caught somewhere up the call stack, this exception also is not present in the method's signature. And you basically introduce another hidden pathway for your program flow that is not present in the method signature. 
So these are common examples of hidden inputs and outputs, and functional programming is about avoiding those hidden inputs and outputs. A mathematical function, a pure function, would be something that accepts a value as an argument and returns a value and doesn't do anything other than that.

Jeremy: [00:04:06] Okay, so let's sort of break down a few of those. So the first thing is hidden outputs, in terms of, if you pass something into a function and the thing that you pass in becomes changed, then that in effect is a hidden output, because there is no way of telling just from the method signature whether that behavior is possible.

Vladimir: [00:04:30] Correct. Yes.

Jeremy: [00:04:31] And so you're saying that the alternative to that is to make sure that the function is pure, so that when you pass something in, if you are going to make a change, it would not be to what you passed in, but it would be something that you're returning back.

Vladimir: [00:04:50] Yeah. So instead of mutating the state of some existing object, what you need to do in functional programming is create a new object with the required properties. So instead of mutating the object that is passed in, you need to create a new object with new properties and return it back. And with the number increment example, that's basically it, because when you increment the number by one and return it back, you're not changing the input parameter, because it's a constant, you cannot change it. What you do instead is you create another number and return it back.

Jeremy: [00:05:26] And when we think about objects that we pass in, or we think about collections, a lot of times the objects and collections we work with are mutable, right? Like, we can have a list type in C#, for example, and we may want to add something to the list or remove something from the list. 
If we are instead creating a new list every time we want to change something, what are the performance implications of that?

Vladimir: [00:05:55] Well, yeah, definitely. So there are trade-offs to functional programming. And one of the most common trade-offs of immutability is always creating new objects instead of mutating existing ones. And that's actually the reason why object-oriented programming has become so popular in the past. Because, you know, functional programming actually was introduced before object-oriented programming, but why it didn't take off is because computers back then were not as powerful as now. And so it was very costly to do functional programming with all those memory allocations, with all that new object creation. So what we had instead is we started to operate at the lower level of our programs, and we started to think in terms of memory allocations. But now we're kind of getting back to the roots and starting to do more of what we did back then. There are definitely trade-offs here. And if the performance requirements for your application are strict, then there are some limitations, and probably you will not be able to implement some functional programming approaches. But in most line-of-business applications, that's not something you need to worry about. So if you write some framework, for example, not an ASP.NET application, but the server itself, Kestrel, then you do need to worry about that. But in most enterprise-level applications you don't, so performance is not the biggest concern. The biggest concern is usually the complexity and the uncontrollable growth of complexity. 
And what functional programming allows you to do is reduce that complexity at the expense of maybe not as performant code as you could have otherwise.

Jeremy: [00:07:56] So would you say that in the average application the developer should default to making things immutable? Is that a reasonable default for most developers?

Vladimir: [00:08:09] I would say so, yes. If it is possible, then you definitely need to default to creating immutable classes, immutable objects. It is not always possible, and one of the limitations is that in object-oriented languages it's pretty hard to create new objects based on existing objects. So, for example, if you take F#, there is a really nice language feature where you can take a data structure or an object and create a new object based on that existing object, but with changes to some properties of that object. So you can say, for example, old object with some property equals a new value and some other property equals another new value. And what it will do is it will not mutate the existing object, but it will create a new object with those two properties changed, while all the other properties remain the same. And this is a really nice feature that helps you to default to immutability. Unfortunately, in object-oriented languages like C#, we don't have such features, and so it's not always feasible to do so. What I recommend you do instead is, if you have some value or a simple class, you can wrap it into a value object, which is immutable. But for all other classes such as entities, like for example a user or a company, that is usually something that you need to mutate. That is an example of an entity. It has its own inherent identity. For such classes it's usually not feasible to make them immutable, but what you can do is keep them mutable, and put as much logic as possible into those immutable value objects. 
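The copy-and-update idea described above has rough equivalents outside F#. A minimal sketch using Python's standard `dataclasses.replace` on a frozen dataclass; the `Address` type and its fields are made up for illustration and do not come from the episode.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Address:
    street: str
    city: str
    postcode: str


old = Address(street="1 Main St", city="Dublin", postcode="D01")

# Build a new object with two fields changed; every other field is copied over.
new = replace(old, street="2 High St", postcode="D02")

try:
    old.city = "Cork"  # frozen dataclasses reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False
```

After this runs, `old` still holds its original values and `new` differs only in the two fields that were named, which is the behavior described for the F# feature.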
And this way you can keep the separation between complexity and mutability. So your objects will be either mutable or complex, but not both. You keep the separation between complexity and mutability because it is when you combine the two that you start to have these problems with ever-increasing, unmanageable complexity.

Jeremy: [00:10:28] And you were referring to the concept of entities and value objects. Is that right?

Vladimir: [00:10:34] Value object, yes.

Jeremy: [00:10:36] And so what is the distinction between those two? You were saying that a value object should be immutable and an entity could be mutable. How do you decide which is which?

Vladimir: [00:10:48] Yeah. These two concepts are from domain-driven design, but they are actually applicable to any application. Even if you don't follow domain-driven design principles, it is handy to refer to your objects in this way anyway. The main distinction between them is that an entity is something that is trackable by your application; it has an internal, inherent identity. An example I often give is, let's say you have a dollar bill, so you have a money class, and this money represents a dollar bill. So this dollar bill would be a value object in most systems, because in most cases you don't really care if you have the exact same dollar bill as you had before. So let's say you give someone a $1 bill and they give you back also a $1 bill. You don't care if it was the same piece of paper as before, because for you they are interchangeable. And that is one of the most important properties of value objects. They are interchangeable. But that also depends on the domain model, on your context. For example, if you create a system that tracks those dollar bills, dollar notes, then in this case those bills, those pieces of paper, become entities, because you do care about each individual piece of paper. 
So, for example, you can have the number on that dollar bill as the ID of your entity, and then you track where it goes throughout its lifetime. So it becomes an entity because it starts to have its own identity, and you cannot exchange one dollar bill for another one because you do care about the history.

Jeremy: [00:12:42] Does this mean that pretty much anything that's an entity would also exist in some kind of permanent store, like a database?

Vladimir: [00:12:50] Yes, exactly. That's because, as I said, you care about the history of that entity, and what it usually manifests in is that you need to persist that entity in the database and then look after the changes in this entity.

Jeremy: [00:13:05] And you were kind of referring to the fact that these entities could act as wrappers to value objects. So I want to give an example. Let's say I have a ride sharing company, and I'm keeping track of my fleet of vehicles. So I have cars that are driving around the city and they're all reporting their position back to me. And each of my cars has an ID. Which parts of my data would be an entity and which part would be a value object?

Vladimir: [00:13:38] Yeah. That's a good example of where you can apply this entity and value object separation. So the car itself would be an entity, because you need to track it. That's the primary indication that it is an entity. It has an ID. And the position itself would be a value object, because you can replace it with another object with the same content, and two positions of the same type and content are interchangeable for you. So yeah, that's a good example.

Jeremy: [00:14:08] And so the individual updates, they would have the ID of the vehicle and the actual position. That could be a value object. And as I'm receiving multiple updates from each of my vehicles, it's reusing the same ID. 
And in my database, I might want to keep a history of, you know, all the locations that my car went. So with those historical positions, would those be their own entity, or what would those be?

Vladimir: [00:14:41] So the positions themselves would not be their own entity in this case. What would be an entity is the historical record. So in this case, you wouldn't have the position itself as an entity with an ID, but you would still have some, let's say, vehicle history record or something like that as another entity that would also contain the position as a value object. So you will have this kind of nesting here, where you still have the same value object, but the entity would be a different entity.

Jeremy: [00:15:15] I see. And so the reason that we have the value object is that it sort of tells our system that this concept of a location is identical, whether it's in the context of being an update I'm getting from a vehicle versus a historical position. They're really both the same concept, so I can use the same value object for both.

Vladimir: [00:15:40] Yes, exactly. It is much easier to reuse this position value object between these two concepts. And another reason why you would want to introduce a value object for these positions is because it is much easier to keep consistency this way. So let's say you have other properties in your vehicle other than the position itself. Let's say the vehicle has a license plate, and let's say its license plate has two numbers, and it also has two properties that represent the position of the vehicle: the X coordinate and the Y coordinate, let's say. Why you would want to introduce value objects is to reduce complexity. Because when you have those four properties as just four properties, then the number of permutations between them is higher
than it would be if you group those properties into separate value objects. So if you group the two properties that belong to the license plate into a value object, and you also group the two coordinates into the position value object, then you will have only two properties inside that vehicle. And the number of permutations between them is much lower. It's just two, whereas the number of permutations between the four properties is going to be, well, a much larger number (laughs). So that's a good way to think about the complexity of your system. When you reduce the number of properties that you need to keep in mind, your software automatically becomes much simpler, because it becomes much easier to keep the consistency. For example, when you accept a position, what you need to do is just check the correctness of that position on its own, without its connection to other properties of the vehicle, and the same for the license plate. You don't need to validate that license plate against the position coordinates. So yeah, the validation becomes simpler and maintenance overall becomes much simpler too.

Jeremy: [00:17:46] And so it sounds like rather than having a vehicle update type, for example, and making that object responsible for the validation of the license plate and the validation of the position information, instead you break out those concepts into their own objects, so that when you create those objects, they validate themselves. And so as long as you can successfully create one, you pass it into the constructor for the vehicle update entity and you know that they're correct, because they were already validated before you passed them in.

Vladimir: [00:18:26] Yes, exactly, and that is one of the benefits of functional programming: when you keep your objects immutable, your value objects immutable, you only need to validate those objects once. 
When you create them, after that you can be sure that those objects remain in a consistent, correct state, so you don't need to validate them afterwards.

Jeremy: [00:18:49] Do you find that, development-wise, it is easier to reason about what's happening when you're creating these objects and then creating the entity and passing these objects into the entity, versus having a larger constructor for the entity?

Vladimir: [00:19:08] Well, it depends on the use case. Usually what I like to do is do it hierarchically. So you first create lower-level objects such as, as you said, value objects. Then if you have some other value object that consists of those lower-level value objects, you create that value object, and you kind of create that structure where you go bottom-up from the lower-level objects to the higher-level objects, and you create them sequentially, one by one. And at the top of that pyramid, you have the entity itself, which you can create just by passing those already validated value objects into that entity's constructor. And you don't need to do much else. So the entity itself becomes much simpler to maintain.

Jeremy: [00:20:01] And in these examples we've been giving, the value objects have been able to validate themselves. Like for example, the position, there's only a fixed set of numbers that are valid for the position, and so we could validate that without talking to an external store or a database or anything like that. How about when you have a case where, to see if something is valid, you need to talk to an external API or talk to a database? Like for example, if there was some kind of driver registration or a license that's associated with the car and you needed to talk to some kind of state API or city information to find out if that car update is valid, where would that exist in your application?

Vladimir: [00:20:50] Yeah. 
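The bottom-up construction discussed in the preceding exchange, where self-validating value objects are passed into an entity, might look roughly like this in Python. The class names, the latitude/longitude ranges, and the alphanumeric plate rule are all assumptions made for the sketch; the episode only speaks of a position, a license plate, and a vehicle with an ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Value object: immutable, compared by content, validated once at creation."""
    latitude: float
    longitude: float

    def __post_init__(self) -> None:
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")


@dataclass(frozen=True)
class LicensePlate:
    """Value object wrapping a plate string (the rule below is hypothetical)."""
    value: str

    def __post_init__(self) -> None:
        if not self.value.isalnum():
            raise ValueError("invalid license plate")


class Vehicle:
    """Entity: identified by its ID and allowed to change over time."""

    def __init__(self, vehicle_id: int, plate: LicensePlate, position: Position) -> None:
        # No validation here: the arguments can only exist in a valid state.
        self.vehicle_id = vehicle_id
        self.plate = plate
        self.position = position

    def move_to(self, position: Position) -> None:
        self.position = position  # swap in a new value object

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Vehicle) and other.vehicle_id == self.vehicle_id

    def __hash__(self) -> int:
        return hash(self.vehicle_id)
```

Two `Position` instances with the same coordinates compare equal, while two `Vehicle` instances compare equal only when their IDs match, which mirrors the interchangeable-versus-tracked distinction from the dollar bill example.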
So there is kind of a debate about where you should put this logic. I strongly recommend not putting it inside your domain model, because your domain model shouldn't talk to any external systems. So you should make that domain model as functionally pure as possible. Let me step back for a second and describe what I mean by that. So what you want to do in your application, especially if you try to adhere to functional programming, you don't actually want to make all your code immutable, because that's usually impossible. When you create an application, you do want that application to mutate some state, because otherwise that application would be useless. For example, if you create a user or if you update the vehicle information, you do want that information to be eventually updated. So the vehicle record in the database should be updated. What you need to strive for is the so-called functional architecture. And this architecture is where you have some sort of a sandwich, where you first gather the information required for the business operation. Then you pass that information and delegate the work to the domain model. And then when the domain model completes its work, you persist the results of that work to the database. And so what you can do here is you can make your domain model as purely functional as possible, and this way you kind of push the mutable operations to the edges of your system, to the edges of your business operation. Because as I said, you cannot skip those mutable operations altogether, but you can kind of work around them. And with this approach, with this functional architecture, what you achieve is the separation between the two important concerns. So the domain model is responsible for your domain logic, and I like to call it the immutable core, and the rest of your system becomes the mutable shell. 
So it is a dumb shell that is responsible primarily for communicating with external systems. And so the two responsibilities here are the domain modeling and orchestration, and you don't want to mix them together because, when mixed together, they overcomplicate your code and the code becomes much harder to maintain. So this is the functional architecture, this kind of sandwich where the beginning, the top of your business operation, talks to external systems. Then it passes control to your domain model, which is as purely functional as possible. And then the bottom takes all the decisions made by the domain model and communicates them to external systems, including your database. In some cases, the flow of your domain logic itself depends on what kind of response you get from external systems. So in your example, this response would be whether or not a vehicle with this number already exists in the database. And if it does, you cannot register a new vehicle with the same number. What you can do is delegate to the controller those decisions for which you have to reach out to external systems such as the database. The alternative would be to delegate this responsibility for communication with the database to the domain model itself. And this way you will keep the controller simple, but you will make the domain model impure. And that is, in my opinion, a worse option.
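
The sandwich described here can be sketched in a few lines. The conversation is about C#, but the sketch below uses Python for brevity. Every name in it (`Vehicle`, `decide_registration`, `InMemoryVehicleStore`, `register_vehicle_controller`) is hypothetical and only illustrates the shape: the controller talks to the store at the top and at the bottom, and the decision in the middle is a pure function.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Vehicle:
    number: str
    position: int


def decide_registration(
    number: str, position: int, already_exists: bool
) -> Tuple[Optional[Vehicle], str]:
    """Pure domain decision: no I/O, the same inputs always give the same output."""
    if already_exists:
        return None, "vehicle number already registered"
    return Vehicle(number, position), ""


class InMemoryVehicleStore:
    """Hypothetical stand-in for the database."""

    def __init__(self) -> None:
        self._rows: Dict[str, Vehicle] = {}

    def exists(self, number: str) -> bool:
        return number in self._rows

    def save(self, vehicle: Vehicle) -> None:
        self._rows[vehicle.number] = vehicle


def register_vehicle_controller(
    store: InMemoryVehicleStore, number: str, position: int
) -> str:
    # Top of the sandwich: gather what the decision needs from the external system.
    already_exists = store.exists(number)
    # Middle: the pure domain model makes the decision.
    vehicle, error = decide_registration(number, position, already_exists)
    if vehicle is None:
        return error
    # Bottom: persist the result of the decision.
    store.save(vehicle)
    return "ok"
```

Because `decide_registration` never touches the store, it can be tested by passing `already_exists` directly, without any database in the test.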
Even though you kind of keep the controller simple and you keep all the domain logic inside the domain model, it's still not the best approach, because you want to separate that domain model from this responsibility of communicating with external systems. When these two responsibilities are combined together, that's when you start to have an overcomplicated code base.

Jeremy: [00:25:09] If I understand correctly, it sounds like anything that has to do with external systems, even if it's a part of validation, those calls should be made at the outside layer. For example, in your application controller or in an object that's receiving messages from an external system. And that is where you would make the call to check in your database to see, for example, if this was an ID that already existed. Or if we needed to talk to some city's registration system to see if this vehicle is licensed, we would do all of that outside of the value objects. We would do that more in our controller or in some area outside of all of the internal domain objects.

Vladimir: [00:25:59] Yes, that's correct. So you cannot delegate to your domain model this responsibility of maintaining consistency that spans across the whole database. You only need to delegate to the domain model the consistency requirements that span the objects themselves. So ideally it should stay within a single entity or, in domain-driven design terms, an aggregate. An aggregate is a group of related entities. If your consistency requirement spans more than one aggregate or one entity, then yes, it should usually be attributed to the controller itself. And the check for uniqueness, let's say user email uniqueness or vehicle number uniqueness, that's an example of such a validation.
Yes.

Jeremy: [00:26:49] Let's say you have a value object that is a vehicle, and before we checked with the database, we didn't know if it was a valid new vehicle or not, because we didn't know if it already existed or if it's truly new. And at the controller we make a request to the database and we find out that this vehicle doesn't exist yet, so it's a valid request to create a new vehicle. Would we then need to create a new type that says, this is a valid vehicle creation request? Because if we hadn't done that check yet, then we wouldn't actually know if it was valid. I don't know if that makes sense?

Vladimir: [00:27:32] Yeah, it does. So what you're talking about is you want to have, let's say, an idempotent request that would either create a new vehicle or update an existing one if it already exists. Correct?

Jeremy: [00:27:46] Sure. Yeah.

Vladimir: [00:27:47] Yeah. I wouldn't create a new request for that. If it is a business requirement to do that in one go, which it sounds like it is here, then I would just attribute this logic to the controller. So first check if a vehicle with this ID exists, and then go from there: either update its position using this value object or create a new vehicle using the same value object. In this case, you will work with the same position value object anyway, because you will use it both for creation and for update of the vehicle position. And so it's just a matter of what to do next once you see whether the vehicle exists or not.

Jeremy: [00:28:36] And so after we've completed this validation and we've created our value object, then when we persist it to our database, that would also be in the controller. Is that correct?

Vladimir: [00:28:47] Yes. So the creation of the vehicle, that part would be the domain logic part.
So if we are talking about the sandwich architecture again, the functional architecture, then the top part of the sandwich would be reaching out to the database to see if the vehicle exists. The middle part, the domain model part, would be the creation of a new vehicle or the update of the existing vehicle. And then the bottom part would be saving that vehicle to the database. Yes.

Jeremy: [00:29:22] And I want to step back to the start of our conversation, where you were talking about hidden inputs and hidden outputs. One of the things that you mentioned as being a hidden output is an exception. For example, if an error occurs, you may not know that an exception is something that can come back. Walk us through why that's an issue, or if that is an issue.

Vladimir: [00:29:53] The issue here is not the exception per se, but how you use those exceptions. If you use exceptions for validation, then it does become an issue. What I'd like to see with regards to exceptions is that exceptions are for exceptional situations only. And what I mean by that is situations that you did not expect to happen in your application. Validation is, by definition, an expected situation, because you do expect your clients or your users to enter incorrect values or send you incorrect data and so on. And so when you validate that data, you do expect it to be incorrect at least sometimes. And so what you can do is, for example, validate that data and then, if it is incorrect, throw an exception saying that this data is incorrect.
But it has the drawback of this hidden output that I mentioned earlier, where you create another path in the program flow that is not present in the method signature. So let's say you have a void method that accepts some data and then throws an exception. You cannot be sure what it does, and you cannot be sure how to react to these exceptions, because it could be that they are caught in the method that calls this validate method, but it could also be several layers up the call stack, and so it becomes much harder to debug this application and much harder to understand what it does. Whereas with something like a result type, you can explicitly return the result of the validation from that method, and this way you will make the signature honest. An honest signature is something I also like to refer to. A purely functional method would have an honest signature because it will tell you explicitly what it does, what inputs it has and what outputs it produces. The drawback with the result type is that you often need to write more code, because without it you can basically just make several validations, call several validate methods, and then if something is wrong, you will catch an exception in the catch part of your application. But with the result, you have to process each output separately. So you have the first result, which you need to process, and the second one and so on. And another issue here is that, at least in object-oriented programming languages like C#, it becomes easy to omit that result, because there is nothing preventing you from just forgetting about it and ignoring it. And that could be an issue. That is one of the trade-offs of this functional programming approach. But again, I think the benefits of this approach outweigh the costs, because you're becoming much more explicit
in the values that those methods return. And it is much more maintainable in the long run. By the way, this issue with ignorable outputs is only present in OOP languages like C#. If you take F#, for example, you cannot just ignore an output which is non-void, or non-unit in F# terms. You have to pipe it into a special function, ignore. So you can always see that you are ignoring something because you're seeing that the return value is piped into that ignore function.

Jeremy: [00:33:44] For those who aren't familiar with result types, can you give a brief explanation of what they are?

Vladimir: [00:33:51] Yeah, sure. So a result is an explicit representation of the outcome of some operation. Let's say you're trying to save something in an external system. Let's say that you want to update your profile on Facebook and you're using a Facebook API for that. So you're doing some API calls to the Facebook API, and this operation, obviously, may fail. One of the ways to deal with that failure, as I said before, is to use exceptions, but that's not the best way to deal with it, because it is often not obvious where exactly you process those failures. A better way would be to catch those potential exceptions from that API at the lowest level possible, and then wrap them into an explicit structure such as a result class. The result class is a simple structure that tells you whether or not this operation succeeded. So it has such fields as: is it a failure or is it a success? And if the operation was supposed to return a value, then you can also make this result class generic. And if it is successful, it can also contain a value that will hold the result of the operation.

Jeremy: [00:35:12] The benefit of using the result type is that your function signatures become very explicit in telling you that this call that you're going to make could succeed or it could fail.
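
A result class of the kind Vladimir describes might look like the following minimal sketch. It is written in Python rather than C#, and the names (`Result`, `ok`, `fail`, `validate_email`) and the email rule are assumptions made for illustration, not the class from his course.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Result(Generic[T]):
    """Explicit outcome of an operation: success with a value, or failure with an error."""

    is_success: bool
    value: Optional[T] = None
    error: str = ""

    @property
    def is_failure(self) -> bool:
        return not self.is_success

    @staticmethod
    def ok(value: T) -> "Result[T]":
        return Result(True, value=value)

    @staticmethod
    def fail(error: str) -> "Result[T]":
        return Result(False, error=error)


def validate_email(raw: str) -> Result[str]:
    """The signature is 'honest': the caller can see that this may fail."""
    email = raw.strip()
    if "@" not in email:  # deliberately simplistic rule, for illustration only
        return Result.fail("email must contain '@'")
    return Result.ok(email)
```

A caller has to look at `is_success` before using `value`, which is the extra code Vladimir mentions as the cost of this approach.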
And this result type is going to tell you whether it succeeds or fails, and that way you know to write code to account for both cases.

Vladimir: [00:35:33] Exactly. Yes.

Jeremy: [00:35:34] One of the things about exceptions is that when an exception occurs, there's a lot of information that's embedded with it, generally a full stack trace, for example. Whereas with a result type, you may not have any additional information on why something failed. How do you deal with that? Or are there cases where you would say it does make sense to use an exception instead?

Vladimir: [00:35:59] Yeah. I'm not saying that you shouldn't use exceptions at all, because there are use cases for them. Well, actually the only use case is where you have a situation that you don't know how to deal with. For example, if that Facebook API returns an error that you didn't expect. So let's say that when you wrote the software, you expected some set of errors from Facebook, let's say that the user doesn't exist. But if it returns some other error, some obscure error, you don't necessarily know how to deal with that error. And in this case, because you don't know how to deal with it, it is preferable to throw an exception. This would be an example of an unexpected situation, for which exceptions are preferable. And because exceptions represent unexpected situations in code, you shouldn't catch them. You should only catch them at the topmost level of your call stack. And the only way you should react to them is to log what exactly happened and then basically crash the application. Or if it's some background process, you need to just restart it all over again. Because otherwise, if you try to continue working after this exception took place, you can run into an inconsistent state, where your program enters some state where
it is kind of still working, but it may become inconsistent and even save some data into the database, where it will become much harder to deal with. You want to avoid that inconsistency as much as possible, and as soon as that inconsistency takes place, you want to stop everything, basically crash your application. And that would be the only use case for exceptions. And so in this case, you do still have the call stack, which you can then log somewhere and deal with somehow later. But if it is an expected situation, let's say that the Facebook API returned that this user doesn't exist, then you do know how to deal with that, and you basically don't need the exception stack, because you can just process this error from Facebook, turn it into a result object, and return it back to the caller, and that caller can in turn show some friendly error message to the user.

Jeremy: [00:38:42] So basically when you're working with external APIs, like the Facebook API, you may make an HTTP request, and maybe it times out, and the HTTP library that's built into C#, I believe, would throw an exception. And what you're saying is that you would know ahead of time: I expect that there may be times where my request times out or it fails, so I'm going to catch this exception and then I'm going to return a result type that explains what the failure was, rather than just throwing that exception and catching it somewhere else.

Vladimir: [00:39:19] Yes, but that depends on whether or not you know how to deal with that, even if you expect some timeout. Let's say that calls to the Facebook API may time out from time to time. So you need to see whether or not you can deal with those errors.
Because if you cannot, then even if you expect those situations to happen, you basically cannot do anything about them. And so you need to throw an exception anyway, and that could be because, let's say, the Facebook API is essential for this operation and you cannot proceed without a response from Facebook. But if it is not essential, let's say a user updates their profile and you want to update their Facebook profile as well simultaneously, then if you cannot do that, it's still fine and you can proceed further. So in this case, you can see that this Facebook call was a failure, but you know that it is not essential for this business operation, and you can just ignore that result and move on. Another example: let's say you write an ORM such as Entity Framework. In this ORM, the lack of connection to the database would be an exceptional situation, because you cannot do anything about that. And you don't know how the user of your library will react to it. And so in this case, because you as a library writer do not know how to deal with that situation, you also need to throw an exception. And then the library user, such as yourself or myself, can decide whether or not this operation was essential for us and whether or not we can proceed after that exception. Say, for example, you try to log something into the database. It is preferable that the log write is successful, but if it's not, then it's not a big deal. And so in this case, you need to catch the exception that the library throws, transform it into a result class, and then process it further up the call stack. But if it is essential for your application, let's say you're saving not a log entry but the user itself, then even if you know that this ORM can throw an exception, you cannot do anything about that, and so you shouldn't process that exception.
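
The distinction between the log write and the user write can be sketched as follows, in Python rather than C#. `ConnectionLost`, `db_insert` and the `db_up` flag are hypothetical stand-ins for a database library and its failure mode; they are not Entity Framework APIs.

```python
from dataclasses import dataclass


class ConnectionLost(Exception):
    """Hypothetical error that a database library might raise."""


@dataclass(frozen=True)
class Result:
    is_success: bool
    error: str = ""


def db_insert(table: str, row: dict, db_up: bool) -> None:
    """Hypothetical library call. It raises, because the library cannot know
    whether the caller considers this write essential."""
    if not db_up:
        raise ConnectionLost(f"cannot reach database while writing to {table}")


def write_log_entry(message: str, db_up: bool) -> Result:
    """Non-essential write: catch at the lowest level and convert to a Result."""
    try:
        db_insert("logs", {"message": message}, db_up)
    except ConnectionLost as exc:
        return Result(False, str(exc))
    return Result(True)


def save_user(name: str, db_up: bool) -> None:
    """Essential write: no try/except here, so the exception reaches the
    top level, where it would be logged and the application stopped."""
    db_insert("users", {"name": name}, db_up)
```

The same library exception is handled in two different ways, and the only thing that decides which is whether the calling code can proceed without the write.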
You should basically allow it to bubble up to the upper layers, where it will be logged and the application will crash.

Jeremy: [00:41:58] Another thing that you sometimes talk about in the context of functional programming is how object-oriented languages usually have a null concept where, instead of returning the object you expect, you return a null. And that could be because you couldn't find the element, or it could be any number of reasons. What are the drawbacks of returning a null?

Vladimir: [00:42:25] Yeah, it's the well-known billion-dollar mistake that all object-oriented programming languages have in them. And the problem with nulls is that they make all your code dishonest. Let's say that you return a user from some method, and that user is a class. In C#, all classes are nullable, so you can return not an object of that class but null, and that would be a valid program from the compiler's perspective. The problem with that is that you cannot differentiate between nullable users and non-nullable users. When you see a method that returns a user, what it actually returns is a special type, which you can call "user or null", because it may be either a user or null. And so when you want a non-nullable user, there is no way for you to express that because, as I said, all classes are nullable by default. And that's the problem, because it introduces another hidden output that you cannot see just by looking at the method signature.

Jeremy: [00:43:36] One of the things that you've done as a way to push back against that is this concept of a Maybe or an optional type. Could you explain what those are and how they're used?

Vladimir: [00:43:47] Yep. I think what I did in this course, I'm trying to remember, it was several years ago. Yeah.
So what you can do instead is use a good tool called Fody NullGuard to basically inject null checks into all your methods and properties. It will check all input arguments for nulls for you, and it will check return values for nulls as well. It's a very good tool. I try to use it in as many of my projects as possible, but it's not always possible. And what it does is it helps you bring your code closer to this world where nulls do not exist. So if you try to return a null where your method returns a user, your method will automatically throw an exception because of these automatic checks for nulls. And to avoid that, you will need to use a Maybe, or an Option as in F#. A Maybe is a special struct that you can use to explicitly tell the clients of your code which of your inputs or outputs can be null. And because it is a struct, the Maybe itself cannot turn into a null, because structs in C# are not nullable. And it becomes sort of a nice trick to avoid these null problems. If you want to make your return value, your user, nullable, you have to wrap it in a Maybe of user, and you cannot do that otherwise, because if you try to return null without that Maybe, your method will throw an exception. But if you do use the Maybe, then your null will be automatically transformed into an instance of that Maybe type and your code will proceed. The validation here is not as strict as in functional programming languages, because this issue with the null will not be caught at compile time. But it's still close enough because, even though the compiler will see this code as valid, I mean the code where you are supposed to return a user but return null instead, it will still fail at runtime. So you will have sort of close-to-functional guarantees
here as well.

Jeremy: [00:46:17] It sounds like it's similar to the result type, in the sense that with the result type we were saying we would wrap an object in a result, and what that would do to the method signature is say that this function you're going to call could succeed or it could fail. And similarly, this Maybe type, it sounds like it's wrapping your object. It's wrapping your response and telling you that your response could have something in it or it could have nothing in it, and it's making that explicit as a part of the return type.

Vladimir: [00:46:48] Yes, exactly. So the Maybe type gives you the same benefits as the result type. It makes the method explicit. But in addition to that, if you use this Fody NullGuard library, it also gives you some runtime guarantees that you will not actually have a null where you return a non-Maybe user. So if you return just a user, then you have this guarantee that it will actually be a non-null instance, because if you try to return null, the application will throw an exception.

Jeremy: [00:47:21] Something that C# added recently, in C# 8, is non-nullable reference types. So it has a compile-time check to see if you could possibly be using something that's null. Is that a good substitute for this Maybe type? What are your thoughts on that feature?

Vladimir: [00:47:42] It's a nice move in the right direction, but I don't think that it is a good enough substitute, because those checks will only give you compile warnings. They are not compile errors, but that's not a big issue because you can turn those warnings into errors by setting up a couple of things in Visual Studio. The main issue here is that it doesn't catch all those situations where you may have nulls. And so you still may have some issues with nulls even though C# 8 will tell you that everything is fine.
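
For readers who have not seen a Maybe type, the wrapper discussed above might be sketched like this in Python. The names (`Maybe`, `has_value`, `unwrap_or`, `find_user_name`) are assumptions for illustration, and the sketch does not reproduce the runtime null checks that Fody NullGuard injects in C#; it only shows how the return type makes "there may be nothing here" visible.

```python
from dataclasses import dataclass
from typing import Dict, Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Maybe(Generic[T]):
    """Wrapper that says explicitly: there may or may not be a value inside."""

    _value: Optional[T] = None

    @property
    def has_value(self) -> bool:
        return self._value is not None

    def unwrap_or(self, default: T) -> T:
        """Return the wrapped value, or the given default when there is none."""
        return self._value if self._value is not None else default


# Hypothetical data standing in for a user table.
USERS: Dict[int, str] = {1: "ann"}


def find_user_name(user_id: int) -> Maybe[str]:
    """The return type tells the caller that the user may be missing."""
    return Maybe(USERS.get(user_id))
```

A function that returns plain `str` then carries the opposite promise, that a value is always there, which is the guarantee Vladimir gets at runtime from the null guard.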
So that's basically my concern with that, that it's not as strict as the Maybe type.

Jeremy: [00:48:27] It kind of gives you some protection, but there are some cases it doesn't catch. So it may give you this false sense of security.

Vladimir: [00:48:36] Yes, exactly.

Jeremy: [00:48:37] Another thing that you bring up in some of your courses is that when data is coming in from an outside source, let's say you have an API and somebody sends you data via JSON, or you get data via a message queue, you tend to create a separate DTO, a data transfer object, rather than use the entities or the value types that you've created internally. Why do you make the decision to do that?

Vladimir: [00:49:11] Yeah, that's a very important thing to do in my opinion, because you need to maintain the separation between data contracts and your domain model. And this is important because if you are using the same domain classes as these input data structures, then you may run into several problems. So let's say that you have a controller that is responsible for user creation. One way to represent the data that clients send you when they try to register a new user would be to use the same user object as you have in the domain model. So let's say that it has a username, a password and maybe some other properties that map one-to-one to the properties that the client sends you. The first issue here is a potential security hole, because you may introduce some properties in the future to your domain class that you don't want the client to set. So let's say that you introduce a flag saying that this user is an admin, a boolean isAdmin flag. If you introduce it to your domain model, then it becomes a potential security problem, because now your clients can send this flag as well, and it will be deserialized into that domain model. And if you save it
as-is into the database, then you will basically create another admin in your system without knowing it. So that's one problem. Another problem here is that when you use your domain classes like that, you are setting those domain classes in stone, so to speak, because you often need to maintain backwards compatibility with the clients. And what it means is you cannot refactor those classes as often as you may need. So let's say, for example, that your user has a name property and you want to split it into first name and last name. Because you want to maintain backwards compatibility, you cannot just do that, because the old clients of your application will break. They will not know about that split and they will continue to send you just one name. And in this case, it becomes problematic because now you have to maintain the old name property, but you also need to add the first name and the last name and then somehow correlate between the two, maybe transform the name into first and last name using some rules. And you don't want to do that. Instead, what you need is a separate layer of data contracts, DTOs, data transfer objects, where you have as many versions of those data contracts as you want. So if you decide to split the name into first and last names, you don't need to modify the old DTO. You can create a new endpoint that accepts a new DTO, version two, let's say, that will have the first and the last name, and then you will do the conversion in each of these two endpoints. So in the first endpoint, which still has the first version of the DTO, you can do the conversion between the old data structure and your domain model. And so in this way, you are free to modify, to refactor your domain model without looking back at whether it makes backward-incompatible changes for your clients. And so you sort of decouple that data contract from your internal domain model.
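
The two versions of the data contract can be sketched as below, again in Python rather than C#. All class and function names are hypothetical, and the rule for splitting a single name (split on the first space) is an assumption chosen only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Domain model: free to change; is_admin is never set from client input."""

    first_name: str
    last_name: str
    is_admin: bool = False


@dataclass(frozen=True)
class CreateUserV1:
    """Old data contract, kept as-is for old clients."""

    name: str


@dataclass(frozen=True)
class CreateUserV2:
    """New data contract with the name split in two."""

    first_name: str
    last_name: str


def parse_v1(payload: dict) -> CreateUserV1:
    # Only the fields of the contract are read; an "is_admin" key sent by a
    # client is simply ignored instead of being deserialized into the domain.
    return CreateUserV1(name=str(payload["name"]))


def user_from_v1(dto: CreateUserV1) -> User:
    # Assumed conversion rule: everything after the first space is the last name.
    first, _, last = dto.name.partition(" ")
    return User(first_name=first, last_name=last)


def user_from_v2(dto: CreateUserV2) -> User:
    return User(first_name=dto.first_name, last_name=dto.last_name)
```

Each endpoint owns one conversion function, so splitting the name changed the domain class and added a new contract, while `CreateUserV1` and its clients were left untouched.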
You want the internal domain model to move as fast as you want, so you want to be able to refactor it, but you want to keep the data contracts backward compatible.

Jeremy: [00:53:06] It's almost like the difference between the public interface of a class and its private implementation. When you use data transfer objects, you expose an interface that you want to keep the same, but you want to be able to modify how that's handled internally in your system. And so having these DTOs makes sure that you can make as many changes as you want internally in your system without affecting what your API looks like to the outside.

Vladimir: [00:53:41] Yes, exactly. It's a good analogy. Yes.

Jeremy: [00:53:43] The one thing I can think about as far as DTOs and converting them to internal domain objects is that it sounds like you could potentially have a lot of conversion code. How should you plan for that, and where should that exist in your application?

Vladimir: [00:54:02] So my view has evolved since that course. I think what I did is I created extension methods on top of Result, and it's still a good way to do that. Let's say that when you create a user, you need to validate their first name, last name, let's say an email, and a couple of other properties. And what you can end up with is a lot of code that does validation. So you are creating a value object first, and then you need to make sure that a user with the same email doesn't exist in the database, and then you need to create another value object and validate it. And so it creates a lot of if-else statements that clutter your code base. What you can do instead is follow the so-called railway-oriented approach, which was introduced by Scott Wlaschin. And so what I did is I basically adapted this approach from F# to C#.
And you can introduce extension methods that will drastically simplify all those if statements. It will help you reduce the number of lines of code by a factor of three without losing any readability. And for simple validations, it's still a good approach, but there is a nice way of dealing with validation in ASP.NET, and that is validation attributes. What I said in that course is that validation attributes are nice, but they kind of don't play well with value objects. And so if you want to really adhere to domain-driven design principles or functional principles, then you need to switch from those annotations to this railway-oriented approach. But you actually can combine the two. You can combine the approach with annotations and still have this validation logic in value objects. One of the biggest disadvantages of having those validations in annotations is that you are duplicating that validation. You want to keep those validation rules inside your domain model because they are an essential part of it. When you put, let's say, regular expression validation attributes on top of your DTOs, you are duplicating those rules between the two parts of your system. So now you have a value object with some rules and also the same rules in the data annotations. To combine the two, you can actually create your own custom annotations that would delegate those checks to the value objects but would still work exactly the same way as the regular annotation attributes, meaning that you can declaratively put them on top of your DTO properties and they will work very well. So you will still drastically reduce the number of lines of validation code, and you will keep this nice declarative approach that you had with annotations.
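
The railway-oriented idea mentioned above, chaining validation steps so that the first failure short-circuits the rest, can be sketched as follows. This is Python, not the C# extension methods from Vladimir's course; `on_success` and the two `require_*` steps are hypothetical names, and the validation rules are deliberately simplistic.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")
U = TypeVar("U")


@dataclass(frozen=True)
class Result(Generic[T]):
    value: Optional[T] = None
    error: str = ""

    @property
    def is_success(self) -> bool:
        return self.error == ""

    def on_success(self, step: Callable[[T], "Result[U]"]) -> "Result[U]":
        """Run the next step only if this one succeeded; otherwise pass the failure along."""
        if not self.is_success:
            return Result(error=self.error)
        return step(self.value)


def require_name(form: dict) -> Result[dict]:
    if not form.get("name", "").strip():
        return Result(error="name is required")
    return Result(value=form)


def require_email(form: dict) -> Result[dict]:
    if "@" not in form.get("email", ""):
        return Result(error="email is invalid")
    return Result(value=form)


def validate(form: dict) -> Result[dict]:
    # One chain instead of nested if/else blocks; the first failure wins.
    return require_name(form).on_success(require_email)
```

Adding another rule means appending one more `.on_success(...)` call to the chain, which is where the reduction in if-else clutter comes from.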
And I have a blog post on my website, we can link it in the show notes, where I show this approach in detail.

Jeremy: [00:57:25] Just to make sure I understand correctly, what you're describing is that in ASP.NET, when you have a model for a DTO, you can put annotations on it. You can have your property, and above the property you could say something like, the max length is 50, so this person's name can't be more than 50 characters. And what ASP.NET is able to do is, if you were to create a form and you used that property in the form, and somebody typed in a name with 80 characters, ASP.NET, using that annotation, would be able to automatically create an error, and you would be able to put that next to the field. And I think what you're saying is that you can keep those validation rules inside the domain objects that you create, I think you called them value objects, and you're still able to write an annotation that just refers to the validation that exists in your value object rather than using the built-in data annotations.

Vladimir: [00:58:37] Yes, exactly. And that's a nice way to combine the two, because it combines the best of the two worlds. You still have your validation rules in one place.

Jeremy: [00:58:48] What's your approach when you have a code base that has exceptions, passes back nulls, and the calls to the database are mixed in with the objects? How do you start that process of bringing in more functional concepts, or just bringing in more concepts that are easier to follow and to understand?

Vladimir: [00:59:09] Yeah, that's a great question, and it's a tough one. It depends a lot on the specifics of that project, on the specifics of the team and the management.
It's one thing if this project doesn't evolve much and it's just some project in maintenance mode where you don't need to introduce a lot of new features. In this case, I actually don't recommend that you do much, because it most likely will not pay off in the long run. But if it is a project that is actively developed, then it's a different story, and in this case you need to come up with some refactoring strategy. There are a couple of approaches here. In domain-driven design, for example, Eric Evans wrote a great piece where he talked about an approach that involves bubble contexts. A bubble context is something that you create inside a legacy code base that adheres to all the good principles. So you have a nice separation between the domain logic and the orchestration, and your domain logic is, ideally, purely functional. You cannot refactor the whole application at once, and I actually don't recommend that you rewrite your application either, because it's not a good idea in most cases. But you still want to start somewhere. And where you can start is by creating these bubble contexts. Let's say that you have some new feature or you need to modify an existing feature, and this feature is not too connected to the rest of the system. So you can start to isolate this functionality into the bubble context and surround that bubble context with an anti-corruption layer. That anti-corruption layer is basically a repository that converts your good and clean domain model into the database with its messy legacy structure and converts it back into your nice and clean domain model. And what you can do is start expanding that bubble context. You can gain more and more territory with new features, with new refactorings. And eventually what you want is to come to the point where your bubble context becomes the main part of your application.
And it's the legacy part that is surrounded by the anti-corruption layer.

This pattern is also called a strangler pattern, where you strangle this legacy part, and cut off slices of functionality from that part and transform them and refactor them into your bubble context.

You need to first define the building blocks of your domain model. And those building blocks are usually value objects. So, the easiest classes to create in your application, let's say as simple as an email value object or as simple as a customer name value object. And so, when you do that, you can put domain logic that relates to those emails and customer names into those value objects. Start using these value objects from the rest of your system, and then start from there, so you can build a hierarchy of objects. So let's say that you have another object that consists of those smaller building blocks, smaller value objects. So you do that. And then, as I said previously, you can proceed to your entities and refactor that entity. So instead of separate properties on the entity, you can start to have properties defined as value objects. And so, you are attributing more and more logic from that entity to those value objects. And the entity itself becomes simpler. And then, from that level, you step even further and push the domain logic down from controllers to those entities. Because what you usually have in such legacy systems is the anemic domain model, where your domain logic is separated from the domain data. So data is separate from the logic, and we can talk about that a bit too.

But the main drawback in this system is that it's hard to maintain encapsulation. It's hard to maintain consistency inside the domain data, because it's separated from the logic that works upon that data. And the logic itself is usually in something like services.
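
The refactoring path above (value objects first, then entities built from them, then logic moved onto the entity) can be sketched roughly as follows. This is an illustrative Python sketch, not Vladimir's C# code: the class names, the `change_email` method, and the email rule are assumptions, and the 50-character name limit is borrowed from Jeremy's earlier example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    """Value object: validates itself once and is immutable afterwards."""
    value: str

    def __post_init__(self):
        # Deliberately simplistic rule, assumed for illustration only.
        if "@" not in self.value:
            raise ValueError(f"invalid email: {self.value!r}")


@dataclass(frozen=True)
class CustomerName:
    """Value object holding the name rule in exactly one place."""
    value: str

    def __post_init__(self):
        if not self.value.strip() or len(self.value) > 50:
            raise ValueError("customer name must be 1 to 50 characters")


class Customer:
    """Entity whose properties are value objects instead of raw strings."""

    def __init__(self, name: CustomerName, email: Email):
        self._name = name
        self._email = email

    @property
    def name(self) -> CustomerName:
        return self._name

    @property
    def email(self) -> Email:
        return self._email

    def change_email(self, new_email: Email) -> bool:
        """Logic that would otherwise sit in a service lives on the entity.

        Returns True when the email actually changed.
        """
        if new_email == self._email:
            return False
        self._email = new_email
        return True
```

Because `Email` and `CustomerName` reject bad input in their constructors, the entity never has to re-check them, which is the sense in which the entity "becomes simpler".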
And so you can push that logic from services down to entities.

And so what you have is sort of a cascade of logic that you push further and further down. And the further down you can push it, the better, because the easier it will be to work with. And the problem with the anemic domain model is that, well, there is actually a nice dichotomy between anemic domain models and functional programming, because the anemic domain model is about separating data from functions or from operations that work upon that data. But functional programming is kind of the same. So it's also about separating data from operations that work upon that data. The big difference between the two is that in functional programming, though, the data is immutable, and it is a big deal because it's impossible to corrupt immutable data. So it's basically impossible to corrupt something that you cannot change. But anemic domain models, although they exhibit similar properties to the functional approach, the biggest difference is that the data inside anemic domain models is mutable, and you can never know who mutates that data and how they do that. And so it becomes impossible to enforce restrictions on everyone who mutates that data in such an environment.

Jeremy: [01:05:18] Given all the things we've talked about, if people want to kind of see an example of a lot of these things in action, are there any code bases that they can take a look at that are open source or any good examples that you can point them to?

Vladimir: [01:05:35] So, if we are talking about C-sharp, then I would recommend my Pluralsight course. It's called Applying Functional Principles in C#, or something like that. I actually have a trial code for Pluralsight, so if you want, just reach out to me. We can put my email address.

So they will give you, I think it's 30 days, unlimited access to Pluralsight, so you can watch all my courses and more during that time.
Also, if we're talking about F#, I would highly recommend Scott Wlaschin's books on this topic. So he has a great site, it's called F# for Fun and Profit, and it has a section with books in it, where one of the books is basically a collection of the articles from the site itself.

But the other book is about Domain Driven Design combined with the functional approach, and it's a really great book. It explains how to do Domain Driven Design in a functional programming language like F#.

Jeremy: [01:06:38] And where should people go if they want to see more about what you're working on and follow you?

Vladimir: [01:06:45] The best place is to go to my website. It's called enterprisecraftsmanship.com. And yeah, you will find all the links there.

Jeremy: [01:06:55] Cool. Well, Vladimir, thank you so much for coming on the show.

Vladimir: [01:06:59] Thank you for having me.

Jeremy: [01:07:00] I hope you enjoyed the conversation with Vlad. You can get the show notes and a transcript for this episode at softwaresessions.com. Alright, see ya.

Jul 15, 2020 • 45min

Open Source Onboarding with Brian Douglas

Brian is a Senior Developer Advocate at GitHub and was previously a Developer Advocate at Netlify.

We discuss:
- Unintentional gatekeeping
- Formal onboarding for your projects
- The value of Discord communities for newcomers
- Streaming issue triage and programming on Twitch
- How Open Sauced helps developers get involved with open source

Related Links
- @bdougieYO
- Open Sauced
- ExpressJS Triager Guide
- GraphiQL
- Babel's contributing guide (Gives suggestions on familiarizing yourself with codebase)
- Webpack's funding page
- SE Unlocked episode with Dan Abramov
- Explore GitHub
- Changelog Nightly
- Code Triage
- Figma
- Discord

Twitch Streams
- bdougieYO
- jlengstorf
- Noopkat

Music by Crystal Cola: 12:30 AM / Orion

You can help edit this transcript on GitHub.

Jeremy: [00:00:00] This is Jeremy Jung and today I'm talking to Brian Douglas. He's a senior developer advocate at GitHub, the host of JAMstack Radio and the creator of Open Sauced, an application to help new contributors to open source. Brian, welcome to Software Sessions.

Brian: [00:00:14] Hey Jeremy, thanks for having me on.

Jeremy: [00:00:16] The first thing I want to get into is, what's the biggest barrier for people getting into open source?

Brian: [00:00:23] Yeah, that's a good question. I think the barrier for open source is something I found or discovered right off the bat. I've been developing for over seven years now, seven years ish. And getting into open source can be daunting, especially if you don't know where to get started.

So I think the biggest barrier is actually onboarding, and it's just knowing, is a CONTRIBUTING.md the proper place to go to, or is there some other secret channel somewhere, or a Slack group or something else where you could actually get a relationship with the project? I think a lot of us leverage a lot of these tools that are open source, and go years of leveraging them without even knowing who's contributing to them, who's powering it, what community is involved in the project.
So just knowing where to start is usually the hardest part. I think that we do a good job as a developer community. There are guides on how to contribute, like open a pull request, manage your commits and stuff like that.

But there's no guide of how to say hello when it comes to giving your first open source contribution.

Jeremy: [00:01:27] Not knowing where to start, even if there is a CONTRIBUTING.md and there's issues out there. People are like, I don't know which one to pick up. I don't know who to talk to first. It's just awkward, I guess.

Brian: [00:01:38] Yeah. Each project is different. So there's no centralized CONTRIBUTING.md file everybody is sourcing from. So one project could say, okay, CONTRIBUTING.md: git clone, git checkout a new branch, git push origin. And then that's it. And some of them don't even have contributing MDs.

Some of them are just READMEs. Then you go to the README and there could be missing information. Some projects don't have READMEs. Some projects have READMEs and websites and documentation and Slack groups. So it's not knowing the balance of how to actually get involved in the project.

And I think what it really comes down to is, if I started a new job, the first thing I'm gonna get is a step by step: okay, here's your laptop. Here's how to do this thing. Here's how to clone the repo. Here's who to talk to. A lot of projects don't have that. Like they don't have area owners, plugin owners, who's on the review team, who's on the triage team, how big is the contributing group. You can go into the GitHub repo and discover it all. But it'd be nice if someone just gave you a piece of paper or one file to get all that information. I think we've sort of grown out of the CONTRIBUTING.md and we need something else.

Jeremy: [00:02:49] When you are looking at an open source project, there's all these different issues, whether it's bugs or feature requests. And it can be hard to know which things are suited for your skill level.
And what do you think is the solution for that, for somebody trying to pick out that issue that would work for them?

Brian: [00:03:08] Yeah. And I mean, the easy answer is they have labels like good first issue and documentation. Some people don't know this: if you go to any GitHub repo, so like github.com/nodejs/node, or node/node, I think is what it is, /contribute. So if you add /contribute on the URL you can see all the issues that are up for grabs and you can leverage them and jump in.

I don't actually recommend doing all that first and going to labels. I think the very first step is actually talking to a person. So find the quickest place where you can get synchronous communication, like, Hey, I'm looking to contribute. I've been using this thing for six months on this project. I just want to give back. I had this idea for a feature. Like, open an issue, ask questions on the issue, or even, like, now we have a feature at GitHub called discussions. In addition to that, go into the discussion, but also limit the amount of back and forth you have to do asynchronously. And just go directly to the source, which is the person who is on-call already to chat with you.

A lot of projects have Discords now. So find the Discord link and then jump in there and say hello, because your experience is going to be completely different when you're actually talking to somebody and asking questions synchronously in Discord.

The chat scrolls, so it doesn't matter if you say a random question or you ask a question that's been asked a hundred times. Someone will give you a link, but it's better to do that than to be the person on the issue asking the same question for the fourth time. Or asking the wrong question at the wrong time.

I think that's a little daunting as well.
If you don't know how the project, the underlying secret sauce of the project, is actually laid out for you.

Jeremy: [00:04:45] When I think about open source, I think about communication being asynchronous, right? Going through issues, emails, mailing lists, things like that. And you're saying if you're first starting out, actually the best thing would be to find more synchronous communication, find that Discord room or Gitter, or whatever it is, where you can actually have a conversation with someone.

Brian: [00:05:10] Yeah. And to be fair, I'm catering this more towards a beginner open source contributor. If you're experienced, do the regular thing: reference the issue, open up the PR, and there's no need to look for synchronous communication if you know how to solve the problem.

GitHub itself is like 50 million developers worldwide. There's not 50 million developers doing open source. Let's just be clear on that. So there's a big difference between the users on GitHub who are just shipping code, like normal, building websites for their companies or mobile apps or whatever it is, and the people actually contributing the code that's powering all that stuff and powering GitHub as well.

I'm using this term called unintentional gatekeeping. I've been thinking about this a lot and I want to write a blog post on this, because it's around the flow of information. So if I happen to be in the right Slack channel or the right Discord, I have more information than the person who's not there.

Because there's more information flowing through there than there is publicly on issues, because issues are treated as like a statement of work. You're declaring that this is the way it's working, or declaring that it's broken and the next steps to reproduce it, or whatever it is.

And same thing with PRs. Like you're declaring this is the work I've done. These are the next steps. You review it. It's very robotic.
But when you have that relationship that's built in a Slack channel, and like, this is similar if you go to meetups or if you happen to know somebody from college or high school or whatever it is, very similar. That relationship is like a relationship that helps give you that extra edge.

And I think when we talk about things like tech, there's definitely a lot of conversation about diversity, especially today. So when we have diversity of backgrounds, diversity of culture, where people are coming from, you tend to find a lot of the, especially newer, smaller startups have a mono cultured diversity.

And people are well aware, but there are VCs who are telling us it doesn't matter when you first start out, just build a product, get all your friends in the same room. Build only the culture fits and stuff like that. And then let's move on and then we'll figure it out when the company is much bigger and it has much bigger issues.

So I say all this because it's an opportunity for people to become, quote unquote, of the culture of the open source project by just being in a room like this, being in the room and listening. And if you find out that the maintainer or the contributors are not your cup of tea, you can just move away and move to another project, or fork the project and create a new project that's similar but has a different culture.

Like, open source, a lot of these things are MIT licenses, no limitation for you to try things out and maybe copy code and create your own project and see if there's growth in that approach. I don't recommend that. But if that's your approach, definitely try it out.

Jeremy: [00:07:56] Another thing that I often hear when people talk about wanting to get into open source is they have trouble finding someone to mentor them or help them through the process. I wonder what are your thoughts on how we can improve that experience?

Brian: [00:08:13] Yeah, this is more on the maintainer. Individuals who are managing the projects.
I mentioned the onboarding experience. There's obviously opportunity for them to have better onboarding. Have some clear steps of what your expectations are for people to contribute to the project. Not just how to clone it and open a PR, but more like, how do you report an issue?

Is there a template for reporting issues that can guide the person into actually asking the right question, as opposed to a free for all, and then your issues turn into Stack Overflow? GitHub issues is not the best place to ask questions. Like, you could do that, but Stack Overflow, it's an entire platform built for that reason. So how do you kick people from issues to Stack Overflow instead?

We didn't talk about what sort of code, right. But I do a lot of JavaScript and there's this one library called ExpressJS. It just builds quick servers for your websites and web apps, and ExpressJS actually did ship something really recently. I think back in April, they merged in this new, quote unquote, feature or guide, which is called the triager guide. Are you familiar with this at all?

Jeremy: [00:09:16] I'm not.

Brian: [00:09:17] Yeah. So, basically what they're doing is, instead of saying, Hey, we have a lot of issues, go ahead and pick one up and merge it. Or not even merge it, just open the PR and we'll go back and forth for like weeks or months. And then we eventually merge it.

Instead, they're saying if you want an intro into Express, you don't have to know anything about Express. You don't have to use Express. We have this role called the triage role, and it's literally a team in the org that you can join if you just raise your hand, and your job is to triage issues.

So if someone provides an issue and they don't provide reproduction steps, you kick it back and say, Hey, can you provide reproduction steps? So if you don't know how to do it, then the maintainer probably won't know how to do it.
Or maybe they do, but that's a lot of time for them.

So joining like a triage role, or having an opportunity to do that, to label issues, to mark things as ready for review or ready to contribute or good first issue or whatever it is, like that's-- Express has a lot of issues and there's a lot of time spent trying to figure out, is this valid or is this not?

They're actually taking help from the open source community, giving them a badge, which is the triage role in the project. So it shows up on their profile. That's great for prospective employers. Like, Hey, you're in the org, we're using Express, you have access to the maintainers. Maybe we can get our features on there.

That's eye opening, and it's eye opening that I had not seen that at all until very recently. So me personally, for my project, I just launched a triage role. Cause I want people to be able to have an introduction into my project, which is a React app, without needing to know React. Like, all you have to do is know how to answer questions or how to find information.

If you don't, there's other people on the team that can help guide you. And we have a Discord as well that can guide you to actually getting things shipped.

Jeremy: [00:11:02] I've noticed when you watch people's livestreams for coding, people who work on open source projects-- a lot of the time they spend is actually on issue triage. It's on looking through all these GitHub issues, figuring out which ones are valid and which ones are not. And so I think that's an interesting idea of getting people started there so that they get to see the process of open source without necessarily needing to jump straight into the technical details. That's an interesting path to get more involved.

Brian: [00:11:36] Yeah, I like it. I'm curious what Twitch streamers you watch as well, cause I've been trying to collect the list myself. But I like watching people do open source, actually.
I think right now, Jason Lengstorf is doing some open source on his Twitch stream. I'll catch the VOD later.

But, yeah, I think that's actually a good thing you brought up as well. Cause I've been doing some Twitch streaming myself and trying to figure out, what is the purpose of live coding on Twitch? Is it to give webinar type tutorials, like screencasts? Is it to interview, like what we're doing on a podcast, or do that as a Twitch stream? And where I found my niche, or what I like to do on Twitch streams, is actually do exactly that, triage issues.

I'm actually gonna be live streaming later today. And I've been doing some Sketch, some UI building. I'm not a designer. At all. But I took a course last fall and learned how to use Sketch to build some UI templates, to not have to rely on somebody else to actually get me across the goal line for shipping projects.

So I'm going to spend 90 minutes building out some UI, and actually trying Figma for the first time as well. Cause Figma, it's sort of like the GitHub for designers. I'm not sure if that's the summary of their product. I don't work there, but, yeah. So basically I'll be doing Figma.

I'll be building some UIs and some wireframes. Just sort of figure out the next steps, cause I've got a backlog of features I want to add to my project, but I don't know how to tell people that this is how we're going to work on it. Cause I have a whole, I think, 17 contributors at this point. I was going to say team, but they're contributors. We do have teams in squads or whatever. But yeah, so it's easier for me to get everything in my head and the vision for the project out onto like a UI, and let individuals know. But my question I first asked, like, I'm curious, who are you watching on Twitch?

Jeremy: [00:13:19] Yeah, so I just dabble a little bit.
One of the people that I find interesting is Suz Hinton, and sometimes it's like issue triage type stuff, but also sometimes she works on more hardware type projects, in sort of the intersection between working with JavaScript, but actually working with physical hardware and actually wiring stuff up. I don't watch a lot of streams, but the thing that's interesting for me is being able to see someone's thought process. Cause often when you watch a streamer they're talking through their process and what they're thinking, whether it's doing triage or whether it's working on a bug or a feature. You get to see how somebody works in a way that you wouldn't from a screencast. With a screencast or a lecture they're very well prepared. They've been practicing whatever they're trying to teach, whereas in a stream it's more, this is something that they haven't done before, and you're just going through that process with them.

Brian: [00:14:25] Yeah, Suz Hinton, I guess. Is it nope cat or no op cat?

Jeremy: [00:14:30] Yeah, noopkat, I think that's right. Yeah.

Brian: [00:14:32] Yeah. Yeah. Not hearing the word said out loud, I say noop and nope. Noop. I interchange, it's like--

Jeremy: [00:14:38] They're probably all right.

Brian: [00:14:41] Yeah. So I've actually watched Suz, and I like her style and I like the projects that she works on as well. I've yet to catch her live in a while. So I don't know if she stopped streaming. But yeah, similar to yourself, I like seeing the thought process and the people walking through how to build things. Because at the moment a lot of us are working from home. Especially in the state of California. And we're not sitting next to our coworkers anymore and asking those questions.

I think Twitch, in the last three months, has sort of exploded with live coders, and live coders as in the general people who live stream and code at the same time.
Because I think a lot of people just figured it out, like, Hey, I need to have community and I'm not getting it through my team Slack channel. So it's been an interesting transition as far as like a whole other culture that's growing on Twitch at the moment.

Jeremy: [00:15:35] For sure. And then for yourself as the person who is doing the streaming, what do you get out of it and what are you looking for?

Brian: [00:15:43] Yeah, I mean, it just comes down to community. I started the stream mainly because I wanted to have a place to start throwing my ideas out there for the project I'm working on right now, which is Open Sauced, and I started streaming two years ago.

I'd heard of people doing live coding on Twitch, but it wasn't very popular at all. Only a handful of people were doing it, and I even talked to some people at Twitch about it. Some people who were familiar with this space and were knowledgeable. And so I started doing it, but I didn't really have the proper equipment. I had my Mac and I was just streaming from my Mac. And nowadays you gotta have proper lighting and step up the game a bit, and a green screen as well, which I'm sort of sitting in front of at the moment.

I was doing that but I sort of fell off because my daughter was born just a couple months after that. So I took time off from work but also took time off from coding, just to enjoy some paternal leave. But anyway, fast forward to very recently, which is a couple of months ago, I started streaming again, focused on trying to build an open source project and just have a place to write code consistently, cause my day job is developer advocate and I don't have any long standing projects that I work on on a regular basis. A lot of stuff I work on just sort of ships complete and then we don't touch it unless there's something broken. So once I get done with actually shipping something, I just move on to the next thing. So there's nothing that I can feel proud of that I continue to work on.
Or where I do like the latest and greatest things like Vue JS or whatever. I shipped Vue projects, but I move on and they just work until I have to do maintenance on them. So I wanted to have a consistent place where I could just talk about a story of a project that I was working on, which again, I keep mentioning it, is Open Sauced. And since then I've actually built a community of quite a few developers interested in the same problem I'm trying to solve, which is open source onboarding.

Jeremy: [00:17:32] Let's get a little bit into Open Sauced. We were talking about a lot of different troubles that people have when they're trying to get into open source. How is Open Sauced trying to address those?

Brian: [00:17:42] Yeah, you tee'd it up for me earlier with the whole trouble getting into open source, it's onboarding. So we're building a platform to provide structured onboarding for open source projects. So like me connecting with maintainers and projects to add a simple YAML file to the project.

So that way anybody who navigates to the project on Open Sauced can have a good, like, step by step process of who to talk to, how to get involved in the project, where to go for the synchronous communication. The other thing is that tracking projects is something you can do on GitHub, but it's not really built for that. At least today, like hopefully GitHub gets on board and adopts a lot of the features that we're sort of building. But the goal is not to track projects you're already a part of, or even track projects that you're working on. It's more tracking projects that you want to, quote unquote, stalk. So like a GitHub star is a thing, sort of like you hit a star, like, or whatever on a project. And it goes into like a list. And usually most people just forget about that list because you just add a star, like that's about it. It's like sort of Instagram.
You just add a like, and you move on, and that's what GitHub stars have become.

So it's hard to track things that you're interested in based on stars. You could watch projects, but then when you start watching projects it basically becomes a signal-to-noise ratio problem. So then with a very popular project, you don't know what you're looking at. So then you fall off immediately, cause this is too much information or not enough information, one or the other.

Yeah, so basically there's not a lot of tools. So I think a couple of years ago, actually, when we met a couple of years ago, I had just shipped a little side project for myself, which is essentially bookmarking issues. Issues that I wanted to work on, on GitHub. So my day job is at GitHub. I've actually seen internally this feature be built a couple of times, but we sort of backed away from it cause it didn't really solve the right problem. And so as soon as I joined GitHub, I was like, Oh, maybe I won't work on this thing anymore. So I took a bit of a break and then noticed that we weren't shipping it.

So I just picked it up recently, but essentially you could find the project you're interested in. Find the issues that you're interested in and mark them to save. And then manage the note taking. So if you want to take notes on like, this is the maintainer to talk to, this is an example that I can leverage to solve problems or help triage things, even like-- So I was trying to contribute to the GraphiQL open source project. GraphiQL, it's like a little playground to test out GraphQL queries, and trying to contribute to that was actually pretty hard. There was a lot of context I'd missed. And at the time the project itself was transitioning from Facebook's org to the GraphQL Foundation.

But also pretty much everybody who was becoming maintainers on the project were actually transitioning in and owning the project.
So there's a rough transition into that, moving from org to org, but also the maintainers were becoming acclimated to the project as well.

They were all familiar with it, but now they own it. So they were all trying to figure out the best practices and how to clean it up. So at that time, that's when I was trying to contribute. So I was looking at the issues and I'm like, man, I think I could solve this one, but I'm now finding the bug's actually invalid because it seems to be fixed.

So they had a ton of old issues that were just sitting there that were invalid because of the transition, cause they had fixed a bunch of stuff. They were still getting acclimated. No one went through and closed out a bunch of old issues, and to close out a bunch of issues automatically with no reason or question sends the wrong signal to the users. So they just sat. And I tried working on an issue that was invalid. And I discovered that when I commented on the issue with my thought process, the maintainer, or one of the maintainers, came back to me and was like, Hey, this is actually invalid. All that backstory I just told you, he told me right there on the issue. And he's also like, Hey, we also have the Discord, that you should come and chat with us if you want to work on anything else. And I was like, Oh, okay. That's weird. Like in the repo, there was nothing about a Discord. They've since added it. But then I was able to get all that context, the conversation and the questions, like, what is happening with this project? Like, where can I help out? All in Discord. So that's sort of the summary. That story is a summary of what I'm trying to accomplish. No one like myself needs to go into a project and be confused, with skills. Like knowing that they can actually do something.
To fix problems, but they don't know where to start and they don't know how to approach it, because the way I do code at GitHub is different than the way GraphiQL is doing it in their repo.

So those are the high level goals, with some other features that we're trying to work on, but we're always taking ideas. If you go to opensauced.pizza, that's the actual website that's live. github.com/open-sauced, that also exists for anybody who wants to contribute or just ask questions.

Open ideas, open the cool ideas, or bad ideas. It doesn't matter. Open up a discussion. We'd love to hear what problems people are facing in open source.

Jeremy: [00:22:39] Do you envision this being something where the list of projects is curated, or is it more somebody can pick any project on GitHub?

Brian: [00:22:48] There are projects that do curation for open source projects. So GitHub has the Explore feature. You can sign up for a newsletter, you get a bunch of projects every night or every week. I forget what the cadence is. And then Changelog has a nightly, the most popular projects. Here's a list of them, check them out. And then there's also Code Triage, which is another project as well, where you can also get a curated list of Ruby projects or JavaScript projects. We do want to have curation as a feature. Like, this is more, there's a repo that you're using or a library that you're using. Add the library or the repo, just the URL, to Open Sauced, to your dashboard. And this is all login through GitHub. It's using your own data. The backend is all the open source repo. So when you log in, you click the create the repo button, it starts tracking all your notes and all the issues in the repo itself, all open source. So once you've done that, then you have a nice tracking issue to then say, okay, I've looked at this issue, this issue doesn't work, invalid or whatever. I closed this. We also track your contributions as well.
So if you do any sort of PRs they'll show up in the list, but in addition to that, it also tracks your issue contributions. So if you comment on an issue, it shows that in the list as well. So in the eyes of Open Sauced, nontechnical contributions are contributions. That's another thing that I stand on, which is, just because you don't have a green square for that day doesn't mean you didn't do anything.

The platform itself, the answer to your original question-- No curation today. Curation in the future, maybe. It is on the roadmap. It's not actually realized in a plan. But the focus really is around, if I already know the project I want to get involved in, can I just take it to Open Sauced and get all the information I need, digested?

So I can just click the steps one, two, three. Even to hammer down on that onboarding experience, like, there's a project called Babel. They do transpiling for JavaScript for different versions. Like, one of the best things you can do if you want to contribute to Babel is use Babel.

I did mention triaging is another thing you can do, but if you already know how to do it and you're ready to start-- use Babel, use Babel plugins, build a Babel plugin. Like, try going that far and seeing how it actually works under the hood and how you interact with the actual Babel core library.

So that's a recommendation, and that's a recommendation I'm actually trying to work on with that team. Hopefully. I talked to them months ago, but I haven't really picked up the conversation because I wanted to focus on actually getting a dashboard working. But I would like to see, as an onboarding experience, if it's like a Webpack or if it's Babel or something else, as part of my onboarding experience, build a simple tool or clone or a hello world to actually get my brain wrapped around it.

So that way you can confidently go in there and answer questions around, how is this broken for this user?
And how can I fix it in the context of what I know?Jeremy: [00:25:39] So it sounds like coming up with-- I'm not sure what you would call it, almost like an exercise of before you contribute to this project here's a well-defined thing that you can build so you have an idea of how to tackle a real problem.Brian: [00:25:56] Yeah. Yeah. And I think it's easier for some projects than others, But I think that's on the maintainer to say, Hey, here's the contributor guide. But in addition to the contributor guide, here's the actual action items to do to get yourself up to speed. So whether it's build something on your own or just clone one of the example repos and walk through that, those are all possibilities. But it's up to the maintainer, not everybody has to have the same sort of step or guides or not everybody's working on projects on the web. But as long as you have the steps, that's all that matters. So if someone actually knows what the step is to actually get started, that's helpful.And like, we're talking about at the moment we're currently in like a existential crisis or at least America is. And there's a lot of people who have been underserved by their leaders and their community leaders and even the higher level of government. And like you go into cities. And there's a different, this like take like LA County. LA has one of the largest police forces in the United States. LA has one of the worst public school systems in the United States. I know we're talking about a political issue so I won't go too deep in that, but really what it comes down to is like actually information sharing.So if somebody who is in LA County, and working towards life skills or just like growing their career or whatnot. If they have to go to the public school system there they're going to miss out on a lot. Like there's going to be a lot of information they just don't know. And if you happen to be just one County over, which is Orange County, then you're in such a better experience. 
And it's such a much better step up. And I think that it comes down to like, if I want to contribute to open source and I wanted to level up my skill in my career am I getting the right information by contributing to this project? Or even using this project? I think that should be a decision that we should make as far as contributing to projects.If there are not people going in there and contributing and it's that free form, like free flowing information. And there happens to be few people who are managing the project whether it's good or bad that should be eye opening. Cause then you have one or two points of failure, like one person gets sick or has a kid, or takes time off. Then it's down to the one person left over to actually contribute. And there's nobody else in this entire developer community that has knowledge to actually contribute back. This is maybe not popular to talk about but Facebook has a lot of open source projects that we are leveraging entire products our features our companies on. Yeah, but the only people who work on that are technically Facebook employees. So is that really open source? And I know things like React, they do have contributors outside, but the individuals making all decisions are internal Facebook employees.And I know they have the best interest in the open source community. I'm picking on them because they just happen to be the example I have on top of my head, but it didn't seem like information is really flowing in back and forth. And maybe I could be corrected too. I'm happy to be corrected on that. And if there's information on the react community that allows people to onboard a lot easier than I'm all for hearing it.I'll probably do my research after this podcast cause I pulled it out of thin air and picked on them without having any sort of backing statement. But anyway, regardless, there are projects that do not have information flowing that we're supporting or we're leveraging in our projects. 
So whether it's react or not, we should take a hard look on, is there a proper onboarding for anybody to basically jump in there and get things done?Jeremy: [00:29:37] Yeah, I think that's an interesting point in terms of when you have companies, whether it's Facebook or any other company you have people who are being paid to work on these open source projects and ultimately the company that's paying, they want to get something that's of value to their own company. And, whether it's a benefit to the rest of the open source community is it may or may not be front of mind. So I think that that's an interesting sort of, I don't know if you'd call it a problem, but a discussion to have. How much control do companies have over the software we use and is it too much, and on the flip side, it's like, if it's not companies doing it, then it's volunteers doing it. Maybe that's an issue too, right? Like that we're relying on so much software that's being worked on by people for free.Brian: [00:30:36] That's the thing that I like discussing too as well, which is not just a onboarding decentralization of open source, like future. This might be counterintuitive to everything I said before, but when you talk about working for free, there is money being funneled into open source.And again, I apologize for picking on projects that people love and leverage in their, their projects day to day, but look at a project like Webpack and I only pick on them cause I know them and I use them. And, I know the maintainers as well, but you see the project is making half a million dollars a year, just in open collective.So that's the one location that I've looked at it and I can sort of cite today cause I've just looked at it recently. But that actually pays contributors to contract, to help solve and squash bugs. So like when you look at that, that's awesome. Actually, hats off to them and I think we should see more of that. I don't think that's a bad thing just to be clear. 
But what about projects like rollup or parcel or all these other bundlers and packagers and stuff like that? Those are all valuable projects, but they're not getting the same sort of funding. Are we voting by the dollars that we donate as well? And, that's another question that was asked and like, I'm not here to say that's wrong or bad. I'm happy to fund other people doing open source. Cause I think it's more about not about true open source. Like the Richard Stallman, like open source, everything type of deal. Basically what I'm getting at is that we should put our dollars where our mouth is, but also we should put our money in the things that are actually providing value and providing information and providing access to all developers as well. The best thing about this is that you have all these bootcamp grads, all these college students, coming out the gate, leveled up, and ready to ready to ship day one, which is great.There's no month long process of like, Oh, you're only stuck to doing bugs or reading articles. You can actually ship code day one because you have the GitHub account while you're in college, you have access to all open source technology. So if you want to build a quick website or a Minecraft server, whatever it is how to interact, like with stack overflow in forums and answer questions to get your job done.And, that information sharing has exploded the ability for us to grow our developer community. To be able to hire developers and train them quick. And like all bootcamp grads are only two years behind from anybody else because they just need two years of experience to actually get up to speed because the web, the mobile, everything like code changes quickly. Well, not all code not all code's the same, but I can speak for the web. The web moves quickly. 
So you're only two years behind the last person.Jeremy: [00:33:19] Yeah, I think that's really great that more people are getting exposed to the idea of what open source is and having the skills to be able to contribute. And what I also think is interesting is Dan Abramov, who's on the react core team and he's also the creator of Redux.He was talking on a podcast and he was saying he has all these projects that he no longer maintains that he used to work on and he feels a little guilty about it. But he was also saying that if somebody comes in and takes over those projects, some of them, when he was working on them, he was working on them at Facebook.So he was getting paid to work on them. And when you have somebody come in, who's coming in on a volunteer basis. I'm not sure the word he used. It's it's almost like they've been tricked, I guess is what he was saying is. I was working on this thing, getting paid for by my employer and somebody else is coming in and taking that on for free. And so there's this interesting imbalance in terms of the people who are getting paid to work on it and the people who aren't.Brian: [00:34:25] I mean, it's a challenge cause there's a lot of people actually able to-- GraphiQL founder or sorry, maintainer I was talking to, he's getting paid full time to work on GraphiQL. So there isn't a balance. Like he definitely is a knowledge holder, but I think that's a testament of like I spent at least two minutes, dogging Facebook, but also it's a testament to Facebook that they actually value putting open source maintainers there full time, to support the community and also even open sourcing it in general. Like there was a time when I first got into programming where you didn't open source stuff, just because like, it didn't make any sense. I talked to people at Pinterest and they open source somethings like they had a very similar front end framework, which they called Denzel. And like, maybe it was open source. 
I don't remember, but I'd never even heard of it until I talked to someone at Pinterest. Facebook put the time into actually promoting it, putting a conference on and actually getting people to care about it and saying like, Hey, this is actually the way to do it. They get value because then it's easier to get hired at Facebook. Despite the fact that they don't actually use React in their interviews, but it is a leg up, like you're knowledgeable that Facebook is hiring and hiring React developers or JavaScript developers, or we're even doing JavaScript at Facebook.So I would say that the value that the person who's sliding up against, Dan and working with them and getting feedback from him, he's actually getting mentorship directly from Dan. And I would say that's not a monetary value. That experience that relationship that you get is invaluable to be quite honest.And I said, I was talking about the whole LA County and the information sharing, like the more, the information is shared, the more value it's going to be like, the information I've gotten for free from just doing open source, it'd be involved in the community and going to meetups, is invaluable. I would not be here today without it. But if I relied on someone to tell me that or me reading my own blog posts or me figuring that out myself, I would not be here today. So I would say like, yes, it would be nice to have a six figure salary to work on open source every day and triage a bunch of issues. That'd be amazing. But also the fact that Dan's accessible, and makes himself accessible, I think, is what it makes the biggest difference. Dan he is a figurehead for the React community. But the fact that I can go to React. Open up a PR and get Dan or Brian Vaughn or somebody else from the team, to actually review my stuff and give me feedback and tell me what's up and make me feel comfortable. That's a big deal.Jeremy: [00:36:57] For sure. 
The way we learn the quickest is when somebody who knows more than us, or has come before us, is able to teach us. And like you said, I think that can be really invaluable for sure.

Brian: [00:37:07] Yep.

Jeremy: [00:37:08] Another thing I want to talk about is you've been a developer advocate for GitHub and previously for Netlify. And I know in the past you had mentioned you had been a little hesitant to take on the developer advocate role because you were really interested in coding and engineering work.

Have you ever thought about going back into a more engineering focused role or what keeps you in the advocacy role?

Brian: [00:37:37] So what keeps me there is my paycheck. So I'm paid as a senior developer. That was the whole deal for me to go to GitHub. And that's helpful. Also, I love community. I love interacting with the community and having opportunities to be out there. I miss being able to just put on my headphones and just write some code all day, go to lunch, come back, write some code all day and then have maybe a meeting once a week, do a standup once a day. I do miss that. And I do miss that solitary time, but also, I mean, I am a pretty outgoing person and happy to have those conversations. So like there is a balance in doing that. And I think a lot of devrel folks, they come and go. Not like they quit devrel, but they do go work on a project for a while just to get back in the right head space, to be able to actually talk about devrel.

Like one of my biggest fears from doing developer advocacy full time is that, by not working on a project full time on a regular basis, your skills start to not keep up, because as I had mentioned, you're only two years behind the last thing that came out. So if you're not constantly trying new things out and seeing what's out there, then it might be harder to get an engineering job later on.

I've mostly given up on the dream of climbing the engineering ladder. 
And I've only made that decision recently, because I think I get a better feeling around writing code when it's my own code, but also open source. So another reason why I even have open sauce is because it was a project for me to have long standing code, like learn how to write, test, learn how to use hooks in React when everybody was transitioning, like I had a project the ability to leverage and there's no pressure to ship there's no PM pushing me to like, Hey, we should have had this last week. Like I get to basically instead of sit and write code, I watch a lot of tech videos. I do a lot of screencasts. I do a lot of Twitch videos as well, so I have more freedom and less pressure to ship things. Mainly because I don't have a project that needs to be shipped constantly. So I tend to build, and I like the pattern as developer advocacy and I recommend this for anybody, like build a project that you can actually use to leverage your skills and keep that going.So whatever it is, if it's your sourdough bread making app or whatever it is to tell you when to feed your starter, Which I mean, I mentioned that because I actually want to build that, but, anyway, like build something like that so that way you can leverage and talk about on a regular basis and I think most devrel folks, they have that app for them. And I think open sauce is mine. So yeah. I guess the original question was like, yeah, I do have feelings around doing full time engineering, but I'm actually pretty content with my role today and my access to information and leveling up my skillsets.I am not spinning up Kubernetes clusters or even know how to do that. I've done it before, but like, it's going to take me a bit to refigure that out. Like just give me the one that's working on your repo and I'll go from there. And that's my approach to code, it's write it quick, get it done. And maybe write a test.Jeremy: [00:40:41] Yeah, that's cool. 
You may not day to day be diving in really deep on coding, but you keep that for yourself for your own personal project. So that you keep your skills up and plus you get to work on your own terms at your own pace. So you don't lose that joy or the fun of just building things.

Brian: [00:41:04] Yeah, and I mean, to be clear too, I do have projects that I do maintain. It's just that these are projects that I only maintain like twice a year for updates and I'm just basically having Dependabot update them for me. And then every now and then we'll add a new feature or answer a question or something like that, but it's all closed source stuff to make my devrel a lot easier.

Jeremy: [00:41:26] Cool. I know we're running up on time. And I just wanted to ask you one more question. Five years ago you were a new developer. You moved from Florida to the Bay Area. You attended a lot of meetups and community events, and now you're on the other side, you're the one giving talks, giving presentations and talking to new developers. How do you feel like things have changed?

Brian: [00:41:52] Yeah, I mean, it's changed a lot. And I think asking the question now at the time that we're in currently, I envision it's gonna change even more in the next year. But I would say when I first got into programming, jQuery was definitely a legitimate place to put all your JavaScript. CoffeeScript was probably the next level above it. And they were pretty legit things to use. I know a lot of JavaScript developers from 10 years ago probably are cringing at me saying that, but you didn't have to know a whole lot. I think we had a lot of stuff that we just took for granted and we've seen a lot of security vulnerabilities because of that.

So I think now-- I feel like the developer space has just leveled up in being educated in things like security, progressive web apps. So with that being said there's a lot to learn. So you can't be counted on to know everything. 
And that's the other thing about being a developer advocate. It's like no one knows everything.

There's no pressure for me to get back into engineering full time so I can know everything. Cause no one does. No one's perfect at backend orchestrating of servers and spinning them up in containers. And even on the front end doing CDNs, like no one's really an expert on that.

And I think people are really focused on things like the JAMstack where you can just pick and choose and leverage tools and free accounts that get your stuff mostly done. I think that's been a big change as well. And I think I've ridden the wave in that change where I now have an entire project where I have no database.

My database is literally github.com, and could I have done that as easily five years ago? Probably? I roughly did it four years ago, but four years ago as a junior developer. So that goes to show we're transitioning in the way that, if you wanted to build something on top of a third party API or whatever, there's a lot of tools for you to use for free. And I think there's a lot of VCs and a lot of founders and a lot of open source projects that are really looking at the space and looking at this sort of mock regurgitation of developer tools and how anybody has access to anything. And it's been super fascinating to see that.

Jeremy: [00:43:53] Cool. Well, I know you got to get off to a meeting, so I just wanna say thanks for chatting with me today, Brian.

Brian: [00:43:58] Cool. Thanks, Jeremy. Looking forward to seeing what comes out.

Jeremy: [00:44:02] I hope you enjoyed the chat with Brian. If you're interested in getting into open source, you can check out the show notes at softwaresessions.com. I've got links to Brian's project, Open Sauced, and a link to where he does his Twitch streams. The music in this episode is by Crystal Cola. Alright, I'll see you next time.

Jul 1, 2020 • 1h 6min

Senior engineers and baby managers with Lauren Tan

Lauren is a Software Engineer on the React Organization's Web Core team at Facebook and was previously an Engineering Manager at Netflix.

We discuss:
- Being empowered to say "no" as a senior engineer
- Straddling the line between engineer and manager
- The programmer's midlife crisis
- Resisting the urge to use clever abstractions

If you enjoyed this discussion with Lauren, be sure to check out her episode on the Changelog.

Music by Crystal Cola: 12:30 AM / Orion

Related Links
- @sugarpirate_
- Personal Site
- The Engineer / Manager Pendulum
- Should I Become a Tech Lead?
- Does it Spark Joy?
- Changelog Episode - Engineer to Manager and Back Again
- Dan Abramov's Redux twitter post

Transcript

Jeremy: [00:00:00] Hey, this is Jeremy. Usually when software developers are talking career progression, it moves in the direction from being a software engineer to becoming an engineering manager. And today I'm talking to Lauren Tan who moved in the opposite direction. She was an engineering manager at Netflix, and she recently made the decision to become a software engineer at Facebook. We discuss why she made that decision and the differences between being a software engineer, a technical lead and an engineering manager. We also discuss what it means to be a senior software engineer and the ways that you can increase your impact and your influence in a software engineering role. I really enjoyed the conversation with Lauren and I hope you do as well.

Hey Lauren, thanks for joining me today.

Lauren: [00:00:42] Hi Jeremy. It's such a pleasure to be here. Thanks for having me.

Jeremy: [00:00:45] If we look back at 2015, you're moving from Australia to Boston, you're starting your first senior developer role at the DockYard consultancy. How did you get into this position where you decided that I'm going to leave this country I live in and I'm going to start this senior developer role in Boston?

Lauren: [00:01:03] A long time ago. 
I never really planned to leave Australia, let alone come to America. And I kind of trace this back to essentially how I got my career started in technology, which really started as a hobby, creating silly applications back then.

In fact, one of my earliest introductions to programming was through Excel and making elaborate spreadsheets and writing Visual Basic or VBA. It was something that I never really planned to do. The long story short is after college, I started a startup called The Price Geek with one of my classmates.

And at the time I was getting really interested in essentially exploring more of this hobby that I had of programming and potentially exploring the idea of turning that into a career. So the year or two that I worked on that startup was really fun. We learned a lot about product development, about the business side of things, how to manage your money and how to get funding and financing.

That was all really interesting. And near the end of the startup, when we were basically throwing in the towel, I realized that I enjoyed it so much, despite the fact that my degree was in finance and not computer science, that I thought to myself, wow, it would be amazing if I could keep programming as a career.

So I was very fortunate to get a first job in Australia as a software engineer. And I had started writing a bunch of blog posts and started sharing them on Twitter and on Medium. And slowly but surely, I got people reading them. And there was a point where one of the creators of the JavaScript framework that I was writing about got in touch with me to say: Hey, would you be interested in coming to speak at one of our conferences? And of course, I was totally taken aback because first of all, I had never even been to a tech conference at that point, let alone spoken at one. So I had totally no idea what I was doing. But I was convinced by them to apply. 
So I did and I'm very grateful that they did that.And doing all of this essentially, I started to get the attention of some of the people working in America. The CEO of that consultancy DockYard reached out to me and asked if I would be interested in working there. And at the time they were pretty well known in the field of building Ember applications, Ruby on Rails applications. And so I thought it would be pretty interesting to go and work there and learn from some of the people that I really looked up to in that community. And that was the start of my career in Boston. And really it was a difficult decision to move. I think moving anywhere it's difficult but the move from Melbourne to Boston was exceptionally hard because it's a totally different country. It's so far away. My family and my friends would be not even in the same time zone anymore in opposite ends of the world really. So that was particularly difficult. And of course the Boston weather, it's terrible. And part of the reason why I was like, I need to maybe live somewhere else is because of the terrible Boston winter that I experienced in 2015.Jeremy: [00:04:33] That makes a lot of sense how you ended up in California.Lauren: [00:04:36] Right. I was like-- I need to go somewhere warm.Jeremy: [00:04:40] One of the other guests I'm going to have on, Swyx-- he often talks about learning in public, which you were doing with your blog posts which got you noticed. So I think that's good advice for software developers in general that putting yourself out there and sharing knowledge can really make these opportunities come to you.Lauren: [00:05:01] I think it can, but I also want to say that I think that developers that learned their craft during the time that I started I think we were very fortunate in the sense that the web was a bit of a simpler place back then. People would build applications just literally using HTML, CSS, and vanilla JavaScript back then. 
You might just consider using jQuery or Backbone, or MooTools even.A single page application really wasn't the norm. I think today is a very different world because software development-- I don't know if it's gotten more complex, but I think at least in the world of front end development it's gotten much more difficult to just get started.Not saying that you can't build an app with just HTML, CSS, and vanilla JavaScript. But if you want to get a job doing it, then there is a bit of a higher bar I think. So I will say learning in public can be very helpful. But I also don't want to lie and disguise the fact that the environment has changed.Times have changed and things are getting slightly more complicated and complex to build and that just means that there's a bit of a higher hill to climb.Jeremy: [00:06:18] If you are going to make a site, you have so many options you have React, Vue, Ember, Svelte. There's all these different frameworks and do I use Javascript? Do I use TypeScript? It's definitely a lot more-- I don't know how you'd describe it. Intimidating, I guess.Lauren: [00:06:39] It shows the evolution of how front end development has improved in a way. It's like, it's a mindset shift I think in the industry where previously, like 10 years ago it was still okay to just build what people might call enriched documents. Really documents sprinkled with some interactivity. But these days you're often building interactive applications that warrant a framework like react or angular or svelte or Vue. So I think maybe the problems that we're trying to solve have also changed that warrant more complex solutions.Because I don't think the answer is to say like: Oh, we just need to get rid of all the complexity. The complexity exists for a reason. I think if I had advice for someone who was coming up in the industry, I would say, don't get intimidated by all these different technologies. 
And honestly, it probably doesn't really matter in the grand scheme of things which one you pick as long as you pick one and then you don't shut yourself off to learning from the others as well.Because frameworks will come and go, but the knowledge that you acquire from using these frameworks will hopefully stay with you for a long time. And so those are much more transferable than knowing every single detail about the React API or something like that.Jeremy: [00:07:56] Yeah, I think that's good advice. And I also wonder, when you started-- you had experience building applications in things like Rails. There are a number of frameworks where you can build a front end using primarily server side code, not necessarily build a single page application. People starting out, is that still something they should look at or do you think they should jump straight to single page applications?Lauren: [00:08:24] I feel like it depends on your goal and hopefully if you're learning to program, hopefully you also have a project or some kind of motivation for learning those technologies. You should hopefully use the right tool for the job. And if you're building something that really doesn't require a lot of interactivity, then maybe a single page application is overkill, even though it might be beneficial for you to learn.So I think it depends on your goal. If your goal is purely just for educational purposes, then by all means, choose the fanciest technology stack and learn away. But if you're actually trying to get a project going off the ground, I feel like it's probably not that useful to bike shed on like, do we use Svelte or do we write our own thing, or do we just use server side rendered templates?I think those are all fun as technologists to debate and think about. But they're just in my opinion obstacles for actually trying to do what you're trying to set out to do. So that's a bit of a roundabout way of saying that I think it depends on your goals. 
Is it to learn or is it to build something that you know you can get out the door really quickly. And depending on what goal you have, I think my suggestion would be slightly different.I think fortunately, if your goal is mainly just to learn, then any one of those single page application frameworks are great to pick up. My only suggestion would be, again, like not to tie yourself too closely to just one framework, even though one may seem like the incumbent. The one that every company is hiring for and that's fine.Maybe you start there, but don't let that limit you from learning everything else. Because again, like there are a lot of concepts from the different frameworks that often make their way into other frameworks as well.Jeremy: [00:10:17] That kind of reminds me of how when you first started, you were very focused on Ember. And now you're deeply involved in React you don't have to feel like you're tied to just the one you start with.Lauren: [00:10:29] Absolutely. And I think in the tech community there are a lot of these people who say: Oh, you know, don't bother learning a framework. Just learn the fundamentals. In spirit, I agree with that principle. I think that you should learn the fundamentals. But I also agree that actually learning a framework first is not a bad thing. In fact, it helps you.Sometimes you don't need to peel away all the layers of abstraction straight away because that can be very overwhelming. And I think single page applications, there are a lot of tutorials online that you can follow and you can get something working. And that is your basis for starting to then peek under the hood to say: Oh, how actually does that work?Why did I use a component here instead of make a component that does this other thing. I think of it more like the onion of knowledge really. 
I don't know what a good analogy is but like an onion in the sense that there are layers that you peel away and you slowly understand what the frameworks and the languages are doing.And in fact, I see even today, like my career and the stuff that I'm doing is continually peeling the layers. Maybe today I may not be working on writing an application anymore. I might be working on the infrastructure that powers the tools that allow this application to be made.But I wouldn't have been able to have gotten here if I had not been building applications before. So you go deeper and deeper. But you can't go deeper without a strong foundation. So my advice is start with what's comfortable. Start with something that's easy to learn and use that as a foundation for going deeper into the technologies and the areas of programming that you're interested in.Because maybe you'll find that front end development is not for you and maybe you'll realize that actually I prefer back end development. And that's perfectly fine. There's no one path in this industry which is pretty cool. So I would say keep it broad, learn as much as you can, and then follow what interests you and what excites you.Jeremy: [00:12:35] A lot of people when they're learning, it's hard to stay motivated unless you're building something that you can see. I think in that respect if using a framework like React or Phoenix or Rails-- if it's going to get you to the point of being able to see something working that will keep you motivated, keep you moving, then it makes a lot of sense to start there.Lauren: [00:12:57] Yeah. I totally agree. There are a lot of great concepts in these frameworks that will apply in other areas as well. Again, whether you use this framework or that framework or no framework, there are still a lot of programming patterns that you can learn. Which is why if I were to start learning how to code again, I would still start from the same place. 
I would still pick a framework and go with that and then figure out how it works.Jeremy: [00:13:22] I want to take us back to your time at DockYard. I believe your title was senior developer. What do you think made you a senior developer or did you feel like one at the time?Lauren: [00:13:34] I think that's a great question. I think my general viewpoint on this is that I don't think we have agreed upon standards for what we deem senior, and I don't want to be the gatekeeper of what determines someone as a senior engineer. But I certainly didn't feel like a senior developer, at least in my definition of what I thought a senior developer should be at the time.And at the time, I think I had a fairly naive impression of what a senior developer was and my thought was all about the senior developer is essentially the person who is the best programmer. Who knows every single API by heart. Is a genius at all the internals of every library that they use. And they're just technical, technical, technical chops.But interestingly, the more I worked there and the more I interacted with others, people who had the same title. The more I realized that my viewpoint of what makes somebody a senior developer or engineer was totally off. And today I feel like the technical chops are just a small part of the skillset and the tool set of a senior engineer.And if that's the only thing that you're bringing to the table, then-- it's not necessarily a bad thing, but I think you're doing yourself a disservice by not flexing those other muscles. Which is a huge lesson I learned when I took on the role of a manager. But yeah, I definitely didn't feel like I was a senior developer back then.Maybe today I feel more like a senior developer but I think everyone has this different definition. But at least in my definition, I think I feel pretty confident in saying that. Yes, I am actually a senior developer.Jeremy: [00:15:22] So what would you say were the key differences then? 
Because you were saying that it's beyond just the technical aspect, but what are those pieces that make you feel comfortable saying that you're senior now?Lauren: [00:15:36] First of all, it was the mindset shift for myself that I can't pinpoint a specific point in time where it happened. But I certainly recognize it today where I essentially no longer feel the need to rush into writing code. Whereas in the past the moment you get a project, you're like-- all right, I need to write this proof of concept. You just focus on writing code and that's all like for you, your impact is all about the raw output of your keyboard essentially. And that was the wrong mindset to have because what I learned over the years in working on different projects and in different companies is that oftentimes the most impactful things were not actually the result of code.It could be a conversation that you have with your customer or your client to find out the assumption that you had made was incorrect. And if there's something you can ask in a question and you can get an answer to in 30 minutes or you could spend days and weeks building something and then you bring it back and showing it to them and then they tell you why this is not what I wanted.I learned that lesson very painfully because I was one of those people who would just rush into writing code. My viewpoint was if I don't have to talk to anyone then I'm succeeding. But that was totally incorrect and it was a tough lesson to go through, but I think a lesson that I sorely needed. It's definitely affected the way I operate today.I think today I don't shy away from talking to people. 
In fact, I will go out of my way sometimes to have conversations with people even when it's going to disrupt the time that I enjoy of writing code because I know how impactful conversations like that can be especially when you're trying to do things that are maybe not very certain or get more context or even prioritize things.I think another aspect of being a senior developer is knowing when to say yes to things and when to say no to things. I don't think there's a decision tree for when to say no or yes. I think it's very much based on intuition and your understanding of the context and the problems you're trying to solve. And also organizational challenges that may happen, but prioritization is something I feel like we don't often talk about. Because again, if your mindset is all about my impact is based on the pure output of my coding, then you're not going to be in a position where you can say, hold on-- before I go and just jump straight into writing some code, let me actually speak with my manager and challenge the idea of like, wait, hold on. Is this actually the best way to do it? Do we even need to write any code to solve this problem? Maybe it's an organizational problem. If I were to distill it down I think it's the realization that my output is not just code anymore.And I think that for me was the point where I could say to myself: I am a senior engineer. Even though maybe I'll join a company and not immediately be an expert in all of the proprietary tools that they have, which is expected. How can you, how can anyone be an expert when you haven't used those technologies. And there's certainly no expectation I think that any company you join you will be immediately the foremost expert on something that they do within the company. 
So the thing that you bring from position to position or project to project is really those core skills of your understanding of fundamental programming, things that are transferable, but also the organizational chops that are also equally if not more important than those foundational skills. Jeremy: [00:19:34] Earlier in your career like at DockYard did you feel like you had the authority to ask those questions like to challenge your manager, go directly to the customer. Was there anything that was stopping you from doing that then? Lauren: [00:19:34] I think there were a number of challenges for sure. The agency relationship with customers makes it a little bit difficult because as an agency or a software consulting company you are not always in a position to question or challenge the client because at the end of the day, the client's paying you to build something very specific and sure, maybe you can point out the flaws in their plan or deficiencies but ultimately the contract that you signed states that you have to deliver a certain product by the end of a certain timeline. That was probably the systemic challenge there, but I think I also didn't feel empowered to do that anyway, even if that wasn't the case. I think there were a number of challenges, but certainly I was, maybe not having the right examples either too. Like for example, maybe if some of the more senior people in the company were doing that and setting a good example, then I think others would have followed as well. But I don't really feel like we were necessarily in a position to do so. So I think that made it more complex. And I think once I started joining companies like Netflix or Facebook where I currently work, I think that dynamic and the expectations also changed because now I'm in a position where my job is not just to blindly output code just because someone said so, but to be a problem solver. And so I think it's a very different relationship. 
And I think if you are in a software consulting role or a software agency role then a lot of what I'm saying may not necessarily apply because you're not always in a position to go and question your client or customer. Maybe you might find a customer that is awesome enough to let you do that and be receptive of the feedback as well. But that's not often the case, especially on projects where it's like super tight deadline, just deliver something in two to three months. So context is everything. Jeremy: [00:21:42] Yeah. That's an interesting point about how when you're working at an agency, you're ultimately telling the customer: I will do this thing for you. It's written down in a contract, whereas for a more traditional company, it's really dependent on the culture of that company and maybe that's something that when you're interviewing or learning more about that company, you would want to figure out how much agency or how much control will you have as an engineer being in that company. Lauren: [00:22:14] Yeah. I think that's a great point. And it's a question I will often ask when I've done interviews as an interviewee in the past is ask people, especially the engineers on the team that are interviewing me for examples of times where they were empowered to say no to certain things. And I think the way that they answer those questions will tell you a lot about the culture of the company. I often find as a meta point that asking your interviewers for examples is actually always, for me, very helpful in trying to reverse engineer what the culture is really like versus what it's advertised. Because sometimes, and it's not ideal, but there's often a disconnect between what is stated and what really happens. And I think there's no better way to learn that than to ask for examples. Jeremy: [00:23:08] What are some questions you would ask to reverse engineer that? Lauren: [00:23:14] Oh, I have so many. 
The things that come to mind are, like I just mentioned-- can you share an example of a time where you were empowered to say no, or, tell me about a time where you disagreed with a manager and, you were given the autonomy or freedom to go and explore that solution that you were proposing. The things along those lines where it gives the interviewer a chance to show that the culture is truly what they say they claim it is. If I think of more later I'll bring them up as well. Or I can share some thoughts in writing later as well.Jeremy: [00:23:52] When you ask those kinds of questions, reading people's body language or just the way they respond you can infer a lot of information, beyond what's being said.Lauren: [00:24:06] Yeah. Especially in the times where maybe that person is unable to give you an example and instead they'll talk about it more generally, which for me is a bit of a smell. It means that maybe they don't practice what they preach. And I think what you just said is very important.I think the way that the person answers that question tells you a lot, even if you don't come out right and say like: "No, I am not empowered to say no." I think it just tells you a lot, like just how the person answers what they say, what they don't say, is also important. But I mean, I also say that with the somewhat small caveat that there may be a chance that maybe that person just hasn't had that occur to them. Maybe not that they have not been empowered to say no and they just have never had to say no. It's not necessarily a bad mark, I would say. So a lot of judgment applies to how you interpret those answers. But again, they can be so subjective. So I don't know if there's a clear cut way to say like, "Oh, this company is definitively bad or good."And then I think to make things more challenging, depending on the size of the company you're going into a lot of the culture will really depend on the immediate team that you're on. 
And in fact probably the manager that you have is a bigger indicator of the culture than the general company-wide culture. So it really depends. But if you have the hiring manager in your interview panel and you're given time to ask questions, then I would definitely bring lots of really hard questions and really get a sense of whether this manager will be the right person to support you in your career.And, sort of going off on a tangent here, but I think my own experience being a manager has also taught me that there are lots of different kinds of managers and it's not like one is better than the other. I think there is some kind of matching that you have to do on your own as you understand what kind of support you need.For example. If you're still early in your career, maybe you do need a manager who is very technical, who can give you a lot of technical feedback, that can help you grow in your career, at least technically. But maybe once you get more senior in your career, maybe that kind of manager would no longer be as beneficial to your career as for someone who is earlier in their career.And instead, maybe you might look for someone who is more of a sponsor. Someone who goes out and finds really difficult problems and says, "Hey, can you solve this?" And maybe that's what you need in your career. So I think spending the time to introspect and think about if I had the perfect manager, what would they be like?And then go backwards from there and say what questions do I need to ask in order to determine if this manager would be that person? And obviously it's not perfect, you can never really know for sure until you start working with them. But it can at least give you more confidence if you're interviewing at lots of different places and you're trying to make a decision on where you ultimately land.Jeremy: [00:27:17] Yeah. That's an interesting point about finding a manager fit. After DockYard, you moved to Netflix as a senior software engineering lead. 
In that role, what were you looking for in a manager?Lauren: [00:27:34] So I joined Netflix, not as a lead per se. It wasn't an official title, but more so unofficial title after a period of time. I guess LinkedIn probably doesn't capture that very accurately. In terms of what I was looking for in my manager at that point when I had just joined the company, I did think I was at the point when I joined Netflix that I really wasn't necessarily looking so much for just raw technical chops.I wasn't looking for a manager who was a better coder than me. I think the thing that I was excited about was Netflix's culture of freedom and responsibility and context, not control and all those things that they write on their culture memo. And I can actually safely say that pretty much most of it, if not all of it is true and they do apply in practice.So I was very excited about going into a company where the culture was so different than anything I had ever experienced. And I wanted to learn what I had started to think about would define me as being at the level that I wanted to be. You know, like not someone who is only good at programming, but also someone who brings a lot of impact to the team that they work on, whether it's a contribution in the form of code or contribution in the form of an architecture document, or even a comment or some feedback that you've given someone. Because when I first joined, I didn't feel like I was in that position yet.But about a year and a half, I don't remember exactly I did start to feel like I was getting the hang of the culture and also the technology at Netflix where I was then very comfortable when my manager came to me and said: "Hey, would you like to be the lead for the team?" to say yes.I was like, yeah, absolutely. In fact, I already felt like I was operating like a lead. So this was more just a recognition that I was already operating at that capacity. 
So I think that my manager at the time was definitely very supportive and they looked out for opportunities for me. And they were never really prescriptive about certain things like they may have had different opinions from me from time to time but they weren't afraid to say, you know what go try what you think is right. And then let's compare notes and see what turns out to be better. And that was always very encouraging because it creates this almost psychological safety of going and trying different things that people don't necessarily immediately agree with. Like if you can prove that something is better with a prototype or a document or whatever it might be that you're given the autonomy and flexibility in the space to go and explore that and then come back and say, you know what? This was either a good idea, or a bad idea, or inconclusive. But I think that was, for me, something that I really enjoyed about that part of my career. Jeremy: [00:30:36] And it sounds like your manager gave you the opportunity to explore having more influence, having more control over the types of work you were doing, and how you were doing it and at that time in your career, that's really what you were looking for. Lauren: [00:30:55] Yeah. I think I don't recall when, but there have definitely been times where I've had what I would call the programmer's midlife crisis. Where you start questioning what you're doing and the way that you've been doing things and the purpose and starting to look up from the keyboard and like, hold on a minute. I can get this project done, but is this really the right thing to be doing? And I think the more senior you get, the more that urge will come to you and you start thinking more about, Hm. 
The moments where you say to yourself like, hold on a minute, something feels off.And I think the turning point for a lot of people will be when you'll start turning those thoughts into action and instead of just saying, hold on a minute in your mind and then just continuing anyway, you start actually going forward and talking to people and say, hold on, here's something that doesn't sit quite well with me. Let's talk about it. And in fact, I think one of the things I started to recognize once I was operating in that lead capacity even though maybe I didn't have the title just yet, was that actually I was spending less time coding.And initially it felt kind of awkward. I was like, why am I in all these meetings? Why am I feeling like my output has dropped a lot? And it was true. If the only output that you're measuring is my code, then it definitely dropped quite a lot. But in terms of the impact I was having on the team and the projects that I was on I think definitely outweighed that. It wasn't a net loss because oftentimes when you have someone who's operating in a lead capacity, it means that in a way, they're giving away those problems that are maybe more difficult to solve. And allowing others to learn about them and, not hogging all the difficult pieces to themselves which sometimes the tech leads might do instead of giving opportunities to others to grow, which is actually a responsibility for a tech lead. So I think going back to your question of what did I need from my manager at the time, I think it was definitely being put more so than a lot of other things it was being put in an environment where I could really flex those nontechnical skills and understand, and almost in a way, create the environment you know? Like if a manager is like a gardener, then creating the right conditions in the environment so that I could not just thrive, but also evolve and grow and, broaden my branches. 
It's a weird analogy. Jeremy: [00:33:38] And, we've stepped around it but I think the title of someone being a lead to a lot of people is a little fuzzy. Some people think a lead is the same thing as a manager. And it sounds like what you're saying is in your case, a lead was someone who is able to ask questions to figure out what should actually be built. They're able to decide who should work on these things after you've decided what needs to get built, and we haven't mentioned this, but potentially help the people who are building these things if they get stuck. Would you say those are the three primary things that a lead does? Lauren: [00:34:21] Um, at a high level, I think that's pretty accurate. To be a bit more granular, I would say it also depends on the kind of tech lead that you want to be, or, or maybe another way to put this might be the tech lead that your team needs. Because the truth is, at least from my perspective, just like managers have different archetypes. I would also say that tech leads had to have different archetypes and it really just depends on the kind of project that you're working on. I would say though, as a minimum for me, at least from the technical side of things, yes. Even though I wasn't writing a lot of code anymore as a lead, I was still reviewing a lot of code. In fact, I think I would probably say I reviewed more than I wrote code. I think that was also part of the dawning realization that-- Hey, you know what? You can contribute in forms that aren't just you writing the code and then slowly universe expanding of like, Oh, if I step back just a little bit, I start to see the forest of what impact as an engineer is. And it was the realization that I had been only just focusing on this technical tree and not growing all these other skills that are also really important. 
So I think tech leads are typically the people who are seen as the best engineer who gets pushed into the lead position.But I would say that tech leads are interesting in the sense that you're not a manager, you typically don't have reports, and you don't have any authority so to speak, over anyone. So all you really have typically, I feel is the influence that you've earned throughout your career in that company.And that kind of social capital, if you will, that people will start to listen to you because you've been around, you know your way around, and you've proven that you can handle large projects and things like that and grow other engineers. So I think for me being a tech lead in some ways can be actually more challenging in some ways than a manager because it's blurring the lines I guess. I think as a tech lead you're in this awkward gray zone between engineer and manager and you're not quite one. You're not quite a manager. You're obviously still an engineer, but you're in a position of greater influence and greater not really authority, but more respect typically is given to you.And so you're in this awkward position. Where it again, it comes down to what your team needs. And maybe like for example, if I was to join a new team and I was the tech lead for that team and if it was a team of one or two people, then obviously the expectations and the way I would do my job would be very different from me joining as a tech lead on a team of 12 engineers. It's a very different set of variables that you have to learn how to tweak. And again, it just depends on the makeup of the team as well. So like if I joined a team of 12 very junior engineers then also my approach would be very different versus if I joined a team of 12 extremely senior engineers. It all is very fuzzy. I don't think there's one, there's no one way to do your job as a tech lead, or as an engineer, as a manager. 
And maybe it sounds like a bit of a cop out answer, but I do think that a lot of questions can be distilled down to the age old answer it depends. Obviously just saying it depends and nothing else is a bit of a cop out. But I can say that there are different circumstances and some may require as a tech lead more involvement from you at the architecture level. Maybe some less or maybe some where you, instead of worrying too much about architecture, maybe the problems are more around organizational challenges or headcount or constraints I would imagine things like that the tech lead should be doing as well. Like that example I shared with you of joining a team and being one of two engineers. Maybe one of the first things of my job would be to point out to leadership that-- Hey, I've just joined this project and it's clearly very ambitious, but there's only two of us and the deadline-- that the timeline that we're gonna work on is way too unrealistic.So I actually need to campaign my manager to say, this is why we need another two engineers or one engineer on this project. And so that's why I think it's a bit tricky. Cause it really depends on the team that you're on.Jeremy: [00:39:05] That's a really good point in terms of the size and the experience and the actual project that you're tackling. I think that's why people have so much trouble understanding what it is a tech lead does. Because from what you're describing, it's a completely different job from person to person.Lauren: [00:39:24] Yep. Yeah. It's very context dependent because you're straddling the line between manager and engineering and individual contributor. And so you have to sometimes wear the manager hat even though you're not a manager, and sometimes you have to wear the engineer hat.But I think knowing when to switch hats is really important. 
And if the expectation that maybe someone else has set for you is that you are wearing the technical hat most of the time, then that's the expectations that you work towards. But I think for most companies, especially the bigger ones, I think there is an expectation that you also wear the project management hat, the organizational hat where you go and raise problems like that as well.Jeremy: [00:40:10] So we've been talking about tech leads and managers and how the role of a tech lead is so fuzzy. After your role as a lead at Netflix, you moved on to becoming a manager.What would you say are the key differences between being a manager and a tech lead? How did your job change, how did your role change?Lauren: [00:40:33] So I do want to make the distinction that even though I said earlier that as a tech lead, you're in-between a manager and an individual contributor. I do want to say there are a lot of manager specific things that tech leads don't get exposed to.It was definitely a big jump even going from tech lead to manager, let alone like as a non tech lead engineer to manager. And I think a lot of those challenges were in the forms of essentially problems which I never really thought about.Like, maybe I would have said-- we need more people on this project. But I wouldn't have then gone on to say, all right, I need to spend the next three months looking for the perfect hire to join the team. 
Because that was the job of the manager, to really think about the people and the conditions of the overall team that they're supporting.Whereas as a tech lead your sphere is slightly more constrained to maybe a project or two purely more in terms of the results of said project, whereas I think as a manager, the expectations become more at the organizational level and your success is really determined by the work that your team ultimately does or doesn't do.Maybe it sounds kind of subtle when I describe it that way, but I will say that it was definitely a very different job when I went on to become a manager. The first of which is I think a very false conclusion that I may have harbored a long time ago and I think a lot of people share the same sentiment as well. Which I want to go on record and say I kind of disagree with that. And that view is essentially that becoming a manager is a promotion. In some ways, maybe it is a promotion, like maybe financially you might get paid more and you might have more opportunities to have certain kinds of impact depending on the company that you're in.But I will say for the most part having that mindset that management is a promotion is not the right one to have because I think it disguises the fact that when you go from engineering to manager you're basically going from very senior engineer who was very good at their job to going to become a baby manager who knows nothing about their job.So it is very different. They're like two definitely distinct tracks and all the skills that you've used. Like, 99% of them, you've used to be successful as an engineer are probably not going to translate that well to being a manager. 
Like you're not going to be expected to write as much code or to even do code reviews instead your role is really more to ensure that you have the right people on your team, and also the right environment where those people can thrive and do their best work and achieve the goals that have been set out for your team and even shape those goals that your team should be working on. So it was definitely a very different career change.And I think even though I had expectations going in, that it'd be very different. It was a totally different experience doing it, if that makes sense. Like I was expecting one thing. I knew it would be different, but I didn't realize it'd be that different. And I remember as a manager, I was spending so many hours just looking through LinkedIn or reaching out to people on Twitter, and asking them like, "Hey, would you want to come work on my team?"Because as a manager, your biggest lever for impact is getting to pick who is in the team and it may sound like a very-- it may sound simple, right? Like you're just hiring, but I would say it's actually a very very high leverage activity. If you find that person who fills out a gap in your team where maybe there's a certain skill set or a certain technology, skill or organizational skill that your team doesn't have that you want to have, and you're able to fill that position and not just do that, but keep them there as well for-- well not keep them there but, to create an environment where that person is happy staying for awhile, then you've really done a great job because now you have a strong, solid dream team that has the capabilities of doing awesome work that you need, that you want to achieve the vision that you have for your team. 
And then you also have to balance that with the often difficult work of what is often called talent retention, but I don't really like that term because I don't think so much about retaining people because that sounds to me like they're constantly trying to escape and then you're just trying to hold them back.I think it's more about creating an environment where people are attracted and they want to stay not because they're handcuffed. But because they choose to stay, because you're a great manager and the team is good. That the work is impactful. If anyone listening is also going through that transition or you've just become a manager, I'll say that I think for me the biggest challenge I had to overcome in the initial couple of months of making this transition was really understanding that it was a completely different job and then changing all the things that I did for that new reality and not trying to go back to the things and the skills and activities that I had been relying on as a tech lead.Jeremy: [00:46:10] You gave a few examples of what a manager does that wouldn't go to a tech lead like hiring and if your team needs more resources, making that pitch to get more money, and also creating an environment where people want to stay there. Are there any other specific examples for the types of things that would only be a manager's role versus something that a tech lead would do?Lauren: [00:46:36] Yeah. So this is not going to be an exhaustive list for sure, but I can at least point out the things that are not immediately obvious, not unless someone explicitly says it. But I will say actually, depending on the company, I think one of the biggest jobs of a manager is actually the flip side of hiring, which is firing.And that's a really tough one. You never go into a team or hire someone or work with someone on your team expecting that you'll need to one day fire them. 
But as a manager, that is the thing that you think about on a constant basis, not because you just like firing people just to fire people or again, I don't know, maybe at certain companies where they do things like stack ranking and there's an expectation that, yeah, I don't know. I've been fortunate not having worked in companies like that. But if you are a manager, I would say that you're often the person who does a lot of unglamorous things like that. To me at least, it seems unglamorous to me. The hard work of recruiting and hiring and speaking to candidates and selling them on your team.And if you do write code it's most likely going to be the very boring parts of the code base. Like adding tests or writing a little script that does a certain thing. So you're not going to be working on those things that you thought were exciting or the things that may have even attracted you to software engineering in the first place.So very different job. And, there is going to be so many other things that you may not be aware that managers have to do. Like, I don't really like how people will phrase this, but a lot of managers will say like they provide air cover or they shield their team from shit.I think there's some truth to that in the spirit of what that means. But I think there are obviously different ways of approaching it. And personally for me, instead of thinking it like that, like I'm shielding my team from shit, I think it's more about maybe there is shit coming towards our way. But my job as a manager is to also not make that shit before it comes to my team, if that even makes sense. And so that often means talking to people in different positions, different parts of the company, people who are higher, like a VP or a director and convincing them that this path isn't the right one.And the truth is that a lot of individual contributors won't see that. Not because they are ignorant but because if your manager is doing a good job of that, you just don't see it. 
And sometimes managers can be a bit flippant, I think. And to say that, Oh yeah, I shield all the shit, so you don't have to, which I think is, again, like in spirit it, it captures the outcome of what, what that is. But I think it also doesn't quite accurately portray how the manager goes about doing it cause there are many different ways that yes, maybe you can just shield and keep your team unaware of everything, but that's not necessarily a great way to run your team as well because your reports will not necessarily trust you very much if you're always being very dishonest and like not telling them the truth, because of your desire to shield them from shit.

Instead maybe the better approach is to let people know that. You know what, there are rocky things that are coming. They're things that the company that we work at doesn't do so well, and that's okay. We'll figure it out. But not to completely hide it, I think. I think that's the part which I am not a big fan of.

It's a bit of a cop out to me if the manager just keeps things from their team because of that mindset or because of that belief that by doing so, they are helping your team, when in fact, I think it's actually making the team worse off.

Jeremy: [00:50:25] That's an interesting perspective because ultimately, if you are shielding your team and the things that they're being shielded from are just shifting elsewhere in the organization, that's not really solving the root problem.

Lauren: [00:50:39] Right. Yeah, and I think it can also be very powerful for managers to point out areas which need help. And then instead of feeling like the manager has to solve all of those problems I think we-- we talk a lot about the management parts of the job, but not the leadership parts of the job.

Leadership is really more about influence and the way you conduct yourself and how others perceive your behavior versus management, which is more like-- I think a role that you play.
So things like hiring and firing are obviously the role of a manager.

But getting people excited about a vision and getting people to do certain things even though you're not explicitly bending their arm to do it is a part of the job that is not often talked about or even taught-- like how do you do that? It's not something that you can just read a book and do. It's something that over time and trial and error and maybe some intuition you build that up over time.

A realization that I had over the past two and a half years when I was a manager was that leadership is not solely within the domain of the manager. It sounds silly to say this, I had to be a manager in order to realize this. But it wasn't as evident to me until I became a manager that, Hey, hold on. There's a lot of things that I'm doing as a manager that I didn't have to be a manager to do these. So when I started to think about the different parts of the job that I was doing, I started to realize that, hold on, there's the parts that I like, right? A lot of the leadership side of things I really enjoyed.

And then the other parts which I maybe didn't enjoy as much. And I realized like, Oh, hold on. Actually I don't necessarily have to be a manager to practice these skills. That was actually the realization that I needed to maybe go back to be an engineer again.

But I certainly don't regret the time I spent as a manager cause I was exposed to so many different kinds of problems that I'd never ever had to face as an engineer. Hiring, having to let people go. Dealing with the sometimes unreasonable demands of different organizations that we were working with and balancing that all.

And another thing that maybe managers don't talk about is oftentimes people will come to you with problems that you can't solve. And these are maybe personal problems, emotional problems.
And if you're a very empathetic person, then I think the job of a manager gets really difficult because people come to you with lots of problems that you can't solve. And if you're an engineer, you probably want to try and solve all the problems and it can be very frustrating.

I guess I'll sum it up all by saying that being a manager is a totally different job from being an engineer, even a tech lead. It's totally different. It's not a promotion. I don't consider it a promotion. And I think if anybody chooses to do it, I think you learn a lot and hopefully you enjoy that transition as well.

But personally for me, I didn't enjoy it. That doesn't make being a manager bad. It just means that it wasn't for me.

Jeremy: [00:54:00] And now that you're at Facebook as a software engineer again, what's the thing that you enjoy most about being a software engineer as opposed to being a manager?

Lauren: [00:54:11] I think when it came down to it, it was really a reflection of why was I in the tech industry in the first place. And I think the simple way to put it is that I mentioned earlier at the start of this conversation that programming started out as a hobby for me. And it was something that I would spend all my free time just working on.

I would have these shower thoughts essentially of programming. And I realized I've been so fortunate that I was able to turn something that was purely a hobby into a full time career. And when I was reflecting at the end of 2019 about the next couple of years of my career, I did really start to think that there was a lot of programming in being an engineer that I really missed.

And also me realizing that other part, which I mentioned earlier, which is that there's a lot of things I was doing as a manager that they were not things that only managers could do.
But you may have to become a manager to learn the skills, which just sounds kind of weird.

But I think it was that realization that Hey, one, I can go back to do what I love, which is programming. And two, I can also bring back all these lessons that I've learned as a manager and basically supercharge myself as an engineer and be so much more impactful. Not because I'm going to write all the code and solve all the problems, but because I know how to inspire. I know how to influence. I know how to communicate. I know how to get things done and get other people to help out with those problems. And I think that was for me, the realization that I could have my cake and eat it too, I guess. And I think that I'm very fortunate in that.

I think at Facebook they think very heavily about career paths as an engineer, as a manager. And I think the company does a pretty good job at stating that one is not superior to the other. In fact, there's more or less an identical leveling track for engineers and managers and also very similar in terms of compensation.

So there's not a penalty for you if you become an engineer. It's not like you're going back. It's seen more as you are just hopping over to the other parallel track. And one of the blog posts that I want to call out here that really helped me think about my career this way is a blog post by Charity Majors called the engineer/manager pendulum.

And she does an amazing job of articulating this hidden career path of jumping between the engineering track and a management track every couple of years. And she does a way better job than I think I can to explain why it's an interesting career path to take, but it certainly inspired me to start thinking more critically about what I wanted out of my job.

And then finally mustering the courage to go and interview again because I don't actually know anyone that I can think of who actually enjoys interviewing. I don't. I think it's one of those evils that we put up with.
So there is some courage you have to muster up often just to interview and go look elsewhere. But I think her blog posts really spurred me to take action on it.

Jeremy: [00:57:34] The interviewing problem is-- that sounds like maybe a job for a manager?

Lauren: [00:57:43] Yes. I think, yes, it is part of the manager's job, but I think as engineers, we can also do a lot to at least point out the problems.

Maybe we're not the ones to fix them, but we can at least say, Hey, this interview panel that I'm on-- I've looked at the other interviewers on the panel and you can call out things that aren't quite right, that don't sit right to you.

Maybe the panel is very undiverse or maybe the interview goes on for like eight whole hours. There are things like that you can still do to influence that process and even influence the questions that get asked. I haven't been a part of an interview panel here yet, but if I understand it correctly, I think that engineers have a lot of influence over the kinds of questions that set a standard for the different interviews that we have.

And so that's one way as well, to have a lot of impact and influence over the interview process. And make sure that the questions that we're asking are relevant, realistic, but also ensures that we keep that standard of engineering quality that we want, which is always a fine balance to strike.

I could probably talk to you for another whole two hours just on the topic of interviewing, so I won't go into that right now.

Jeremy: [00:59:01] Yeah. It's definitely something that everybody has an opinion and everybody agrees it needs to be better, but for some reason we as an industry just haven't gotten there yet.

Lauren: [00:59:12] I think my short answer to this is that I don't think there is a perfect solution. Which is why we haven't as an industry adopted something that's better.
It's a process that is very lossy and there's just no way to really tell in a short frame of time what a person will be like working on a job.

And there are many ways to solve it. None of which I think is better than another. So that's all I'll say about that topic. Don't get me started.

Jeremy: [00:59:43] Yeah, I think that anybody listening to this maybe the big takeaway would be regardless of what your role is, even if you are just a regular software engineer, look for what are the places where you can ask questions, whether that's what type of work you're doing, whether the technology you're using is right, or do you have the right people to do it?

What are ways that you can really improve your team situation without necessarily having to change titles.

Lauren: [01:00:16] Yeah. I think that's a great way to put it. The way I would try to summarize my learnings over the years is really that it comes down to ultimately for me, it all stems from this root. And I think the root of this is all of that. Realizing that your job is not to write code. Code is merely a side effect of your job.

I think your job is really to solve problems and there are many ways to solve problems and I think realizing that is, to me, step zero in terms of growing more senior in your career. And, the other thing I'll say to that is also as you get more senior things will get more ambiguous. And you have to learn how to deal with that uncertainty and ambiguity and accepting that sometimes there isn't an answer.

And that's okay. I think those are the two big lessons that I've learned.

Jeremy: [01:01:10] That's interesting because I think as engineers, a lot of people feel like as they learn more things will get less ambiguous. But it sounds like as you progressed in your career, things are actually getting more ambiguous and that's how you know you're progressing.

Lauren: [01:01:24] Exactly. Yeah. I think even in the code I write. Cause like I think you can see it sometimes.
I've seen this in myself as well. When you're not a junior engineer anymore but you're not really senior and you kind of know enough to be dangerous and you start dreaming up these-- I'm very guilty of this in my past of writing these weird abstractions that you think will save you a lot of time. But when you look at them in a couple of months, you realize this was totally the wrong abstraction to have picked and it's actually slowing the team down.

That is often because again, you're trying to feel your way around and explore and learn, and write better code in your mind. But I find myself these days, it's like trying to write the simplest possible code and delaying the point of abstraction as much as possible and writing a lot of comments about all right-- This could be better, but I'm not gonna make it more abstract right now because this is just a one off case and we don't actually know for sure if it's going to happen again. So that's, I think, the part of recognizing the ambiguity of things. And there are a lot of things that have subtly changed about my behavior.

I used to be all about talking about best practices and talking about, Oh, this is an anti pattern, or, so and so said, we shouldn't do it this way. Or you try to read the tea leaves of someone's tweets into Oh yeah, Dan Abramov says, don't do this. So this is now law and we cannot break this law. But I think a painful but necessary part of growth I think is realizing that nothing is really an absolute, only the Sith deals in absolutes and being comfortable with that. Like, again, I was saying like there's often maybe no best answer, but picking the right tool for the job, the right solution, it takes a lot of patience and communication together with your team.

Jeremy: [01:03:20] For sure. Yeah. Dan Abramov's example is actually really funny cause he is the creator of Redux, right?
And he has this tweet where somebody is describing like how somebody put Redux into their application because Dan said to do it and he replies to the tweet and says this is the reason I'm going to hell.

Lauren: [01:03:41] Yup. Yeah. Dan Abramov is a really smart person and someone I really enjoy working with. I think it's all part of our growth of realizing that the things that maybe we all believed were best practices a year ago are probably now anti-patterns, which is why I just shy away from saying this is the best practice and we must do it this way. And taking a more case by case basis to things. And again, this all ties back to being comfortable with ambiguity, right? Because if you don't have these laws, so to speak. Then you're introducing a lot of ambiguity in your code because now maybe people have a lot of uncertainty about, Oh, do I use this in this situation or that?

And instead of you saying, Oh, you should always use this thing. You're now saying, right, let's evaluate it on a case by case basis. And that's okay. Maybe it's going to slow it down a little bit but in the long run, it actually makes us faster and more resilient to change. Especially if product requirements change and suddenly all the abstractions that you dreamed up are now totally irrelevant.

It's a very interesting industry to be in. I think software it's, it's changing all the time. The way we build software also has to reflect that. And instead of trying to build these very rigid architectures and constructs-- which maybe in certain scenarios are warranted.

Like if you're writing code that will never be updated for the next 30 years, then it probably makes sense to get it right from day one. But if it's something that's constantly being improved and evolved, then maybe you don't, you don't jump into pouring the concrete where the concrete doesn't belong just yet.

Jeremy: [01:05:22] Yeah, I think that's a good note to end it on.
Where can people follow you?

Lauren: [01:05:27] The best place to follow me will be on Twitter. My handle is sugarpirate_. You can also follow me on LinkedIn, or add me on Facebook but Twitter is probably your best bet if you're trying to get ahold of me.

Jeremy: [01:05:42] Lauren, thank you so much for chatting with me today.

Lauren: [01:05:44] Yes. And thank you Jeremy. It was really fun talking to you. And see ya everyone!

Jeremy: [01:05:47] That's it for my chat with Lauren. You can get show notes and a transcript for this episode at softwaresessions.com. And if you enjoyed the show, let someone else know about it. The music in this episode is by Crystal Cola. See you next time.

Jun 17, 2020 • 1h 31min

Learning in Public with Swyx

Swyx, a senior developer advocate at AWS and author of The Coding Career Playbook, discusses getting help without a big following, remixing and summarizing others' work, creating Friendcatchers, betting on technologies, and his new book.
