
Environment Variables


May 15, 2025 • 50min

Why You Need Hardware Standards for Green Software

Chris Adams is joined by Zachary Smith and My Truong, both members of the Hardware Standards Working Group at the GSF. They dive into the challenges of improving hardware efficiency in data centers, the importance of standardization, and how emerging technologies like two-phase liquid cooling systems can reduce emissions, improve energy reuse, and even support power grid stability. They also discuss grid operation and the potential of software-hardware coordination to drastically cut infrastructure impact.

Learn more about our people:
Chris Adams: LinkedIn | GitHub | Website
Zachary Smith: LinkedIn | Website
My Truong: LinkedIn | Website

Find out more about the GSF:
The Green Software Foundation Website
Sign up to the Green Software Foundation Newsletter

Resources:
Hardware Standards Working Group | GSF [06:19]
SSIA / Open19 V2 Specification [12:56]
Enabling 1 MW IT racks and liquid cooling at OCP EMEA Summit | Google Cloud Blog [19:14]
Project Mycelium Wiki | GSF [24:06]
Green Software Foundation | Mycelium workshop
EcoViser | Weatherford International [43:04]
Cooling Environments » Open Compute Project [43:58]
Rack & Power » Open Compute Project
Sustainability » Open Compute Project
7x24 Exchange [44:58]
OpenBMC [45:25]

If you enjoyed this episode then please either:
Follow, rate, and review on Apple Podcasts
Follow and rate on Spotify
Watch our videos on The Green Software Foundation YouTube Channel!
Connect with us on Twitter, Github and LinkedIn!

TRANSCRIPT BELOW:

Zachary Smith: We've successfully made data centers into cloud computing over the past 20 or 25 years, where most people who use and consume data centers never actually see them or touch them. And so it's out of sight, out of mind in terms of the impacts of the latest and greatest hardware or refresh. What happens to a 2-year-old Nvidia server when it goes to die? Does anybody really know?

Chris Adams: Hello, and welcome to Environment Variables, brought to you by the Green Software Foundation.
In each episode, we discuss the latest news and events surrounding green software. On our show, you can expect candid conversations with top experts in their field who have a passion for how to reduce the greenhouse gas emissions of software. I'm your host, Chris Adams.

Hello and welcome to Environment Variables, the podcast where we explore the latest in sustainable software development. I'm your host, Chris Adams. Since this podcast started in 2022, we've spoken a lot about green software: how to make code more efficient so it consumes fewer resources or runs on a wider range of hardware to avoid needless hardware upgrades, and so on. We've also covered how to deploy services into data centers where energy is the cleanest, or even when energy is the cleanest, by timing compute jobs to coincide with an abundance of clean energy on the grid. However, for many of these interventions to work, they rely on the next layer down from software, the hardware layer, to play along. And for that to work at scale, you really need standards. Earlier this year, the SSIA, the Sustainable and Scalable Infrastructure Alliance, joined the Green Software Foundation. So now there's a Hardware Standards Working Group, or HSWG, within the Green Software Foundation too. Today we're joined by two leaders in the field who are shaping the future of sustainable software. So, oops, sustainable hardware. We've got Zachary Smith, formerly of Packet and Equinix, and My Truong from ZutaCore. We'll be discussing hardware efficiency, how it fits into the bigger sustainability picture, the role of the Open19 standard, and the challenges and opportunities of making data centers greener. So let's get started. So, Zachary Smith, you are alphabetically ahead of My Truong, Mr. Truong. So can I give you the floor first to introduce yourself and tell a little bit about yourself for the listeners?

Zachary Smith: Sure. Thanks so much, Chris. It's a pleasure being here and getting to work with My on this podcast.
As you mentioned, my name's Zachary Smith. I've been an entrepreneur, primarily in cloud computing, for, I guess, about 25 years now. I went to Juilliard. I studied music and ended up figuring that wasn't gonna pay my rent here in New York City, and in the early two thousands joined a Linux-based hosting company. That really gave me just this full-stack view on having to put together hardware. We had to build our own computers, ran data center space, oftentimes helped build some of the data centers, connect them all with networks, travel all around the world setting that up for our customers. And so I feel really fortunate, because I got to touch kind of all layers of the stack. My career evolved further into hardware. It just became a true passion about where we could connect software and hardware together through automation, through accessible interfaces, and other kinds of standardized protocols, and led me to start a company called Packet, where we did that across different architectures, x86 and ARM, which was really coming to the data center in the 2014/15 timeframe. That business was acquired by Equinix, one of the world's largest data center operators. And at that point we really had a different viewpoint on how we could impact scale, with the sustainability groups within Equinix as one of the largest green power purchasers in the world, and start thinking more fundamentally about how we use hardware within data centers, how data centers could speak more or be accessible to software users, which, as we'll unpack in this conversation, are pretty disparate types of users and don't often get to communicate in good ways. So, I've had the pleasure of being at operating companies. I now invest primarily in businesses around the use of data centers and technology, as well as circular models to improve efficiency and the sustainability of products.

Chris Adams: Cool. Thank you, Zachary.
And, My, can I give you the floor as well to introduce yourself from what looks like your spaceship in California?

My Truong: Thanks. Thanks, Chris. Yes. So, pleasure being here as well. Yeah, My Truong, I'm the CTO at ZutaCore, a small two-phase liquid cooling organization, very focused on bringing sustainable liquid cooling to the marketplace. Was very fortunate to cross over with Zach at Packet and Equinix, and have since taken my journey in a slightly different direction, to liquid cooling. Super excited to join here. Unfortunately, I'm not a musician by classical training; I am a double E, an electrical engineer, by training. I'm joining here from California, on the west coast, in the Bay Area.

Chris Adams: Cool. Thank you for that, My. Alright then. So, my name is Chris. If you're new to this podcast, I work in the Green Web Foundation, which is a small Dutch nonprofit focused on an entirely fossil-free internet by 2030. And I'm also the co-chair of the Policy Working Group within the Green Software Foundation. Everything that we talk about, we'll do our best to share links to in the show notes. And if there's any particular thing you heard us talking about that you're really interested in that isn't in the show notes, please do get in touch with us, because we want to help you in your quest to learn more about green software, and now green hardware. Alright then, looks like you folks are sitting comfortably. Shall we start?

Zachary Smith: Let's do it.

Chris Adams: All right then. Cool. Okay. To start things off, Zachary, I'll put this one to you first.
Can you just give our listeners an overview of what a hardware standards working group actually does, and why having standards for things like data centers actually helps? I mean, you can assume that our listeners might know that there are web standards that make websites more accessible and easier to run on different devices, so there's a sustainability angle there, but a lot of our listeners might not know that much about data centers and might not know where standards would be helpful. So maybe you can start with a concrete case of where this is actually useful in helping make any kind of change to the sustainability properties of maybe a data center or a facility.

Zachary Smith: Yeah. That's great. Well, let me give my viewpoint on hardware standards and why they're so critical. We're really fortunate, actually, to enjoy a significant amount of standardization in consumer products, I would say. There are working groups, things like the USB Alliance, that have really provided, just in recent times, for example, standardization, whether that's through market forces or regulation, around something like USB-C, right, which allowed manufacturers and accessories and cables and consumers to not have extra devices, or throw away good devices, because they didn't have the right cable to match the port. Right? And so beyond this interoperability aspect to make these products work better across an intricate supply chain and ecosystem, they also could provide real sustainability benefits in terms of just reuse. Okay. Data centers are an amazing thing, being that we can unpack some of the complexities related to the supply chain. These are incredibly complex buildings full of very highly engineered systems that are changing at a relatively rapid pace. But the real issue, from my standpoint, is that we've successfully made data centers into cloud computing over the past 20 or 25 years, where most people who use and consume data centers never actually see them or touch them.
And so it's out of sight, out of mind in terms of the impacts of the latest and greatest hardware or refresh. What happens to a 2-year-old Nvidia server when it goes to die? Does anybody really know? You kind of know in your home or with your consumer electronics, where you have this real waste problem, so then you have to deal with it. You know not to put lithium-ion batteries in the trash, so you find the place to put them. But you know, when it's the internet and it's so far away, it's a little bit hazy, I think, for most people to understand the kind of impact of hardware and the related technology, as well as what happens to it. And so that's, I'm gonna say, one of the challenges in the broader sustainability space for data center and cloud computing. One of the opportunities is that, maybe different from consumer, we know actually almost exactly where most of this physical infrastructure shows up. Data centers don't move around, usually. And so they're usually pretty big. They're usually attached to long-term physical plants, and there's not millions of them. There's thousands of them, but not millions. And so that represents a really interesting opportunity for implementing really interesting, which would seem complex, models. For example, upgrade cycles or parts replacement or upcycling of hardware. Those things are actually almost more doable logistically in data centers than they are in the broader consumer world, because of where they end up. The challenge is that we have this really disparate group of manufacturers that frankly don't always have aligned incentives for making things work together. Some of them actually define their value by, "did I put my logo on the left, or did I put my cable on the right?" You have a business model, which would be the infamous Intel tick-tock model, which is now maybe Nvidia. My, what's Nvidia's version of this? I don't know.
But its 18-month refresh cycles are really, like, put out as a pace of innovation, which is, I would say, in many ways quite good, but in another way, it requires this giant purchasing cycle to happen, and people build highly engineered products around one particular set of technology and then expect the world to upgrade everything around it. When you have data centers and the related physical plant, maybe 90 or 95% of this infrastructure can be very consistent: things like sheet metal and power supplies and cables. And so, like, I think that's where we started focusing a couple of years ago: "how could we create a standard that would allow different parts of the value chain, throughout data center hardware, data centers, and related, to benefit from an industry-wide interoperability?" And that came to really fundamental things that take years to go through the supply chain, and that's things like power systems, and now what My is working on, related cooling systems, as well as operating models for that hardware in terms of upgrade or life cycling and recycling. I'm not sure if that helps, but this is why it's such a hard problem, and also so important to make a reality.

Chris Adams: So if I'm understanding, one of the advantages of having the standards here is that you get to decide where you compete and where you cooperate, with the idea being that, okay, we all have a shared goal of reducing the embodied carbon in maybe some of the materials you might use, but people might have their own specialized chips. And by providing some agreed standards for how they work with each other, you're able to use, say, maybe different kinds of cooling, or different kinds of chips, without... okay. I think I know more or less where you're going with that then.

Zachary Smith: I mean, I would give a couple of very practical examples. Can we make computers where you can pop out the motherboard and have an upgraded CPU, but still use, like, the rest of the band?
Yeah, the power supplies, et cetera. Is that a possibility? Only with standardization could that work. Some sort of open standard. And standards are a little bit different in hardware. I'm sure My can give you some color, having recently built the Open19 V2 standard. It's different than software, right? Which is relatively, I'm gonna say, quick to create, quick to change. And there are also different licensing models, but hardware specifications are their own beast and come with some unique challenges.

Chris Adams: Cool. Thank you for that, Zach. My, I'm gonna bring the next question to you, because we did speak a little bit about Open19, and that was one thing that was a big thing with the SSIA. So as I understand it, the Open19 spec, which we referenced, that was one of the big things that the SSIA was a kind of steward of. And as I understand it, there's already an existing, different standard that defines the dimensions of, say, a 19-inch rack in a data center. So, racks need to be the same size and everything like that. But that has nothing to say about the power that goes in, and how you cool it, or things like that. I assume this is what some of the Open19 spec was concerning itself with. I mean, maybe you could talk a little bit about why you even needed that, or if that's what it really looks into, and why that's actually relevant now, or why that's more important, say, halfway through the 2020s, for example.

My Truong: Yeah, so Open19, the spec itself, originated from a group of folks starting with the LinkedIn organization at the time. Yuval Bachar put it together along with a few others. As that organization grew, it was inherited by the SSIA, which became a Linux Foundation project. What we did when we became a Linux Foundation project is rev the spec. The original spec was built around 12-volt power. It had a power envelope that was maybe a little bit lower than what we knew we needed to go to in the industry.
And so what we did when we revised the spec was to bring both 48-volt power and a much higher TDP to it, and brought some consistency to the design itself. So, as you were saying earlier, EIA/TIA has a 19-inch spec that defines, like, a rail-to-rail, but no additional dimensions beyond just that rail-to-rail dimension. And so what we did was we built a full, I'm gonna air quote, "mechanical API" for software folk. So, like, we consistently deliver something: you can create variation inside of that API, but the API itself is very consistent on how you, both mechanically, bring hardware into a location, how you power it up, how you cool it. It allows for variations of cooling, but has a consistent API for bringing cooling into that IT asset. What it doesn't do is really dive into the rest of the physical infrastructure delivery. And that was very important in building a hardware spec: that we didn't go over and above what we needed to consistently deliver hardware into a location. And when you do that, what you do is you allow for a tremendous amount of freedom on how you go and bring the rest of the infrastructure to the IT asset. So, in the same way, when you build a software spec, you don't really concern yourself about what language you put in behind it, or how the rest of that infrastructure works, whether you have, like, a communication bus, or whether it's semi API-driven with a callback mechanism. You don't really try to think too heavily around that. You build the API and you expect the API to behave correctly. And so what that gave us the freedom to do is, when we started bringing 48-volt power, we could then start thinking about the rest of the infrastructure a little bit differently, when you bring consistent sets of APIs to cooling and to power. And so when we started thinking about it, we saw this trend line here. We knew that we needed to go think about 400-volt power. We saw the EV industry coming. There was a trend line towards 400-volt power delivery.
What we did inside of that hardware spec was we left some optionality inside of the spec to go and change the way that we would go do work, right? So we gave some optional parameters to the infrastructure teams to go and change up what they needed to go do, so that they could deliver that hardware, that infrastructure, a little bit more carefully or correctly for their needs. So we didn't over-specify in particular areas. I'll give you a counterexample: in other specifications out there, you'll see, like, a very consistent busbar in the back of the infrastructure that delivers power. It's great when you're at a...

Chris Adams: So if I can just stop you for a second there, My. The busbar, that's the thing you plug a power thing into instead of a socket. Is that what you're referring to there?

My Truong: Oh, good question, Chris. So in some of the hyperscale rack-at-a-time designs, you'll see two copper bars sitting in the middle of the rack, in the back, delivering power. And that looks great for an at-scale design pattern, but may not fit the needs of smaller or more nuanced design patterns that are out there. Does that make sense?

Chris Adams: Yeah. Yeah. So instead of having a typical, kinda like three-way kettle-style plug, the servers just connect directly to this bar to get their power. That's what one of those bars is. Yeah. Gotcha.

My Truong: Yep. And so we went a slightly different way on that, where we had a dedicated power connection per device that went into the Open19 spec. And the spec is up, I think it's still up on our ssia.org website. And so anybody can go take a look at it and see the mechanical spec there. It's a little bit different.

Chris Adams: Okay. All right. So basically, previously there was just a spec that said "computers need to be this shape if they're gonna be server computers in a rack."
And then Open19 was a little bit more about saying, "okay, if you're gonna run all these at scale, then you should probably have some standards about how power goes in and how power goes out." Because if nothing else, that allows them to be somewhat more efficient. And there are various considerations like that that you can take into account. And you spoke about shifting from maybe 48 volts to, like, 400 volts, and there's efficiency gained when you do things like that, which we probably don't need to go into too much detail about, because it allows you to move more power without so much being wasted, for example. These are some of the things that the standards are looking into. And well, in the last 10 years, we've seen a shift from data center racks which use quite a lot of power to ones which use significantly more. So maybe 10 years ago a cloud rack would be between five and 15 kilowatts of power. That's, like, tens of homes. And now we're looking at racks which might be, say, half a megawatt or a megawatt of power, which is maybe hundreds, if not thousands, of homes' worth of power. And therefore you need, say, refreshed and updated standards. And that's where the V2 thing is moving towards. Right.

My Truong: Okay.

Chris Adams: Okay, cool. So yeah.

Zachary Smith: Just, the hard thing about hardware standards is that the manufacturing supply chain moves slowly, unless you are an end-to-end verticalizer. Like, some of the hyperscale customers can really verticalize: they build the data center, they build the hardware, lots of the same thing. They can force that. But the broader industry has to rely upon a supply chain. Maybe OEMs, third-party data center operators, 'cause they don't build their own data center, they use somebody else's.
And so what we accomplished with V2 was to allow for this kind of innovation within the envelope. One of our guiding principles was: how could we provide the minimal amount of standardization that would allow for more adoption to occur while still gaining the benefits?

Chris Adams: Ah.

Zachary Smith: And so it's a really difficult friction point, because your natural reaction is to, like, solve the problem. Let's solve the problem as best we can. But that injects so much opinion that it's very hard to get adopted throughout the broader industry. And so even things like cooling: single phase or two phase, full immersion or not, this kind of liquid or this way, different types of pressure, whatever. There's all kinds of implications, whether those are technology choices or regulatory situations across different environments. So I think that's the challenge that we've had with hardware standards: how to make it meaningful while still allowing it to evolve for specific use cases.

Chris Adams: Alright. Okay. So, I think I'm understanding a bit now. And I'll try and put it in context with some of the other podcast episodes we've done. So we've had people come on this podcast from, like, Google, for example, or Microsoft, and they talk about all these cool things in their entirely vertically designed data centers, where they're in the entire supply chain. They do all these cool things with the grid, right? But all those designs, a lot of the time, these might be custom designs, in the case of Google, where no one gets to see them. Or in some cases, like say Meta or some other ones, it may be Open Compute, which is a different size to most people's data centers, for example. So you can't just, like, drop that stuff in. Like, there's a few of them around, but it's still 19 inches that's the default standard in lots of places.
And if I understand it, one of the goals of Open19 is to essentially bring everyone else along, who have already standardized on these kinds of sizes, so they can start doing some of the cool grid-aware, carbon-aware stuff that you see people talking about, that you probably don't have that much access to if you're not already Meta, Google, or Facebook, with literally R&D budgets in the hundreds of millions.

Zachary Smith: Yeah, maybe add some zeros there. Yeah, I think absolutely, right, which is democratizing access to some of this innovation, right? While still relying upon and helping within the broader supply chain. For example, if EVs are moving into 400 volts, like, we can slipstream and bring that capability to data center hardware supply chains, 'cause the people making power supplies or components or cabling are moving in those directions, right? But then it's also just allowing for the innovation, right? Like, I think we've firmly seen this in software. I think this is a great part of the Linux Foundation, which is, no one company owns the, you know, monopoly on innovation. And what we really wanted to see was not, like, can we make a better piece of hardware, but can we provide some more foundational capabilities, so that hundreds of startups or different types of organizations that might have different ideas or different needs or different goals could innovate around the sustainability aspect of data center hardware. And I think what we're focused on now within GSF is really taking that to a more foundational level. There's just a huge opportunity right now, with all the data center construction happening, to really find an even more interesting place where we can take some of those learnings from hardware specifications and apply them to an even broader impact base.

Chris Adams: Ah, okay. Alright.
I'll come back to some of this, because I know there's a project called Project Mycelium that Asim Hussain, the Executive Director of the Green Software Foundation, is continually talking about. But like we've spoken a little about, you mentioned, if I understand it, that this allows you maybe more freedom: instead of having, like, tiny fans which scream at massive, thousands and thousands of RPM, there's other ways that you could maybe cool down chips, for example. And like, this is one thing that I know the Hardware Standards Working Group is looking at: finding ways to keep the servers cool, for example. Like, as I understand it, using liquid can be more efficient, quite a bit more efficient, than having tiny fans running at massive RPM to cool things down. But also, I guess there's a whole discussion about, well, there's different ways of cooling things which might reduce the kind of local power draw and local water usage in a data center, for example. And like, maybe this is one thing we could talk a little bit more about then, 'cause we've had people talk about, say, liquid cooling and things like that before, as, like, some alternative ways to more sustainably cool down data centers, in terms of how much power they need, but also what their local footprint could actually be. But we've never had people who actually have that much deep expertise in this. So maybe I could put the question to one of you. Like, let's say you're gonna switch to liquid cooling, for example, instead of using itty-bitty fans, or even just slightly bigger fans running a little bit slower. Like, how does that actually improve it? Maybe I could put this to you, My, 'cause I think this is one thing that you've spent quite a lot of time looking into. Like, yeah, where are the benefits?
Like, how do the benefits materialize if you switch from, say, air to a liquid cooling approach like this?

My Truong: Yeah, so on the liquid cooling front, there's a number of pieces here. The fans that you were describing earlier, they're moving air, which is effectively a fluid when you're using it in a cooling mode. At 25,000 RPM, you're trying to move more air across the surface, and it doesn't have a great amount of...

Zachary Smith: Heat transfer capability.

My Truong: ...heat removal and rejection. Yeah, heat transfer capabilities. Right. So in this world, we're not moving heat with air; we're moving it with some sort of liquid: either a single-phase liquid, like water, or a two-phase liquid, taking advantage of two-phase heat transfer properties. There are a lot of significant gains, and those gains really start magnifying here in this AI space that we're in today. And I think this is where Project Mycelium started to come to fruition: to really think about that infrastructure end to end. When you're looking at some of these AI workloads, especially AI training workloads, their ability to go and move hundreds of megawatts of power simultaneously and instantaneously becomes a tricky cooling challenge and infrastructure challenge. And so really what we wanted to be able to think through is how do we go and allow software to signal all the way through into hardware, and get hardware to help go and deal with this problem in a way that makes sense. So I'll give you a concrete example. If you're in the single-phase space and you are in a 100-megawatt or 200-megawatt data center site, which is what xAI built out in Memphis, Tennessee, when you're going and swinging that workload, you are swinging a workload from zero to a hundred percent and back to zero quite quickly. In a timescale of around 40 milliseconds or so, you can move a workload from zero to 200 megawatts back down to zero.
When you're connected to a grid...

Chris Adams: Right.

My Truong: When you're connected to a grid, that's called a grid-distorting power event, right? You can go swing an entire grid by 200 megawatts, which is probably, like, maybe a quarter of the LA area; that's the ability to go and distort a grid pretty quickly. When you're on an isolated grid like ERCOT, this becomes a very tricky proposition for the grid to go and manage correctly. On the flip side of that, once you took the power, you created about 200 megawatts of heat as well. And when you start doing that, you have to really think about what you are doing with your cooling infrastructure. If you're a pump-based system, like single phase, that means that you're probably having to spool up and spool down your pump system quite rapidly to go respond to that swing in power demand. But how do you know? How do you prep the system? How do you tell that this is going to happen? And this is where we really need to start thinking about these software-hardware interfaces. Wouldn't it be great if your software workload could start signaling to your software or your hardware infrastructure? "Hey, I'm about to go and start up this workload, and I'm about to go and swing this workload quite quickly." You would really want to go signal to your infrastructure and say, "yes, I'm about to go do this to you," and maybe you want to even signal to your grid, "I'm about to go do this to you" as well. Then you can start thinking about other options for managing your power systems correctly, maybe using, like, a battery system to go and shave off that peak inside of the grid and manage that appropriately. So we can start thinking about this. Once we have this ability to go signal from software to hardware to infrastructure, and build that communication path, it becomes an interesting thought exercise, and we can realize that this is just a software problem. For those of us who have been in this hardware-software space, we've seen this before.
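[Editor's note: the signalling path My describes, where a workload warns cooling and power systems about an imminent swing before it happens, can be sketched as a toy publish-subscribe bus. Everything here (RampIntent, InfrastructureBus, the handler names) is invented for illustration; it is not part of any published Mycelium or Open19 interface.]

```python
# Toy sketch: a workload publishes a "ramp intent" before swinging its
# power draw, so cooling pumps and grid-side batteries can pre-position.
# All names and numbers are illustrative, not from any real spec.

from dataclasses import dataclass


@dataclass
class RampIntent:
    delta_mw: float    # expected power swing, e.g. +200 MW
    lead_time_ms: int  # warning the infrastructure gets before the swing


class InfrastructureBus:
    """Minimal pub-sub bus: workloads publish intents, subsystems react."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, handler):
        self.subscribers.append(handler)

    def publish(self, intent):
        # Fan the intent out to every subsystem and collect their responses.
        return [handler(intent) for handler in self.subscribers]


def cooling_plant(intent):
    # Spool pumps up ahead of a positive swing, down ahead of a negative one.
    return f"pumps {'up' if intent.delta_mw > 0 else 'down'}"


def battery_system(intent):
    # Pre-position the battery so the grid sees a flat draw during the swing.
    return f"buffer {abs(intent.delta_mw)} MW"


bus = InfrastructureBus()
bus.subscribe(cooling_plant)
bus.subscribe(battery_system)

# A training job about to swing +200 MW with 40 ms of warning.
responses = bus.publish(RampIntent(delta_mw=200.0, lead_time_ms=40))
print(responses)  # ['pumps up', 'buffer 200.0 MW']
```

[The point of the sketch is only that the coordination itself is ordinary software, which is why My calls it "just a software problem."]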
And is it worth synchronizing this data up? Is it worth signaling this correctly through the infrastructure? This is, like, the big question that we have with Project Mycelium. Like, it would be amazing for us to be able to do this.

Chris Adams: Ah, I see.

My Truong: The secondary effect of this is to really go think through: now, if you're in Dublin, where you have offshore power, and you now have one-hour resolution on data that's coming through about the amount of green power that's about to come through, it would be amazing for you to signal up and signal down your infrastructure to say, you should really spool up your workload and maybe run it at 150% for a while, right? This would be a great time to go really take green power off the grid and drive your workload on green power for this duration. And then, as that power spools off, you can go roll that power need off for a time window. So being able to think about these things that we can create between the software-hardware interface is really where I think we have this opportunity to make game-changing, and really economy-changing, outcomes.

Chris Adams: Okay.

Zachary Smith: I have a viewpoint on that, Chris, too.

Chris Adams: Yeah, please do.

Zachary Smith: My TLDR summary is, like, infrastructure has gotten much more complicated, and the interplay between workload and that physical infrastructure is no longer "set it in there and just forget it, and the fans will blow and the servers will work and nobody will notice the difference in the IT room." These are incredibly complex workloads. A significant amount of our world is interacting with this type of infrastructure through software services. It's just got more complicated, right? And what we haven't really done is provide more efficient and advanced ways to collaborate between that infrastructure and the workload. It's still working under some paradigms, like, data centers: you put everything in there and the computers just run.
And that's just not the case anymore, right? I think that's what My was illustrating so nicely: workload is so different and so dynamic and so complex that we need to step up with some ways for the infrastructure and that software workload to communicate. Chris Adams: Ah, I see. Okay. So I'll try and translate some of that for some of the listeners that we've had here. So you said something about a 200-megawatt power swing; that's not that far away from half a million people appearing on the grid, then disappearing from the grid, every 14 milliseconds. And like, obviously that's gonna piss off people who have to operate the grid. But that by itself is one thing, and that's also a change from what we had before, because typically cloud data centers were known for being good customers because they have a really flat, predictable power draw. And now, rather than having a flat kind of line, you have something more like a seesaw, a saw tooth: up, down, up, down, up, down, up, down. And if you just pass that straight through to the grid, that's a really good way to just totally mess with the grid and do all kinds of damage to the rest of the grid. But what it sounds like you're saying is, actually, if you have some degree of control within the data center, you might say, "well, all this crazy spikiness, rather than pulling it from the grid, can I pull it from batteries, for example?" And then I might expose that familiar flat pattern to the rest of the grid, for example. And that might be a way to make you more popular with grid operators, but also that might be a way to actually make the system more efficient. So that's one of the things you said there. So that's one kind of helpful thing there. 
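The battery idea Chris translates above, soaking up the sawtooth so the grid only sees the flat average draw, can be sketched in a few lines. This is purely an illustrative toy model, not anything from the Mycelium spec: the load figures, battery size, and the simplistic state-of-charge accounting are all assumptions, and a real system would enforce power and capacity limits properly.

```python
# Toy model: a battery absorbs the gap between a spiky IT load and the
# flat average draw we want the grid to see (sized so limits never bind).
def flatten_load(load_mw, battery_mwh, step_h=0.25):
    """Return (grid_draw_mw, battery_state_mwh) per step for a spiky load."""
    target = sum(load_mw) / len(load_mw)  # flat draw exposed to the grid
    soc = battery_mwh / 2                 # start half charged
    grid, states = [], []
    for load in load_mw:
        delta = load - target             # MW: positive = discharge, negative = charge
        soc = min(battery_mwh, max(0.0, soc - delta * step_h))
        grid.append(target)               # grid only ever sees the flat target
        states.append(soc)
    return grid, states

spiky = [20, 220, 20, 220, 20, 220]       # MW: a saw-tooth training-style load
grid, soc = flatten_load(spiky, battery_mwh=100)
print(grid[0])                            # every step the grid sees the 120 MW average
```

The grid-facing draw stays at the 120 MW average while the battery state oscillates, which is the "good customer" behavior grid operators want back.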
But also you said that there is a chance to dynamically scale up when there is loads and loads of green energy, so you end up turning into a bit more of a better neighbor on the grid, essentially. And that can have implications, because, like you said before, there's complexity at the power level, and it allows the data centers to, rather than make that worse, actually address some of those things. So it's complementary to the grid, is that what you're saying? My Truong: Yeah. I think you got it, Chris. Exactly. So that's on the power side. I think we have this other opportunity now that, as we're starting to introduce liquid cooling to the space as well, we're effectively, efficiently removing heat from the silicon. Especially in Europe, this is becoming a very front-and-center conversation for data centers operating there: this energy doesn't need to go to waste and be evacuated into the atmosphere. We have this tremendous opportunity to go and bring that heat into local municipal heat loops and really think about that in a much more cohesive way. And so this is again where, as Zach was saying, we really need to think about this comprehensively and really rethink our architectures to some degree with these types of workloads coming through. And so it's about bringing standards around the hardware-software interface, and then, as we start thinking through the rest of the ecosystem, how do we bring consistency to this interface so that we can communicate "workload is going up, workload is going down, the city needs X gigawatts of power into a municipal heat loop," and help the entire ecosystem out a little bit better. In the winter, Berlin or Frankfurt would probably be excited to have gigawatts of power in a heat loop to go and drive a carbon-free heating footprint inside of the area. 
But then on the flip side of that, if you're building a site that expects that in the winter, what about the summer, when you're not able to take that heat off? How do we think about more innovative ways of driving cooling then as well? How do we use that heat in a more effective way to drive a cooling infrastructure? Chris Adams: So, okay. I'm glad you mentioned that example, 'cause I live in Germany and our biggest driver of fossil fuel use is heating things up when it gets cold. So if there's a way to actually use heat which doesn't involve burning more fossil fuels, then totally, I'm all for that. There is actually one question I might ask, which is: what are the coolants that people use for this kind of stuff? Because when we move away from air, you're not typically just using water; in all of these cases there may be different kinds of chemicals or different kinds of coolants in use, right? I mean, maybe you could talk a little bit about that, because when we've looked at how we use coolant elsewhere, there's been different generations of coolants. And in Europe, I know there's a whole ongoing discussion about saying, "okay, if we're gonna have liquid cooling, can we at least make sure that the coolants we're using are actually not the things which end up being massively emitting in their own right," because one of the big drivers of emissions is end-of-life refrigerants and things like that. Maybe you could talk a little bit about what your options are if you're gonna do liquid cooling, and what's on the table right now to actually do something which is more efficient, but is also a bit more non-toxic and safe if you're gonna have this inside a given space? My Truong: Yeah. So in liquid cooling there's a number of fluids that we can use. 
The most well understood of the fluids, used both on the facility and the technical loop side, is standard de-ionized water. Just water across the cold plate. There's variations used out there with a propylene glycol mix to manage microbial growth. The organization that I'm part of, we use a two-phase approach, where we're taking a two-phase fluid and taking advantage of phase change to remove that heat from the silicon. And in this space, this is where we have a lot of conversations around fluids and fluid safety, and how we're thinking about that fluid and end-of-life usage of that fluid. Once you're removing heat with that fluid and putting it into a network, most of these heat networks are water-based heat networks, where you're using some sort of water with a microbial treatment and going through treatment regimes to manage that water quality through the system. So this is a very conventional approach. Overall, there's goods and bads to every system. Water is very good at removing heat from systems. But as you start getting towards megawatt scale, the size of plumbing that you're requiring to go remove that heat and bring that fluid through becomes a real technical challenge. And also at megawatt scale, yeah. Zachary Smith: If I'm not mistaken, there's also challenges, if you're not doing a two-phase approach, to actually removing heat at a hot enough temperature that you can use it for something else, right? My Truong: Correct. Correct, Zach. So there's a number of very technical angles to this. Going down that path, Zach: in single-phase, what we do is we have to move fluid across that surface at a good enough clip to make sure that we're removing heat and keeping that silicon from overheating. The downside of this is that, as silicon requires colder and colder temperatures to keep it operating well, the opportunity to drive that heat source up high enough to be able to use in a municipal heat loop becomes lower and lower. 
So let's say, for example, your best-in-class silicon today is asking for what's known as a 65-degree TJ. That's a number that we see on the silicon side. So you're basically saying, "I need my silicon junction to be 65 degrees Celsius or lower to be able to operate properly." The flip side of that is you're gonna ask your infrastructure to go deliver water between 12 to 17 degrees Celsius to make sure that cooling is supplied. But then, if you allow for, let's say, a 20-degree-Celsius rise, your exit temperature on that water is only gonna be 20 degrees higher than the 17-degree inlet, so that water temperature is so low. And that's not a very nice shower, basically. Yeah, you're in a lukewarm shower at best. So then we have to spend a tremendous amount of energy to bring that heat quality up so that we can use it in a heat network. In two-phase approaches, what we're taking advantage of is the physics of two-phase heat transfer, where, during phase change, you have exactly one temperature at which that fluid will phase change. To a gas, yeah. To a gas, exactly. And so the easiest way, and we'll use the water example here, although this is not typically what's used in two-phase technologies, is that water at atmospheric pressure will always phase change at a hundred degrees Celsius. It's not 101, it's not 99. It's always a hundred degrees Celsius at atmospheric pressure. So your silicon above that will always be at around a hundred degrees Celsius, or maybe a little bit higher depending on what your heat transfer characteristics look like. And this is the physics that we take advantage of. 
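The single-phase arithmetic My walks through, a roughly 17-degree inlet plus a 20-degree rise yielding lukewarm water, falls straight out of the sensible-heat equation Q = ṁ·c·ΔT. The sketch below just runs those numbers; the 1 MW rack figure is an assumption for illustration, not a number from the episode.

```python
# Sensible heat: Q = m_dot * c_p * dT  (watts = kg/s * J/(kg*K) * K)
CP_WATER = 4186.0  # J/(kg*K), specific heat of liquid water

def outlet_temp(inlet_c, rise_c):
    """Water exit temperature after absorbing heat with a given rise."""
    return inlet_c + rise_c

def flow_needed_kg_s(heat_w, rise_c):
    """Mass flow of water needed to absorb heat_w with a rise_c temperature rise."""
    return heat_w / (CP_WATER * rise_c)

print(outlet_temp(17, 20))                   # 37 C: a lukewarm shower at best
print(round(flow_needed_kg_s(1e6, 20), 1))   # ~11.9 kg/s of water for a 1 MW rack
```

That ~12 kg/s per megawatt is why My says the plumbing itself becomes a technical challenge at scale, and why raising the exit temperature (as two-phase boiling does) matters so much for heat reuse.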
So when you're doing that, the vapor side becomes a very valuable energy source, and you can actually do some very creative things with it in two-phase. So every technology is a double-edged sword, and we're taking advantage of the physics of heat transfer to effectively and efficiently remove heat in two-phase solutions. Chris Adams: Ah. So I have one question about how that changes what a data center feels like to be inside, because I've been inside data centers and they are not quiet places to be. Like, I couldn't believe just how uncomfortably loud they are. And if you're moving away from fans, does that change how they sound, for example? Because even if you're outside some buildings, people talk about the noise pollution aspects. Does a move to something like this mean that it changes some of that at all? Zachary Smith: Oh yeah. My Truong: Inside of the white space? Absolutely. One of the things that we fear the most inside of a data center is dead silence. You might actually be able to end up in a data center where there's dead silence, soon. And that being a good thing. Yeah. With no fans, yeah. We'd love to remove the parasitic draw of power from fans moving air across data centers, just to allow that power to go back into the workload itself. Chris Adams: So for context, if you haven't been in a data center... I mean, it felt like around 80 to 90 decibels for me. Yeah, plus. It could have been more, actually. So I mean, if you have something on a wearable, on a phone, as soon as it's above 90 decibels, that's louder than lots of nightclubs, basically. 
So this is one thing that I felt, and it sounds like this can introduce some changes there as well, rather than just talking about energy and water usage. Right. Zachary Smith: Yeah, most data center technicians wear ear protectors all the time, can't talk on the phone, have to scream at each other, because it's so loud. Certainly there's some really nice quality-of-life improvements that can happen when you're not blowing that much air around and spinning up multiple thousand... My Truong: 25,000 to 30,000 RPM fans will require you to wear double hearing protection to be able to even function inside of the space. Yeah, that's the thing. A lot of energy there. Chris Adams: Oh, okay. Cool. So these are some of the shifts that this makes possible. So you might have data centers where you're able to be more active in terms of actually working with the grid, because for all the kind of things we might do as software engineers, there's actually a standard which makes sure that the things that we see Google doing, or Meta talking about in academic papers, could be more accessible to more people. That's one of the things that having standards, and things like Open19, might do, because there's just so many more people using 19-inch racks and things like that. That seems to be one thing. So maybe I could actually ask you folks: this is one thing that you've been working on, and My, you're obviously running an organization, ZutaCore, here, and Zach, it sounds like you're working on a number of these projects. Are there any particular open source projects or papers, or things with some of the more wacky ideas or more interesting projects, that you would point people to? 
Because when I talk about data centers and things like this, there's a paper called the Ecovisor paper, which is all about virtualizing power, so that you could have power from batteries going to certain workloads and power from the grid going to other workloads. And we've always thought about it as going one way, but it sounds like with things like Project Mycelium, you can have things going the other way. Like, for people who are really into this stuff, are there any good repos that you would point people to? Or is there a particular paper that you found exciting that you would direct people to, for those who are still with us and still able to keep up with the, honestly, quite technical discussion we've had here? Zachary Smith: Well, not to toot My's horn, but reading the Open19 V2 specification, I think, is worthwhile. Some of the challenges we dealt with at a kind of server and rack level, I think, are indicative of where the market is and where it's going. There's also great stuff within the OCP Advanced Cooling working group. And I found it very interesting, especially, to see some of what's coming from hyperscale, where they are able to move faster through a verticalized integration approach. And then I've just been really interested in following along with the power systems and related work from the EV industry. I think that's an exciting area where we can start to see data centers not as buildings for IT, but data centers as energy components. So when you're looking at, whether it's EV or grid-scale renewable management, I think there's some really interesting tie-ins that our industry, frankly, is not very good at yet. Chris Adams: Ah. Zachary Smith: Most people who are working in data centers are not actually power experts from a generation or storage perspective. And so there's some just educational opportunities there. 
I've found, just as one resource, My, I don't know if they have it, the 7x24 Exchange conference group, which is the critical infrastructure conference covering everything from water systems and power systems to data centers. It has been a really great learning place for me. But I'm not sure if they have a publication that is useful. We have some work to do in moving our industry into transparent Git repos. My Truong: Chris, my favorite is actually the OpenBMC codebase. It provides a tremendous gateway. This used to be a very closed ecosystem, and it was very hard for us to think about being able to look through a code repo of a Redfish API. Being able to rev that spec in a way that could be useful and implementable into an ecosystem has been my favorite place, outside of hardware specifications like... Chris Adams: Ah, okay. So I might try and translate that, 'cause the BMC thing, this is basically the bit of computing which essentially tells software what's going on inside a server, how much power it's using, and stuff like that. Is that what you're referring to? And is OpenBMC, like, something that used to be proprietary and there is now a more open standard, so that there's a visibility that wasn't there before? Is that what it is? My Truong: Right, that's exactly right. So in years past, you had a closed ecosystem on the service controller, or the BMC, the baseboard management controller module inside of a server, and being able to look into that code base was always very difficult at best and traumatic at worst. But having OpenBMC reference code out there, being able to look and see an implementation and port that code base into running systems, has been very useful, I think, for the ecosystem to go and get more transparency, as Zach was saying, into API-driven interfaces. Chris Adams: Oh. My Truong: What I'm seeing is the prevalence of that code base now showing up in a number of different places, and the patterns are being replicated into, as Zach was saying, power systems. 
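Part of what makes OpenBMC so useful for software, as My describes, is that telemetry comes out over the DMTF Redfish REST API rather than a vendor-proprietary interface. Below is a rough sketch of reading server power draw from a Redfish Power resource. The payload here is a hand-written sample following the common Redfish schema shape (`PowerControl[0].PowerConsumedWatts`); real endpoints and payloads vary by vendor and BMC firmware, so treat the chassis path and fields as illustrative assumptions.

```python
import json

# A trimmed example of what GET /redfish/v1/Chassis/1/Power might return.
# Real payloads differ by vendor; PowerConsumedWatts is the field of interest.
sample_payload = json.dumps({
    "@odata.type": "#Power.v1_5_0.Power",
    "PowerControl": [
        {"Name": "System Power Control",
         "PowerConsumedWatts": 344,
         "PowerCapacityWatts": 800}
    ],
})

def consumed_watts(payload: str) -> float:
    """Extract instantaneous power draw from a Redfish Power resource body."""
    body = json.loads(payload)
    return body["PowerControl"][0]["PowerConsumedWatts"]

print(consumed_watts(sample_payload))  # 344
```

In a live system you would fetch that body with an authenticated HTTPS GET against the BMC and feed the number into whatever scheduler or carbon-accounting tool sits further up the stack.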
We're seeing this become more and more prevalent in power shelves, power control, places where we used to not have access, or where we used to use programmable logic controllers to drive things. They're now becoming much more software-ecosystem driven and opening up a lot of possibilities for us. Chris Adams: Okay. I'm now understanding the whole idea behind Mycelium: like roots reaching down further into the actual hardware, to do things that couldn't be done before. Okay, this now makes a lot more sense. Yeah. Peel it back one more layer. Okay, stacks within stacks. Brilliant. Okay, this makes sense. Okay folks, well, thank you very much for actually sharing that and diving into those other projects. We'll add some links to some of those things if we can. 'Cause I think OpenBMC, that's one thing that is actually in production in a few places. I know that Oxide Computer use some of this, but there's other providers who also have that as part of their stack now that you can see. Right. My Truong: We also put it into production when we were part of the Packet and Equinix team. So we have a little bit of experience in running this code base with real production workloads. Chris Adams: Oh wow. I might ask you some questions outside this podcast, 'cause this is one thing that we always struggle with: finding who's actually exposing any of these numbers for people who are further up the stack, because it's a real challenge. Alright. Okay, we're coming up to time, so I just wanna leave one question with you folks, if I may. If people have found this interesting and they want to follow what's going on with Zach Smith and My Truong, where do they look? Where do they go? Like, can you just give us some pointers about where we should be following and what we should be linking to in the show notes? 
'Cause I think there's quite a lot of stuff we've covered here, and I think there's space for a lot more learning, actually. Zachary Smith: Well, I can't say I'm using X or related on a constant basis, but I'm on LinkedIn @zsmith; connect with me there. Follow; I post occasionally on working groups and other parts that I'm part of. And I'd encourage, if folks are interested, like, we're very early in this hardware working group within the GSF. There's so much opportunity. We need more help. We need more ideas. We need more places to try. And so if you're interested, I'd suggest joining or coming to some of our working group sessions. It's very early and we're open to all kinds of ideas, as long as you're willing to copy a core value from Equinix: as long as you can speak up and then step up, we'd love the help. There's a lot to do. Chris Adams: Brilliant, Zach. And My, over to you. My Truong: LinkedIn as well. Love to see people here as part of our working groups, and see what we can move forward here in the industry. Chris Adams: Brilliant. Okay. Well, gentlemen, thank you so much for taking me through this tour all the way down the stack, into the depths that we as software developers don't really have that much visibility into. And I hope you have a lovely morning slash day slash afternoon, depending on where you are in the world. Alright, cheers fellas. Thanks Chris. Thanks so much. Hey everyone, thanks for listening. Just a reminder to follow Environment Variables on Apple Podcasts, Spotify, or wherever you get your podcasts. And please do leave a rating and review if you like what we're doing. It helps other people discover the show, and of course, we'd love to have more listeners. To find out more about the Green Software Foundation, please visit greensoftware.foundation. That's greensoftware.foundation in any browser. Thanks again, and see you in the next episode. 
May 8, 2025 • 55min

Cloud Infrastructure, Efficiency and Sustainability

Host Anne Currie is joined by the esteemed Charles Humble, a figure in the world of sustainable technology. Charles Humble is a writer, podcaster, and former CTO with a decade's experience helping technologists build better systems, both technically and ethically. Together, they discuss how developers and companies can make smarter, greener choices in the cloud, as well as the trade-offs that should be considered. They discuss the road that led to the present state of generative AI, the effect it has had on the planet, as well as their hopes for a more sustainable future.Learn more about our people:Anne Currie: LinkedIn | WebsiteCharles Humble: LinkedIn | WebsiteFind out more about the GSF:The Green Software Foundation Website Sign up to the Green Software Foundation NewsletterNews:The Developer's Guide to Cloud Infrastructure, Efficiency and Sustainability | Charles Humble [01:13] Charles Humble on O'Reilly [01:50] Building Green Software [Book] [02:09]Twofish Music [48:03]Resources:User Interface Design For Programmers – Joel Spolsky [12:03] Environment Variables Episode 100: TWiGS: Sustainable AI Progress w/ Holly Cummins [18:12] Green Software Maturity Matrix [19:09] Writing Greener Software Even When You Are Stuck On-Prem • Charles Humble • GOTO 2024 [23:42]Electricity Maps [23:57]Cloud Carbon Footprint [36:52] Software Carbon Intensity (SCI) Specification | GSF [37:06]ML.energy [38:31]Perseus (SOSP '24) - Zeus Project | Jae-Won Chung [41:26] If you enjoyed this episode then please either:Follow, rate, and review on Apple PodcastsFollow and rate on SpotifyWatch our videos on The Green Software Foundation YouTube Channel!Connect with us on Twitter, Github and LinkedIn!TRANSCRIPT BELOW:Charles Humble: In general, if you are working with vendors, whether they're AI vendors or whatever, it is entirely reasonable to go and say, "well, I want to know what your carbon story looks like." And if they won't tell you, go somewhere else. 
Chris Adams: Hello, and welcome to Environment Variables, brought to you by the Green Software Foundation. In each episode, we discuss the latest news and events surrounding green software. On our show, you can expect candid conversations with top experts in their field who have a passion for how to reduce the greenhouse gas emissions of software. I'm your host, Chris Adams. Anne Currie: Hello, and welcome to Environment Variables, where we bring you the latest news and updates from the world of sustainable software development. Today I'm your guest host, Anne Currie, and we'll be zooming in on an increasingly important topic: cloud infrastructure, efficiency and sustainability. Using the cloud well is about making some really clever choices, really difficult choices, upfront. And those choices have an enormous impact on our carbon footprint, but we often just don't make them. So our guest today is someone who's thought very deeply about this. Charles Humble is a writer, podcaster, and former CTO who has spent the past decade helping technologists build better systems, both technically and ethically. He's the author of The Developer's Guide to Cloud Infrastructure, Efficiency and Sustainability, a book that breaks down how cloud choices intersect with environmental impacts and performance. So before we go on, Charles, please introduce yourself. Charles Humble: Thank you. Yes, so as you said, I'm Charles Humble. I work mainly as a consultant and also an author and a technologist. My own business is a company called Conissaunce, which I run. And I'm very excited to be here. I speak a lot at conferences, most recently mainly about sustainability. I've written a bunch of stuff with O'Reilly, including a series of shortcut articles called Professional Skills for Software Engineers, and, as you mentioned, most recently this ebook, which I think is why you've invited me on. Anne Currie: It is indeed. Yes. 
So, to introduce myself, my name is Anne Currie. I've been in the tech industry for a pretty long time, pretty much the same as Charles, about 30 years. And I am one of the authors of O'Reilly's new book, Building Green Software, which is entirely and completely aimed at the folks who will be listening to this podcast today. So if you haven't read it, or listened to it, because it is available in an audio version as well, then please do so; you'd enjoy it. So, let's get on with the questions that we want to ask about today. So, Charles, you've written this great ebook, which is also something everybody who's listening to this podcast should be reading. And we'll link to it in the show notes below. In fact, everything we'll be talking about today will be linked to in the show notes below. But let's start with one of the key insights from your book, which is that choices matter. Things like VM choices matter, but they're often overlooked when it comes to planning your cloud infrastructure. What did you learn about that? What do you feel about that, Charles? Charles Humble: It's such an interesting place to start. So I think, when I was thinking about this book and how I was putting it together, my starting point was, I wanted a really easy on-ramp for people. And that came from, you know, speaking a lot at conferences and through some of the consulting work I've done, and having people come up to me and say, "well, I kind of want to do the right thing, but I'm not very clear what the right thing is." And I think one of the things that's happened is we've been very good about talking about some of the carbon-aware computing stuff, you know, demand shifting and shaping and those sorts of things. But that's quite an ambitious place to start. And oftentimes there are so many easier wins, I think. And I kind of feel like I want to get us talking a little bit more about some of the easy stuff. 
'Cause it's stuff that we can just do. The other thing is, you know, human beings, we make assumptions and we learn things, and then we don't go back and reexamine those things later on. So I've occasionally thought to myself, I ought to write a book called something like Things That Were True But Aren't Anymore, because we all have these things. Like, my mental model of how a CPU works until probably about two years ago was basically a Pentium II. And CPUs haven't looked like a Pentium II for a very long time, and I have a feeling I'm not the only one. So, you were specifically asking about CPUs and VM choices, and I think a lot of the time, those of us, certainly those of us of a certain age, but I don't think it's just us, came through this era where Windows and Intel were totally dominant. And so we naturally default to, well, "Intel will be fine," because it was right for a long time. Anne Currie: Yeah. Charles Humble: Intel was the right choice. Anne Currie: Who could ever have imagined that Intel would lose the data center? Charles Humble: Absolutely, it is extraordinary. I mean, obviously they lost mobile, mainly to ARM, and that was very much a power efficiency thing. Fair enough. But yes, the idea that they might be losing the data center, or might have lost the data center, is extraordinary. But you know, the reality is, first of all, if you are thinking about running your workloads: AMD processors are more or less cross-compatible with Intel ones. It's not totally true, but it kind of is. They have an x86-compatible instruction set. So for the most part, your workloads that will run on Intel will run on AMD. But not only will they run on AMD, they will probably run on AMD better. Again, for the most part; there are places where Intel probably has an edge, I would think. If you're doing a lot of floating point maths, then maybe they still have an edge. 
I'm not a hundred percent sure, but as a rule of thumb, AMD is going to be, you know, faster and cheaper. And the reason for that has a great deal to do with core density. So AMD has more cores per chip than Intel does, and what that means is you end up with more processing per server, which means you need fewer servers to run the same workload. I ran some tests for the ebook, and that came out: so I had a 2,000-VM instance count, and we had 11 AMD-powered servers, running the AMD Epyc chips, and we needed 17 Intel-powered servers to do the same job. Right? So that's roughly 35% fewer servers. It's not, by the way, 35% less power use. It's actually about 29%, something like that, less power use, 'cause the chips are quite power hungry, but still, that's a big saving, right? And it's also, by the way, a cost saving as well. So the other part of this is, you know, it is probably about 13% cheaper to be running your workload on AMD than Intel. Now, obviously your mileage may vary and you need to verify everything I'm saying. Don't just assume, "well, Charles Humble said it's true, so it must be." That would be a foolish thing to do. But as a rule of thumb, the chances are in most cases you're better off, and I'll wager that a lot of the time, when you are setting up your VMs on your cloud provider, your cloud provider probably defaults to Intel and you probably just think, "well, that'll be fine." Right? So it's kind of a case of trying to flip that script. So maybe you default to AMD, maybe you evaluate whether ARM processors will work. We are seeing another surge of ARM in data centers, though, as I said, that comes with some trade-offs. In mobile, the trade-offs with ARM were pretty straightforward compared to anything else. In data centers it is a little bit more nuanced. 
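Charles's numbers, 11 AMD servers versus 17 Intel servers for the same 2,000 VMs, imply roughly 35% fewer machines but only about 29% less power, since each higher-core-count server draws somewhat more. The sketch below reproduces that arithmetic; only the server counts come from his test, while the per-server wattages are made-up placeholders chosen to illustrate why the two percentages differ.

```python
def savings(servers_a, watts_a, servers_b, watts_b):
    """Fractional server-count and power savings of option A versus option B."""
    fewer_servers = 1 - servers_a / servers_b
    less_power = 1 - (servers_a * watts_a) / (servers_b * watts_b)
    return fewer_servers, less_power

# 2,000 VMs: 11 AMD Epyc servers vs 17 Intel servers (counts from the ebook test).
# 550 W vs 500 W per server are illustrative guesses, not measured figures.
fewer, power = savings(11, 550, 17, 500)
print(f"{fewer:.0%} fewer servers, {power:.0%} less power")
```

With those placeholder wattages the math lands near his figures: about 35% fewer servers but only about 29% less power, because the denser servers each draw a bit more.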
But basically it's that, and I think it's this thing, as I say, of these assumptions that we've just built up over time; we're not very good at going back and reexamining our opinions or our assumptions. And then the other thing that I think feeds into this is we build layers of abstractions, right? That's what computer science does, and we get more and more abstracted away from what the actual hardware is doing. I found myself this morning, when I was thinking about coming on the show, thinking a bit about some of the stuff Martin Thompson's been talking about for years, about mechanical sympathy. I'm sure you have experiences of this, and I know I have, where, you know, I've been brought into a company that's having performance problems. There's one that I actually remember vividly from decades ago: it was an internet banking app. So it was a new internet bank that was written in Visual Basic. Weird choice, but anyway, go with me here. And it was all MQSeries, so IBM MQSeries under the hood, right? So basically you've got messages that were written in XML being passed around between little programs. It looks a bit like microservices, but 20 years ago, before we had the term, roughly. And when you read a message off an MQ queue, you read it off essentially one byte at a time. And what they were doing in a loop in Visual Basic was basically saying string equals string plus next byte. Does that make sense? So, string equals string plus new string, that kind of idea. Now, under the covers, they're doing a deep string copy every single time they do that. But they had no idea, 'cause they were Visual Basic programmers and didn't know what a deep string copy even was. Fair enough. And then they were going, "why is our audit process grinding to a halt?" And the reason is, well, all of those hidden copies. 
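The deep-copy trap Charles describes, building a string one byte at a time, is quadratic: each `s = s + byte` copies everything accumulated so far. The MQ and Visual Basic details are his story; the sketch below is just the same mistake, and the usual fix, rendered in Python for illustration.

```python
def read_message_slow(queue_bytes):
    """Anti-pattern: each concatenation copies the whole string so far, O(n^2)."""
    msg = ""
    for b in queue_bytes:
        msg = msg + b        # hidden deep copy of msg on every iteration
    return msg

def read_message_fast(queue_bytes):
    """Fix: accumulate the chunks and join once at the end, O(n)."""
    return "".join(queue_bytes)

chunks = ["<", "a", "u", "d", "i", "t", "/", ">"]
print(read_message_slow(chunks) == read_message_fast(chunks))  # True
```

Both return the same message, but on a large audit feed the first version's running time grows with the square of the message size, which is exactly the "audit process grinding to a halt" symptom.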
But what I'm getting at is we get very abstracted away from what the hardware is doing, because most of the time that's fine, right? That's what we want, except that our abstractions leak in weird ways. And so sometimes you need to be able to draw on "this is what's actually happening" to understand. So, as I say, in the case of CPUs, if you haven't been paying attention to CPUs for a while, you probably think Intel still has the edge. But right now, sorry, Intel, they don't. I hope that changes; competition is always good. But, you know, it's just a great example of something you probably don't even think about. You probably haven't thought about it for years. I know, honestly, I hadn't.

Anne Currie: Yeah.

Charles Humble: But then you start running these numbers and go, "gosh, that's, you know, like a 30% power saving." At any sort of scale, that's quite a big deal. And a lot of what I was trying to do in the book was really that. It was just saying, well, what are some of the easy things we can do that make a massive difference?

Anne Currie: It's interesting. What you're saying there reminds me a little bit of somebody who was a big name in tech back in our early days, who you'll remember very well. Joel Spolsky used to write about this, you know, "what would Joel do?" He used to do a lot of work on usability, studying usability. And he'd say, well, you're not looking to change the world and rewrite all these systems. You're often just looking for the truffles, the small changes that will have an outsized effect. And what you're saying is that, for example, moving from Intel to AMD is a small truffle that will have an outsized effect if you do it at the right time. It's not so much the case, as you say, with going to an ARM chip, you know, the Graviton servers that are being pushed very heavily by AWS at the moment.
Big improvement in energy use and reductions in cost, but that is not an instant flick of a switch and you go over. You know, there are services that are no longer available; you're gonna have to retest and recompile and do all those things. So it's not such an obvious truffle. But you are saying that the Intel to AMD switch might be a really easy win for you.

Charles Humble: Yeah, absolutely. Absolutely. It's funny you mention Joel Spolsky there, 'cause I read his User Interface Design for Programmers, I think the book is called, about 30 years ago, probably. Everything I know about user interfaces, I swear it comes from that book. It's also hysterically funny. It's very wittily written and has some wonderful examples of just terrible bits of user interface. Like the Windows 95 Start button, which is in the bottom left-hand corner, except that if you drag to the bottom left-hand corner of the screen, which is one of the easiest places on a screen to hit, you miss the Start button, because aesthetically it looked wrong without a border around it. And no one thought, well, maybe we should just make it so that if you miss, but you're there, it still works. You know, it's full of examples like that. It's very funny. And yeah, absolutely, this business of, as I say, we have as an industry been very profligate, right? We've been quite casual about our energy use and our hardware use. So there's another example, which is to do with infrastructure and rightsizing. Again, it's just one of those things that's such an easy, quick win for people, and it's another thing that connects to this business of our old assumptions.
So when I started in the industry, and probably when you started in the industry, and we ran everything in our own data centers, procurement was very slow, right? If I needed a new server, I probably had to fill in a form and 10 people had to sign it, and then it would go off to procurement and it would sit doing heaven knows what for a couple of months, and then eventually someone might get around to buying a server, and then they'd install the software on it, and then it would get racked. And, you know, like six months of my life could have gone by, right? And so what that meant was, if I was putting a new app in, at some point someone would come along to you and go, "we're putting this new app in. How many servers do you need?" And what you'd do is you'd run a bunch of load tests on, I dunno, LoadRunner or something like that. You'd work out the maximum possible, "concurrent" was a poor choice of word there, the maximum simultaneous number of users on your system, rather. You'd simulate that load, and that would tell you how many boxes you needed. So suppose that said four servers. You'd go to procurement and you'd go, "eight, please."

Anne Currie: Indeed.

Charles Humble: Right. And no one would ever say "why do you need eight?" That's just what we did. And what's weird is we still do it, right? Even though elastic compute on the cloud means surely we don't need to. We kind of have this mindset of, "well, I'll just add a bit more, just to be on the safe side, 'cause I'm not too confident about my numbers."

Anne Currie: There is a logic to it, because the thing that you fear is that you'll under-provision and it'll fall over. So there's a big risk to that. Over-provisioning, yes, it costs you more, but it's hard. It's really hard to get the provisioning perfect. So we over-provision, and then you always intend to come back later and rightsize.
And of course you never do, because you never get a chance to come back and do things later.

Charles Humble: Something I say a lot to the companies that I consult to is, "well, just run an audit."

Anne Currie: Yes, indeed. Yeah.

Charles Humble: Have a three-month process, or, you know, a three-month or six-month mission where we're gonna do a rightsizing exercise. We're gonna look for zombie machines, so those are machines that were once doing something useful but are doing nothing useful anymore, and also look for machines that are just sitting idle, and get rid of them. You actually have an amazing story in your O'Reilly book, the Building Green Software book, from Martin Lippert. He was tools lead and sustainability lead at VMware, now Broadcom, part of the old Spring team. He talks about how in 2019, I think it was, VMware consolidated a data center in Singapore. They were moving the data center, and basically they found that something like 66% of all the host machines were zombies. 66%. And that's not untypical.

Anne Currie: No, it's not.

Charles Humble: I've gone and done audits; 50% plus is quite normal. So I have this thing that I quite often say to people: I reckon you can halve the carbon emissions in your IT practice just by running an audit and getting rid of things you don't need. And it may even be more than that.

Anne Currie: Yeah, indeed. As VMware discovered. And people do it at a time when they move data centers. I often think this is probably a major reason why people go, "oh, you know, I repatriated, I moved away from the cloud, and I saved a whole load of money." You would've saved that money doing that kind of exercise in the cloud as well.
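A minimal sketch of the audit Charles describes, assuming you can export average CPU utilization per host from your monitoring system. The fleet data and the 2% threshold here are invented for illustration; a real audit would also check network and disk activity before condemning a box:

```python
# Flag hosts whose average CPU utilization over the observation window is
# below a threshold as zombie candidates worth investigating.
def zombie_candidates(hosts, threshold=2.0):
    """hosts: {hostname: avg_cpu_percent}. Returns sorted names to review."""
    return sorted(name for name, cpu in hosts.items() if cpu < threshold)

# Invented fleet data for illustration.
fleet = {
    "billing-prod-1": 41.0,
    "old-report-box": 0.3,   # nothing has touched this in months
    "stage-migration": 0.9,  # left over from a finished project
    "web-frontend-2": 27.5,
}
print(zombie_candidates(fleet))  # ['old-report-box', 'stage-migration']
```

The point is not the code, which is trivial, but that the data is usually already sitting in your monitoring system waiting for someone to ask the question.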
Probably more, because the trouble with the cloud is it's both things: it has amazing potential for efficiency, because it has great services that are written to be very efficient, and you wouldn't be able to write them that efficiently yourselves. So there's amazing potential: spot instances, burstable instance types, serverless, you know, there are loads of services that can really help you be efficient. But it's so easy to over-provision that inevitably everybody over-provisions massively, and especially if you lift and shift into the cloud, you massively over-provision.

Charles Humble: There's a related thing there as well, because it's so easy to spin something up and then just forget about it. Even on my own personal projects, I've suddenly got a bill from Google or something and been like, "oh, hello, what's that then?" And, you know, it's something that I spun up three months ago for an article I was writing or something, and I'd just totally forgotten about it, and it's been sitting there running ever since. And you can imagine how much worse that is in an enterprise; this is just me on my own doing it. And it's that kind of thing, I think. So think about things like autoscaling, you know, scaling up and remembering to scale back down again. People often scale up and don't scale down again. There's some of the Holly Cummins stuff around LightSwitchOps, this idea that, basically, you want to be able to spin your systems back up again really easily. That sort of stuff. Again, this is all stuff that's quite easy to do, relatively speaking.

Anne Currie: Relatively. So much easier than rewriting your systems in Rust or C, I can assure you of that.

Charles Humble: Well, a hundred percent, right? And, again, you know, I've made this joke a few times on stage and it's absolutely true.
We kind of, because we're programmers, automatically think, "oh, I'll go and look at a benchmark that tells me what the most efficient language is," and it will be C or C++ or something. And then, "we will rewrite everything in C or C++ or Rust." Well, that would be insane, and your company would go bust, and nobody is gonna sponsor you to do that, for very good reason. What you want to be doing is saying, "well, you know, what are the pragmatic things we can do that will make a huge difference?" And a lot of those things are, you know, rightsizing. It's a really good example.

Anne Currie: Yeah. Clearly this is something that you and I have discussed many times, and it was one of the reasons why, at the end of Building Green Software, we devised the Green Software Maturity Matrix that we donated to the Green Software Foundation.

Charles Humble: Yes.

Anne Currie: Because what we found over and over again when we went out and spoke to people at conferences is that they had a tendency to leap right to the end. You know, they say, "well, we couldn't rewrite everything in C or Rust or we'd go out of business, so we won't do anything at all." And they step over all the most important things, all the truffles, which are switching your CPU choice, switching your VM choice, doing a rightsizing audit, doing a basic audit of your systems and turning off stuff, doing a security audit. Because a lot of these zombie systems actually should be turned off in a security audit: if they're there and they're running and they're not being patched, and nobody owns them anymore, and nobody knows what they're doing anymore, they will get hacked. They are the ways into your system. So sometimes the way to pitch this is as a security audit.

Charles Humble: Absolutely. Yes, and I do use the Maturity Matrix quite a lot in this ebook.
Actually, it's one of the things that I reference all the way through it, for exactly this reason. Because, as I said, I think we tend to go to the end a lot, and actually a lot of the stuff is so much earlier on than that. And I think it's a really important thing to realize that there's a huge amount you can do, and actually, as well, it's gonna save you an awful lot of money. And given the kind of very uncertain business environment that we're in, and people are very worried about investing at the moment for all sorts of quite sensible reasons, this is one of those moments where, actually, if you're thinking "I want to get my business, or the IT within my company, onto a more sustainable footing," this is absolutely the right time to be having those conversations with your CFO, with your execs. Because, you know, this is the time when businesses need to be thinking, "well, how do I cut costs?" And there's a huge amount of waste. I guarantee you, if you've not looked at this, there will be a huge amount of waste in your IT that you can just get rid of, and you can be a bit of a hero and, you know, do good by the planet at the same time. It's like, what's not to like?

Anne Currie: Yeah, because, I mean, different companies, different enterprises, different entities have different roles in the energy transition. For most enterprises, your role is really to adopt modern DevOps practices. But you don't have to start there. You can start with, as you say, a manual audit. Sometimes I've heard it called a thriftathon, where you just go through and you go, "you know that machine? Turn it off." You know, you can use the scream test method of "you don't think anyone's using it, turn it off, find out if anybody was using it." And then you can use that to step yourself up to the next level. You and I both know Holly Cummins, who was a guest a couple of episodes back on this podcast.
And she introduced the idea of LightSwitchOps, which is the first kind of automation. If you haven't done any automation up till now and you want to learn how, a really good first bit of automation is the ability to turn machines off automatically, maybe for a period overnight, and you try that out on machines like your test suites, to get yourself into the simplest form of automation. It can also save you money, if you're on the right models and you're in the cloud, potentially, or you have the right infrastructure. It might not always save you money, because you have to have made the right infrastructure choices; it might just be that the machine sits on and doesn't really do anything and you've just turned off your application. But you really want to be turning things off to save power. You know, and it's a really good way of getting you into the DevOps mindset, which is where everybody needs to be, with so many payoffs.

Charles Humble: Yes.

Anne Currie: But yes, we'll go back to the questions. So, one of your talks is on writing greener software even when you're stuck on-prem, and you talk about the fact that not everybody has the option to move into the cloud. So what then? What do you do if you can't move into the cloud?

Charles Humble: Yeah, it is such an interesting question, that. So obviously there are things you can't do, or can't do very easily, and one of the most obvious of those is that you can't, on the whole, choose green locations if you're running stuff in your own data centers. So, again, going back to these easy wins: an easy win is to use something like Electricity Maps, which is a tool that basically tells you what the energy mix is in a given region. And then you say, "I shall run my workloads there, 'cause that looks good." There's a little bit more to it than that.
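At its simplest, the Electricity Maps idea reduces to picking the region with the lowest grid carbon intensity. A toy sketch, with invented gCO2eq/kWh figures standing in for real API data, and hypothetical region names:

```python
# Given a snapshot of grid carbon intensity per region (gCO2eq/kWh),
# choose the cleanest region to run a movable workload in.
def greenest_region(intensities):
    return min(intensities, key=intensities.get)

# Invented snapshot; real figures would come from a live data source.
snapshot = {
    "eu-north": 45,   # hydro/nuclear-heavy grid (hypothetical number)
    "eu-west": 210,
    "us-east": 390,
}
print(greenest_region(snapshot))  # eu-north
```

As Charles goes on to say, a real decision also weighs how the region's mix is trending, not just today's snapshot.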
You kind of want a location that not only has the greenest energy mix at the moment, but also has credible plans for that to keep improving.

Anne Currie: Yeah.

Charles Humble: Obviously that's really hard to do with your own data centers.

Anne Currie: Yeah.

Charles Humble: As a rule of thumb, you probably don't want to be building new data centers if you can help it, because pouring concrete is not great; there are a lot of costs associated with it. That said, you do have some advantages in your own data centers, 'cause you have some things that you can control that people on cloud can't. I would say, being honest about it, that if you can move things to public cloud, that's probably going to be better. But if you can't, there are still things you can do. So one of those things is that you have control over the lifetime of your hardware. This gets a little bit complex, but it's basically down to this: hardware has an embodied carbon cost. That's the cost that it takes to construct it, transport it, and dispose of it at the end of its useful lifetime, and then it also has the cost it takes to power it. Now, for your laptops, your mobile phones, your end-user devices, the embodied carbon absolutely dwarfs the carbon cost of charging the device over its lifetime.

Anne Currie: Yeah.

Charles Humble: So what we say about end-user devices is basically: extend the life. Keep it, say, 10 years or something like that. We want to make fewer of them, is really the point. Servers and TPUs and GPUs and those sorts of things, it's a bit more complicated. The reason is that we are getting an awful lot better at making more efficient servers, for all sorts of reasons. So what that means is the trade-off with each new generation is more complicated. As an example, a lot of the energy use in your data center is actually gonna be cooling. So a CPU or a TPU that runs less hot requires less cooling. That's a big win.
These sorts of things are sufficiently important that actually, until gen AI came along, so really three or four years ago, though we were adding massive amounts of compute, the emissions from our data centers were pretty flat. I mean, they were climbing, but not much. So the point here with your own data centers is that you have control over that lifetime. So what you can do, assuming you can get the embodied carbon costs from your suppliers, is do the calculations and think about, "well, how long do I keep this piece of hardware going before I turn it over?" Now, I don't want to give you a heuristic on that because it's kind of dangerous, but it's probably not 10 years, right? It's probably five years-ish, maybe, something like that. But run the maths. It's absolutely something you can do. You can also take advantage of things like your servers' power-saving modes, which you probably don't turn on, because we used to worry about that kind of thing. 'Cause, again, one of our old assumptions: we used to imagine that if you power a server down, it might not come back quite the same. Actually, that's kind of still true, but, you know, it's fixable, right? So enable power saving across your entire fleet. That will make a huge difference, particularly if you've over-provisioned like we were saying earlier, right? If 50% of your servers are idle, well, they can be asleep all the time, and that helps. It's not the same as turning them off, but it's helpful. You can also look at voltage ranges. So your hardware will have a supported voltage range, and you've probably never thought about it, and I'll admit I hadn't until quite recently. But again, if you're running at scale, if you set the lowest voltage that your servers will support, at a big scale that will make a considerable difference. And then again, some of the other things we talked about, like your CPU choice, will make a difference.
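The lifetime trade-off Charles describes can be sketched as a simple amortization: spread the embodied carbon over the years of service and add the operational carbon per year. Every figure below is invented for illustration; real numbers have to come from your suppliers and your meters:

```python
# Annual footprint = embodied carbon amortized over service life,
# plus operational carbon per year of running the machine.
def annual_footprint(embodied_kg, lifetime_years, operational_kg_per_year):
    return embodied_kg / lifetime_years + operational_kg_per_year

# Keeping an older, less efficient server for 8 years spreads its embodied
# cost thinly, but it burns more carbon per year in operation...
old = annual_footprint(embodied_kg=1300, lifetime_years=8,
                       operational_kg_per_year=900)

# ...while replacing at ~5 years with a more efficient model adds embodied
# carbon but may still win overall on the operational saving.
new = annual_footprint(embodied_kg=1500, lifetime_years=5,
                       operational_kg_per_year=550)

print(round(old), round(new))  # compare kgCO2e per year under these assumptions
```

With these made-up inputs the shorter refresh cycle comes out ahead, but flip the operational saving and the answer flips too, which is exactly why Charles says to run the maths rather than trust a heuristic.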
So think about, you know, "do I need to be buying Intel servers all the time, or could I be buying AMD ones or ARM ones?" And also look at your cooling, but that's a whole other complicated topic for all sorts of reasons. In brief, some of the most energy-efficient methods of cooling have their own set of problems, which make the trade-offs really hard. So, like, water-based cooling tends to be very efficient, but tends not to be great for local water tables.

Anne Currie: Yeah.

Charles Humble: It's complicated. But, yeah, as I say, there are a lot of things that are definitely harder. And if you're running in a hybrid environment and you have a choice of going public cloud or your own data center, public cloud is probably better. It's absolutely in Google and AWS and Microsoft's interests to run their data centers as efficiently as possible, 'cause that's where their cloud profit margin is, right?

Anne Currie: Absolutely.

Charles Humble: The less it's costing them to run it, while you're still paying the same amount, the more money they make.

Anne Currie: Well, I always laugh when I see the numbers on Graviton. So AWS attempt to persuade you, quite correctly, to move your applications from Intel chips onto ARM chips. They say, "oh, this will save 40% on your hosting bill and 60% on your carbon emissions." And you think, that suggests to me you've just pocketed quite a nice upgrade in your profitability. And I have no problem with that whatsoever. As things get better, I have no problem with people making profits out of it.
So I'm gonna pick you up on something. I think everything you've said there is very true, and I'm gonna take a slightly different take on it, which is: remember that what Charles is saying there is quite detailed stuff. Not everybody here will be a hardware person, but you will have specialists within your organization who can make all these hardware judgements. The interesting thing is that they can. And it is always the case that if you have specialists in your organization, the best way to do better is to persuade them that they want to do better. So if you can persuade your specialists to actually take an interest in this and to find ways of improving the efficiency of your systems and cutting the carbon emissions, they will do better at it than you will.

Charles Humble: 100%.

Anne Currie: The best thing you can do is persuade them to focus their giant specialist brains on the subject, because the likelihood is that the real issue is they probably aren't thinking about it, or it's not top of their mind, or they maybe think they're not even allowed to start thinking about it. If, at a high level, you can actually get your specialists to turn their attention to these efficiency issues, to these carbon reduction issues, that's so much more effective than you going and reading up on it yourself. Get them involved. Go out and talk to people. Use your powers of persuasion. Because what lots of people listening should take away from what Charles just said is that there is a lot of stuff that can be done by your specialist teams that they might not be thinking about doing, or they might feel they don't have the time or focus to do. You can potentially help them by focusing them, or giving them some budget or some time to work on it.

Charles Humble: Definitely. Absolutely. Yeah.
No, I'm a big believer in specialization in our industry, and I think this idea that we all must know everything is not helpful. Absolutely, if you've got hardware people, go and tell the hardware people. And it's a thing of incentivizing. It's like, you know, "we can save money by doing some of these things, or we can reduce our carbon by doing some of these things, and those are good things to do." Yeah, a hundred percent agree with all of that. No disagreements at all.

Anne Currie: Yeah. It's interesting, isn't it, that most of human progress has come from the realization that specialists kick the butt of generalists. And I'm a generalist, so, you know, I wish it wasn't true. My job is to encourage specialists to be specialists, and, you know, this is not new news. It's the theme of Adam Smith's The Wealth of Nations, which he wrote in the 1770s about why the Industrial Revolution was happening. It wasn't to do with any kind of technology or anything else. It was the discovery that specialists kick the butt of generalists.

Charles Humble: Hundred percent, yes.

Anne Currie: But now we're gonna get to the final tricky question that we have for you, Charles, which you'll have been thinking about. So, your work often emphasizes the importance of transparency, knowing the carbon footprint of what we build. What tools and practices do you recommend for people to do that?

Charles Humble: Oh, that is a hard question. Frustratingly hard, actually. So the first thing is, we often end up using proxies, and the reason we end up using proxies is 'cause measurement is genuinely quite difficult. So cost is quite a good proxy. In Bill Gates' book, I'm blanking on the name of the book, oh, How to Avoid a Climate Disaster,

Anne Currie: Oh yeah. Which is excellent. And again, everybody listening to this should be reading it. Yeah.

Charles Humble: Absolutely.
So in that book, he does a bunch of calculations, which he calls green premiums, and they're basically the cost of going green. Now, he doesn't do one for our industry, but because we are so profligate, I would wager, and I haven't worked this out, I will admit, that our green premium is probably a negative number. That's to say, going green is probably cheaper for us. Right.

Anne Currie: I agree.

Charles Humble: So cost is a very good proxy. It is an imperfect proxy. One of the reasons it's imperfect is that, for example, if you're running on a green energy mix, that's not going to be reflected in your electricity bill at the moment. That may change, but at the moment it doesn't happen. So it is imperfect, but,

Anne Currie: Well, it doesn't happen in some places, and in other places it does. So if you are on-prem and you're in a country with dynamic pricing, like Spain, or zonal pricing, like the UK is talking about having in future, though that's still very up in the air, then it does. But if you're in the cloud, even in those areas, it doesn't at the moment.

Charles Humble: Absolutely. But nevertheless, as I was saying, you know, probably half of your servers are doing nothing useful, so cost is a pretty good starting point. Another thing is CPU utilization. So there's something we haven't really talked about, which is this idea, Google calls it energy proportionality,

Anne Currie: Yeah.

Charles Humble: the observation that when you turn a server on, it has a static power draw, and that static power draw is quite a lot. How much depends on how efficient the server is, but it might be 50% or something like that. So when it's sitting idle, it's actually drawing a lot of power.
The upshot of this is that you'd usually have an optimum envelope for a given server, and that might be somewhere between 50 and about 80%. It may be a bit lower than that, depending on how good the chips are. Above about 80% you tend to get queue contention and those sorts of things going on, which is not great. But around and about that operating window. So, again, keeping your CPU utilization high, but not maxed out, is another good one. Hardware utilization is another good one. Beyond that, all of the cloud providers have tools of varying usefulness. Google's Carbon Footprint tool is probably best in class, at least in my experience. I think they take this stuff very seriously and they've done a lot of very good work. Microsoft Azure's tools are also pretty good. As for AWS's, they have just released an update, literally as we're recording this, and I haven't had a chance to go and look at what's in the updated version. I'm going to say I think AWS is still a long way behind their competitors in terms of reporting.

Anne Currie: Yeah.

Charles Humble: With the slight proviso that I haven't looked at what's in the new tool properly. But again, there are all these things you can use. There's a tool called Cloud Carbon Footprint, which is an open source thing by Thoughtworks, and that's quite good. It will work across different cloud providers, so that's kind of nice. You could probably adapt it for your own data centers, I would imagine. Of course, the GSF has a formula for calculating carbon intensity as well. That's more of a product carbon footprint or lifecycle assessment type approach. It's not really suitable for corporate-level accounting or reporting or that sort of thing, but it's quite a good tool as well. And there are a variety of other things you can use, but as I say, if we're talking about the very beginnings, you probably start with the proxies.
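The energy proportionality point can be sketched with a simple linear power model. The 400 W maximum draw and the 50% static fraction below are illustrative assumptions, not measured values, but they show why work done per watt improves as utilization rises toward that 50 to 80% envelope:

```python
# Linear model: an idle server still draws a large static fraction of its
# maximum power, and the rest scales with utilization.
def power_draw(utilization, max_watts=400, idle_fraction=0.5):
    """utilization in [0, 1]; idle_fraction is static draw as share of max."""
    static = max_watts * idle_fraction
    return static + (max_watts - static) * utilization

for u in (0.1, 0.5, 0.8):
    watts = power_draw(u)
    # watts / u approximates energy spent per unit of useful work
    print(f"{u:.0%} busy -> {watts:.0f} W total, {watts / u:.0f} W per unit of work")
```

At 10% utilization the server in this model burns 220 W to do a tenth of the work, while at 80% it burns 360 W to do eight times as much, which is the whole argument for consolidation.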
If you've got a choice of cloud provider, think about the cloud provider that gives you the tooling you need. And, you know, going back to our assumptions again: time was, you would choose AWS. Maybe you shouldn't be choosing AWS now, or at least maybe you should be thinking about whether AWS is the right choice, at least until they sort of put their house in order a bit more. These are questions that we can reasonably ask. And in general, if you are working with vendors, whether they're AI vendors or whatever, it is entirely reasonable to go and say, "well, I want to know what your carbon story looks like." And if they won't tell you, go somewhere else. In the case of AI, none of the AI companies will tell you. They absolutely won't. So my advice, if you're looking at running generative AI: everything we just said applies to AI, like it applies to everything else. There are a bunch of very specific AI-related techniques, distillation, quantization, pruning, those sorts of things. Fine. But really my advice is to use an open source model, and look at something like the ML.ENERGY Leaderboard, which will give you an idea of what the carbon cost looks like. And don't use AI from a company that won't tell you, would be my advice. You know, and maybe we can embarrass some of these companies into doing the right thing. You never know.

Anne Currie: That would be nice, wouldn't it? It's interesting. So in April, Eric Schmidt got up in front of the US government in one of their committees and said, well, you know, at current rates, AI is going to take up 99% of the grid electricity in the US. And you think, "it's interesting, isn't it," because that's not a law of nature. There are plenty of countries that are looking at more efficient AI. China are certainly looking at more efficient AI. They want to compete.
They wanna be able to run AI, because in the end, the business that's going to collapse if AI requires 99% of the US grid is AI. Because, you know, if something cannot go on, it will stop.

Charles Humble: It's a desperate source of frustration for me, because it is completely unnecessary.

Anne Currie: Well, you just have to be a bit efficient.

Charles Humble: Just in brief, 'cause again, this is probably a whole separate podcast,

Anne Currie: Absolutely.

Charles Humble: but just in brief, there are a bunch of things that you can do that make a huge difference, both when you're collecting your data, when you're training your models, and when you're running them in production afterwards. I have just done a piece of work for The New Stack on federated learning, and in the process of doing that, I talked to Professor Nick Lane, who is at Cambridge University. He talked about how one of the solutions to the data center cooling problem, which we touched on earlier, is basically what you do with the waste heat. And there are lots of companies in Europe that are looking at using it for things like heating homes or heating municipal swimming pools, that sort of thing, right? You can't do that with an Amazon or a Google or a Microsoft facility, because you have to construct the data center close to where the waste heat is gonna be used. But there are lots of these small data centers, particularly in Europe; there are companies like T.Loop that are doing a lot of this work. And he made the point that with federated learning, you can actually combine these smaller facilities together and then, you know, be training potentially very large models on much, much smaller data centers, which I thought was fascinating. There's a guy called Jae-Won Chung, Chung is his surname, and apologies to him if I'm mispronouncing it.
He's done some extraordinary work looking at, so when we split stuff across GPUs, that has to be synchronized, right? So we divide the workload up because it's too big to fit in a GPU, and we split it across a bunch of different GPUs, and we run all of those GPUs at full tilt, but we don't have to. Because we can't divide the workloads up evenly, you have some workloads that are tiny, but this GPU is still running at full power. And what he worked out was, well, if we slow those GPUs down, the job will still end at the same point, but it'll use a lot less energy. So he's built something called Perseus, and in his tests with things like BLOOM and GPT-3, it's about 30% less energy use just from using that, for exactly the same throughput. So there's no throughput loss, there's no hardware modification. The end results are exactly the same, and you just save 30% of your energy bill, which is a big deal. Then you've got, as I say, things like distillation and quantizing and pruning and shrinking your model size, all of that stuff. So it frustrates me because it's so unnecessary. I think we need a carbon tax, and I think the carbon tax needs to be prohibitive. And I think, you know, bluntly, I think companies like OpenAI should be pushed out of business if they don't get their house in order in time. I'd be thrilled. Hannah Ritchie's book, Not the End of the World, which is possibly my favorite book on climate. And again, it's a book, everyone, if you haven't read it, go and read it. She has a wonderful quote in there where she says, "I've talked to lots of economists, and all of the economists I've spoken to agree that we need some sort of carbon tax." And then she goes on to say, "it's maybe the only thing that economists agree on," which I thought was a fine and excellent line. Anne Currie: It is really interesting, 'cause we disagree slightly on this: you're not a huge AI fan, I'm a massive AI fan. I want AI and I also want a livable climate. And they are not mutually exclusive. 
They can be done. I mean, you don't love AI, you don't love AI as much as I love AI, but we are both in agreement that it is not physically impossible to have AI and effective control of climate change. Because, as you were saying about the federated learning and, you know, optimizing your GPUs towards the bottleneck tasks and things like that, workloads that are time-insensitive, that can be shifted in time and maybe delayed and maybe separated and then globbed together again, they're very good workloads to run on renewable power, which is variably available. So in fact, AI is potentially incredibly alignable with the energy transition. The fact that we don't always do it is a travesty, and it's so bad for AI as well as being bad for the planet. Charles Humble: I want to push back slightly on you saying I'm not a fan of AI. So I have quite strong concerns specifically about generative AI that are ethical and moral as well as environmental. Anne Currie: Which I can see. Charles Humble: And in essence it comes down to the fact you are taking a bunch of other people's work, and you are building a machine that plagiarizes that work, and you are not compensating those people for it. And you are also, basically you have to do tuning of the model, so reinforcement learning with human feedback, and the way that that's done is pretty horrifying when you dig into it. It usually involves, you know, people in places like Kenya being paid $3 an hour to look at the worst content of the internet for day after day. I mean, one can imagine what that does to you. So I have quite specific reservations with generative AI, the way that we are doing it. As it goes, I think there are ways that we could build generative AI that I wouldn't have these ethical problems with, that we're not doing. More generally, I think generative AI is interesting. I don't know that it's useful, but I do think it's interesting. And more broadly, I'm not against AI at all. 
I'm like, you know, I've done work with a company that, for example, is using AI to increase the window that you can treat stroke patients in, by like hours. And it's amazing. Amazing work. So they're basically doing image processing to identify different types of stroke. And for some stroke patients, the window is much wider. So, you know, we think of it as being 4.5 hours, but it's much bigger. Stuff like that. And, as you say, grid balancing is gonna get more complicated with renewables, and AI probably has a role to play there. And I'm not anti. I'm not anti, I just think that there are things that we are doing as an industry which are reckless and ill-judged, and, you know, in my tiny little way, I want, I mean, I'm aware that it's like, you know, blowing a kazoo in a thunderstorm, it's quite amusing, but it doesn't actually do much for anybody. But I, in my own little way, I want to be sort of beating the drum. As an industry, I think we need to get better. Right? And part of the reason I think we need to get better is because the work that we do has a huge impact on the whole planet now, and on society and all sorts of things. And we are still acting like we're a little cottage industry and what we do is inconsequential, but it's not true. So my reservation with gen AI is, I think it's being done in a desperately irresponsible way, but that doesn't mean it has to be. It just means that's what we're doing. And hey, I might be wrong. You know, I'm not an ethicist. I just have reservations. Also, I am a writer and a musician, right? So, you know, I do have skin in the game. I kind of want generative AI not to work, 'cause otherwise I don't really have a living anymore, which is a bit of a worry. So, you know, I'm not a neutral observer on this at all, but I just think the way we're doing this is morally, ethically dubious, as well as being very bad for the climate. 
And I don't think it has to be any of those things. Anne Currie: Yeah, so it's interesting, we have a slightly different view, 'cause I'm also a writer and a painter, but I've always been so rubbish at making money out of writing and painting that I don't really have anything to say. But that is my own fault. A little bit. Charles Humble: The last question, I'm looking at your script now, sorry, 'cause it's a shared Google doc, and your last question is about, so I write, in my free time, in a band called Twofish. And the question is, if you could score the soundtrack for a more sustainable future, what would it sound like? Anne Currie: I forgot about the question. Yeah. Charles Humble: Interesting. Had to get it in. So we did the opposite thing actually. So there's a piece on the last Twofish album called Floe, and that was my kind of, I started, everything is written by the two of us, but I started that one, and when I started it, what I was trying to do is describe what climate breakdown might sound like in music. That was kind of my starting point. Not sure anyone hearing it would get that, but what I did was I went and recorded a bunch of field recordings, so, you know, California wildfires and that sort of thing, tuned them all to A-flat minor, as you do, and then wrote this very dark, scary piece that gets a bit drum-and-bassy as it goes on. It's very black and industrial and dark and quite grim, and I rather like it. So I think we just have to go the opposite, right? We'd have to go to the other end of this. Anne Currie: So Twofish, what's the name of your last album? In fact, which album would you recommend? Charles Humble: It's called At Least a Hundred Fingers. That's the last album. And, yeah, Twofish is the band, as in the encryption algorithm, fellow nerds. 
So yeah, so with this one, the sustainable future one, I think some of my favorite composers, classical composers, would be late 19th, early 20th century, people like that. They were very inspired by the natural world, and they tended also to draw a lot on the folk tunes of the countries where they worked. So I think melodically my starting point might be to go to a folk tune, and then use very traditional instruments, so have maybe a string section, you know, sort of violins, violas, cello, so try and get some of that lift and air and that sort of thing into it. And then have the more electronic stuff, the stuff that I typically do, be very kind of intricate, interconnected, kind of supporting lines, so that you have something melodic that is folk, quite traditional instruments, and then this kind of sense of interconnectedness and sort of mechanisms working. Something like that. I might have a go at that actually. Perhaps there'll be a third Twofish album that has that on it. You never know. Yeah. If you want to look my stuff up, my website, my company, is Conissaunce, www.conissaunce.com. I'm Charles Humble on LinkedIn. I'm also... Anne Currie: There will be, we'll have links below in the show notes. Charles Humble: So yeah, you can find me on all of those. And you can find the music there as well. Anne Currie: Excellent. And I really recommend the albums. I like them a lot. They're great. Charles Humble: Thank you. Anne Currie: So thank you very much, and thank you to all the listeners today. As a reminder again, all the links that we've talked about today, and we have slightly overrun, will be in the show notes below. So, until the next time, thank you very much for listening, and happy building green software. Charles Humble: Thank you very much indeed for having me. It's been a pleasure. Thanks for listening and goodbye. Anne Currie: Goodbye. Chris Skipper: Hey everyone, thanks for listening. 
As a special treat, we're going to play you out with the piece that Charles was talking about, Floe by Twofish. If you want to listen to more podcasts by the Green Software Foundation, head to podcast.greensoftware.foundation. Bye for now. 
May 1, 2025 • 18min

Backstage: Green AI Committee

In this special backstage episode of Environment Variables, producer Chris Skipper spotlights the Green AI Committee, an initiative of the Green Software Foundation launched in 2024. Guests Thomas Lewis and Sanjay Podder share the committee’s mission to reduce AI's environmental impact through strategic focus on measurement, policy influence, and lifecycle optimization. The episode explores the committee’s approach to defining and implementing “green AI,” its contributions to public policy and ISO standards, and collaborative efforts to build tools, best practices, and educational resources that promote sustainable AI development.Learn more about our people:Chris Skipper: LinkedIn | WebsiteThomas Lewis: LinkedIn | WebsiteSanjay Podder: LinkedIn | WebsiteFind out more about the GSF:The Green Software Foundation Website Sign up to the Green Software Foundation NewsletterResources:Green AI Committee [00:00]Green AI Committee Manifesto [03:43]SCI for AI Workshop [05:28]Software Carbon Intensity (SCI) Specification [05:34] Green Software for Practitioners (LFC131) - Linux Foundation [13:54]Events:Carbon-Aware IT: The New Standard for Sustainable Tech Infrastructure (May 5 at 6:00 pm CEST · Virtual) [15:53]Inside CO2.js - Measuring the Emissions of The Web (May 6 at 6:30 pm CEST · Hybrid · Karlsruhe, BW) [16:11]Monitoring for Software Environmental Sustainability (May 6 at 6:30 pm CEST · Virtual) [16:45]Green IO New York (May 14 - 15 · New York) [17:02]If you enjoyed this episode then please either:Follow, rate, and review on Apple PodcastsFollow and rate on SpotifyWatch our videos on The Green Software Foundation YouTube Channel!Connect with us on Twitter, Github and LinkedIn!TRANSCRIPT BELOW:​Chris Skipper: Welcome to Environment Variables, where we bring you the latest news from the world of sustainable software development. 
I'm the producer of this podcast, Chris Skipper, and today we are thrilled to bring you another episode of Backstage, where we dive into the stories, challenges, and triumphs of the people shaping the future of green software. In this episode, we're turning the spotlight on the Green AI Committee, a pivotal initiative approved by the Green Software Foundation in March 2024. With the rapid rise of AI, this committee has been at the forefront of shaping how companies innovate sustainably while reducing AI's environmental impact. From driving policies and standards, to fostering collaborations and crafting new tools, the Green AI Committee is charting a path toward a more sustainable AI future. Joining us today are Thomas Lewis, the founder of the committee, along with co-chair Sanjay Podder. Together, they'll share insights on the committee's goals, their strategies for tackling AI's carbon footprint, and the critical role this initiative plays in ensuring AI development supports global net zero ambitions. And as always, everything we discuss today will be linked in the show notes below. So without further ado, let's dive into our conversation about the Green AI Committee. First, I'll let Thomas Lewis introduce himself. Thomas Lewis: Hi, I'm Thomas Lewis. I'm a green software developer advocate at Microsoft, and excited to be here. I also work in artificial intelligence, spatial computing, and I've recently been involved in becoming a book nerd again. Chris Skipper: My first question to Thomas was, what inspired the creation of the Green AI Committee, and how does it aim to shape the GSF's approach to ensuring AI innovation aligns with sustainability goals? Thomas Lewis: Yeah, so we noticed that we were getting a lot of inquiries. We were getting them from legislators and a lot of technologists. 
Everybody from, you know, people working at your typical enterprise to folks who were doing research at universities and learning institutions. And they were reaching out to try to get a better understanding of how the green software principles that we talk about, and those practices, applied to this growing impact of AI. It was not unusual to see on social media a lot of interest in this kind of intersection of green software or sustainability with artificial intelligence. And, you know, this kind of shaped the GSF's approach, because in a way we take a slow, methodical approach to thinking about the challenges of green AI, and we tend to bring in a lot of experts who have thought about this space from quite a few different viewpoints. And we don't just look at it in a binary way of good or bad. And I think a lot of times, especially online, it can be like, well, you know, AI is, you know, burning the planet down, and the resources needed to run these AIs are significant, which is not untrue. And that's the thing I appreciate with the GSF, is that, you know, we look at those elephants in the room. But while acknowledging those challenges, we also look at AI to help support sustainability efforts by, again, looking at it from those different vectors, and then thinking of a viewpoint and also backing it up with the appropriate tools, technologies, and education that may be needed. Chris Skipper: The committee's manifesto emphasizes focusing on reducing the environmental impact of AI. Could you elaborate on why this focus was chosen rather than areas like AI for sustainability or responsible AI? Thomas Lewis: That's a good question. We tend to look at things from a variety of vectors and don't necessarily limit ourselves if we think it is important to dig into these other areas. 
But one of the things I do like about the GSF is that typically when we start a committee or start a project, we always start with a workshop. And what we do is we ask a lot of experts to come to the, you know, virtual table, so to speak, and actually walk through it. So everyone gets a voice and gets to put out an opinion and to brainstorm and think about these things. And these workshops are over multiple days. So, typically the first day is kind of like just getting everything on the board. And then the, you know, second time that we get together is really about how to kind of say, "okay, how do we prioritize these? What do we think are the most important? What should we start on first? And then what are the things that, you know, we put on the backlog?" And then the third, you know, one is typically where we're really getting sort of precise about "here's where our focus is going to be." So the conversation is always very broad in the beginning, right? Because you have all of these people coming to the table to say what's important. But as we kind of go through that, so, after a lot of that discussion, we decide on a prioritized focus. But of course we'll come back to others as we iterate, because there are gonna be opportunities where, hey, maybe it is more important that we focus on a certain thing. So, for example, for the GSF, it is about building out the SCI for AI. So, if you're familiar with our Software Carbon Intensity spec, that now is a standard, that is one of the kind of projects that came out of that workshop and that thinking, because, you know, the first thing you kind of have to do if you wanna make a change in what you do is you have to measure it, right? You have to measure what your carbon intensity is, whether it's AI or gaming or blockchain or what have you. And so I think by having this process of doing these workshops, that's really what gets us to our priority. 
So I don't think that there's always sort of a kind of a crisp thing of, like, why we did this or didn't do that, or why we prioritized it a certain way. It's really that kind of collective coming together, which I think is what really makes the foundation very powerful, because everyone has a voice in it. Chris Skipper: The committee recently responded to a bill drafted by US Senators to investigate AI's environmental impact. How do you see the role of the Green AI Committee in shaping public policy and regulations? Thomas Lewis: I've always seen the Green AI Committee's role in this as a trusted advisor, backed up with technical credibility and intellectual honesty. Our intent is not to rubber stamp legislation or just be another endorsement on a bill, but to review bills and papers that come to us with experts in this field, and to call out things that we think are important to sustainability, or also question things. What I really have appreciated is that with what comes to us, there has never been an intention for us just to say, "this is good" and give the check mark. But it really has been like, "hey, we want your feedback. We wanna understand how we can make these things better for our constituents." And the other thing is that the committee also works very closely with our own policy group within the GSF, because many of the members, including myself, don't work with legislators and politicians normally. And so there's a vernacular to the things that they talk about and how they approach things. And so our policy group is also very helpful in this. So, you know, our committees aren't based on, "hey, everything related to AI will come through this committee." We have a lot of different groups, and those groups may be like the policy group, it may be the open source projects that are within the GSF, and some of our education opportunities that are there. But yeah, I would say from my perspective the role is mostly as a trusted advisor. 
And I think that if that is how people reflect on the relationship regarding policy and advocacy, I would think that we are doing a good thing. Chris Skipper: From the initial stages of founding the Green AI Committee to where it stands now, what have been the most valuable lessons learned that could guide other organizations aiming to promote sustainability in AI? Thomas Lewis: I would say, first, take a thoughtful approach in how you wanna approach things. Not only is green software a significant amount of tech, people, and communities, but AI builds on top of that and has its own things, and the innovation is happening way faster than most people can keep up with. And so you've gotta take the time to figure out what you wanna focus on first. You can't say you're just gonna try to cover every angle and every thing. Second, I would say take a less dogmatic approach to your efforts. It's easy to say "things should be this way," right? Or, "hey, we're gonna do something 100%, or it's considered a failure." This space is rapidly changing. This environment especially. So what you have to do is kind of take the time to get a wide variety of insights and motivations, and then methodically figure out what a hopefully optimal approach is going to look like. And then the third, which, you know, may not be just related to green software and AI, but surround yourself with people who are smarter and more knowledgeable than yourself. One of the things that I absolutely love about being on this committee is there are just super smart people that I get to work with, like the people that are on this podcast. And I learn so much because we all have different contexts, we have different viewpoints, and we have various experiences, right? So we've got, you know, folks who are in big companies and people who are in small companies and people who are just starting their sustainability journey. There's people who have been doing this for a long time. We have students, we have researchers. 
There's all kinds of people. So the more that you can kind of understand where a lot of people are coming from, and again, what their context is, you're gonna find that you're gonna really be able to do a whole lot more than you have been able to before. And you may get ideas from places that you think you didn't before. And again, this isn't just with the Green AI Committee, I think this is in life, you know, and again, if you surround yourself with people who are smarter and more knowledgeable than yourself, I always think that you're going to be in a better place, and you'll end up being a better person for it. Chris Skipper: Thanks to Thomas for sharing those insights with us. Next up we have Sanjay Podder. Sanjay is not only co-chair of the Green AI Committee, but also host of our other podcast here at the Green Software Foundation, CXO Bytes. My first question to Sanjay was, how does the Green AI Committee contribute to reducing AI's carbon footprint? And can you share specific strategies or tools the committee is exploring to achieve these goals? Sanjay Podder: The Green AI Committee brings together experts from across the industry to shape what it truly means to build AI sustainably. Our goal is not only to define green AI, but to make it practical and actionable for developers, data scientists, and technology leaders alike. We started by creating a simple, developer-friendly definition of green AI. One that anyone in the ecosystem can understand and apply. But we did not stop there. We have taken a lifecycle approach, breaking down the environmental impact of AI at every stage, from data processing and model training to deployment and inference. This helps pinpoint where emissions are highest and where optimization efforts can have the biggest impact. We are also actively working on strategies and tools to support these goals. 
By embedding best practices across the AI lifecycle, we are driving a shift towards AI systems that are not just powerful, but also responsible and sustainable. Chris Skipper: The manifesto highlights the importance of partnerships with nonprofits, governments, and regulators. Could you share some examples of how collaborations have advanced the Green AI Committee's mission? Sanjay Podder: The committee understands that tackling AI's environmental impact demands broad collaboration with various stakeholders to create comprehensive standards. These standards will focus on transparency, software and hardware efficiency, and environmental accountability. Engaging a wide range of AI and ICT organizations will help build consensus and ensure that sustainability is a core design principle from the start. Chris Skipper: The committee is tasked with supporting projects like the development of an ISO standard for measuring AI's environmental impact. What milestones have been achieved in this area so far, and what are the next steps? Sanjay Podder: Despite rapid advancement in AI, practitioners and users currently lack clear guidance and knowledge on how to measure, reduce, and report AI impacts. This absence limits public awareness and hinders efforts to address AI's environmental footprint, making it more challenging to develop AI sustainably. To address these challenges, the committee is actively pursuing initiatives to provide practitioners and users with the necessary knowledge and tools to minimize AI's environmental footprint. The goal is to increase awareness of green AI principles and promote sustainable AI development practices. For example, a Green AI Practitioners course, to increase awareness of green AI and understanding of the implications of AI development on the environment. It'll explain the fundamental principles of green AI development and solutions, and provide practical, actionable recommendations for practitioners, including guidelines for measurement. 
And Software Carbon Intensity for AI, to address the challenges of measuring AI carbon emissions across the AI lifecycle, support more informed decision making, and promote accountability in AI development. Chris Skipper: And finally, what are some of the long-term goals for the Green AI Committee, and how do you see these objectives evolving with advancements in AI technology? Sanjay Podder: Our goals are evolving to reduce the ecological footprint of AI systems. Green AI isn't just a standalone solution. It's a core component of a broader sustainability ecosystem. As we advance in this mission, we urge more organizations to join the conversation and help build a more sustainable future for AI. Developing and regularly updating standardized methodologies to measure AI's environmental impact will be essential for driving sustainable and scalable AI development. Chris Skipper: Thanks to Sanjay for those insights. Next up, we have some events coming up in the next few weeks that we'd like to announce. First up, a virtual event from our friends at Electricity Maps, Carbon-Aware IT: The New Standard for Sustainable Tech Infrastructure, on May the fifth at 6:00 PM CEST. Explore how organizations optimize IT infrastructure to meet their net zero goals. Then, for those of you in Germany, there is a hybrid event in Karlsruhe run by Green Software Development Karlsruhe, called Inside CO2.js - Measuring the Emissions of the Web, happening on May the sixth at 6:30 PM CEST. This is a hybrid event, so there will be an online element. Learn how to make emissions estimates and use CO2.js, a JavaScript library from regular Environment Variables host Chris Adams and the Green Web Foundation. Then we have another event that is purely virtual, happening on May the sixth at 6:30 PM CEST, called Monitoring for Software Environmental Sustainability. Learn how to incorporate software sustainability metrics into your monitoring system. 
And finally, in New York, the Green IO and Apidays conference, Green IO New York, happening from May the 14th until May the 15th. Get the latest insights from thought leaders in tech sustainability and actionable hands-on feedback from practitioners scaling green IT. So we've reached the end of this special backstage episode on the Green AI Committee project at the GSF. Thanks to both Thomas and Sanjay for their contributions. I hope you enjoyed the podcast. To listen to more podcasts about green software, please visit podcast.greensoftware.foundation, and we'll see you on the next episode. Bye for now. 
Apr 24, 2025 • 35min

The Economics of AI

Chris Adams sits down in person with Max Schulze, founder of the Sustainable Digital Infrastructure Alliance (SDIA), to explore the economics of AI, digital infrastructure, and green software. They unpack the EU's Energy Efficiency Directive and its implications for data centers, the importance of measuring and reporting digital resource use, and why current conversations around AI and cloud infrastructure often miss the mark without reliable data. Max also introduces the concept of "digital resources" as a clearer way to understand and allocate environmental impact in cloud computing. The conversation highlights the need for public, transparent reporting to drive better policy and purchasing decisions in digital sustainability. Learn more about our people:Chris Adams: LinkedIn | GitHub | WebsiteMax Schulze: LinkedIn | WebsiteFind out more about the GSF:The Green Software Foundation Website Sign up to the Green Software Foundation NewsletterResources:Energy Efficiency Directive [02:02]German Datacenter Association [13:47] Real Time Cloud | Green Software Foundation [22:10]Sustainable Digital Infrastructure Alliance [33:04]Shaping a Responsible Digital Future | Leitmotiv [33:12]If you enjoyed this episode then please either:Follow, rate, and review on Apple PodcastsFollow and rate on SpotifyWatch our videos on The Green Software Foundation YouTube Channel!Connect with us on Twitter, Github and LinkedIn!TRANSCRIPT BELOW:Max Schulze: The measurement piece is key. Having transparency and understanding always helps. What gets measured gets fixed. It's very simple, but the step that comes after that, I think we're currently jumping the gun on that, because we haven't measured a lot of stuff. Chris Adams: Hello and welcome to Environment Variables, brought to you by the Green Software Foundation. In each episode, we discuss the latest news and events surrounding green software. On our show, you can expect candid conversations with top experts in their field who have a passion for how to reduce the greenhouse gas emissions of software. I'm your host, Chris Adams. Hello and welcome to another edition of Environment Variables, where we bring you the latest news and updates from the world of sustainable software development. I'm your host, Chris Adams. We're doing something a bit different today, because a friend and frequent guest of the pod, Max Schulze, is actually turning up to Berlin in person, where I'm recording today. So I figured it'd be nice to catch up with Max, see what he's up to, and, yeah, just catch up really. So Max, we've been on this podcast a few times together, but not everyone has listened to every single word we've ever shared. So maybe if I give you some space to introduce yourself, I'll do it myself, and then we'll move on from there. Max Schulze: Okay. Sounds good. Chris Adams: All right then, Max. So, what brings you here? Can you introduce yourself today? Max Schulze: Yeah. I think the first question, why am I in Berlin? I think there's a lot going on in Europe in terms of policies around tech. In the EU, there's the Cloud and AI Development Act. There's a lot of questions now about datacenters, and I think you and I can both be very grateful for the invention of AI, because everything we ever talked about, now everybody's talking about 10x, which is quite nice. Like, everybody's thinking about it now. Yep. My general introduction: my name is Max. For everybody who doesn't know me, I'm the founder of the SDIA, the Sustainable Digital Infrastructure Alliance. And in the past we've done a lot of research on software, on datacenters, on energy use, on efficiency, on philosophical questions around sustainability. I think the outcome that we generated that was probably the most well known is the Energy Efficiency Directive, which is forcing datacenters in Europe to be more transparent now. Unfortunately, the data will not be public, which is a loss. 
But at least a lot of digital infrastructure now needs to, yeah, be more transparent on their resource use. And the other thing that I think we got quite well known for is our explanation model: the way we think about the connection between infrastructure, digital resources, which is a term that we came up with, and how that all interrelates to software. Because there's this conception that we are building datacenters for the sake of datacenters. But we are, of course, building them in response to software, and software needs resources. And these resources need to be made somewhere. Chris Adams: Ah, I see. Max Schulze: And that's, I think, what we were well known for. Chris Adams: Okay. Those two things I might jump into a little bit later on in a bit more detail. So, if you're new to this podcast, my name is Chris Adams. I am the policy chair in the Green Software Foundation's Policy Working Group, and I'm also the director of technology and policy in the confusingly but similarly named Green Web Foundation. Alright. Max, you spoke about two things that, if I can, I'd like to dive into in a little bit more detail. So, first of all, you spoke about this law called the Energy Efficiency Directive, which, as I understand it, is essentially intended to compel every datacenter above a certain size to start recording information, and in many ways it's sustainability-adjacent information, with the idea being that it should be published eventually. Could we just talk a little bit about that first, and maybe some of your role there, and then we'll talk a little bit about the digital resource thing that you mentioned. Max Schulze: Yeah. I think on the Energy Efficiency Directive, even one step up, 
And with datacenters, what they realized is, actually, we can't set those thresholds, because we don't reliably know how many resources you have consumed. So we can't say "this should be the limit." Therefore, the first step was to say, well, first of all, everybody needs to report into a register. And what's interesting about that is it's not just the number that everybody in datacenter land likes to talk about, which is PUE, power usage effectiveness, so how much overhead I generate with cooling and other things on top of the IT, but also, for the first time, it has water in there. It has IT utilization ranges in there. It even has, which I think is very funny, the amount of traffic that goes in and out of a datacenter, which is a bit like, I don't know what we're trying to measure with this, but, you know, sometimes you gotta leave the funny things in there to humor everybody. And it goes really far in terms of metrics, really trying to see what resources go into a datacenter, how efficiently they are being used, and to a certain degree also what comes out of it. Maybe traffic. Yeah. Chris Adams: Ah, I see. Okay. Alright, so it's essentially trying to bring the datacenter industry in line with some other sectors where they already have this notion of, okay, we know they should be this efficient; we've had a lack of information in the datacenter industry, which made it difficult to do that. Now, I'm speaking to you in Berlin, and I don't normally sound like I'm in Berlin, but I am in Berlin, and you definitely sound like you are from Germany, even though you're not necessarily living in Germany. Max Schulze: I'm German. Chris Adams: Oh yeah. 
Maybe it might be worth just briefly touching on how this law manifests in various countries, because, and I know this might be a bit inside baseball, I've learned from you that Germany was one of the countries that was really pushing quite hard for this energy efficiency law in the first place, and they were one of the first countries to actually write it into their own national law. Maybe we could touch a little bit on that before we start talking about the world of digital resources and things like that. Max Schulze: Yeah, and even funnier, you always know in Europe that a certain country is really interested in something when they actually implement it before the directive is even finalized. For everybody who doesn't know European policies: the EU makes directives, and then every country actually has to transpose it, as it's called, into national law. So just because the EU, it's a very confusing thing, makes something, doesn't mean it's law. It just means that the countries should now implement it, but they don't have to, and they can still change it. So what Germany, for example, did: in the directive it's not mandatory to have heat recovery, so reusing the waste heat that comes out of the datacenter. The EU also did not set thresholds there. But of course Germany was like, "no, we have to be harsher than this." So they actually said, datacenters above a certain size need to be powered by renewable energy, and heat recovery is mandatory above a certain size. And of course the industry is not pleased. So I think we will see a revision of this, but it was a very ambitious, very strong "let's manage how they build these things." Chris Adams: I see. Okay. There is, I think, a German phrase: trust is nice, control is better. Yes. Well, something like that. Yeah. Yeah. Okay. All right. 
So if I just put my programmer hat on: when I think of a directive, it's a little bit like maybe an abstract class, right? Yes. And then if I'm Germany, I'm making it concrete; I've implemented that class in my German law, basically. Yes. Max Schulze: Interfaces and implementations. Okay. Chris Adams: Alright. You've explained it into nerd for me. That makes a bit more sense. Thank you for that. Alright, so that's the EED. You essentially were there to, to use another German phrase, watch the sausage get made. Yeah. So you've seen how that's turned out, and now we have a law in Germany where essentially you've got datacenters regulated in a meaningful way for the first time, for example. Yeah. And we're dealing with all the fallout from that, for example. And we also spoke a little bit about this idea of digital resources. This is one other thing that you've spent quite a lot of intellectual effort and time on, helping people develop some of this language, which we've used ourselves in some of our own reports when we talk to policymakers or people who don't build datacenters themselves. 'Cause a lot of the time people don't necessarily know how a datacenter relates to software and how that relates to maybe them using a smartphone. Maybe you could talk a little bit about what a digital resource is in this context and why it's even useful to have this language. Max Schulze: Yeah, and let me try to also connect it to the conversation about the EED. I think when, as a developer, you hear transparency and, okay, they have to report data, what you're thinking is, "oh, they're gonna have an API where I can pull this information, also, let's say, from the inside of the datacenter." Now in Germany, and this is also funny for everybody listening, one way to fulfill that, because the law was not specific: datacenters are now hanging a piece of paper, I'm not kidding, on their fence with this information, right? 
So this is them reporting this. And of course, I'm also a software engineer, so what we as technical people need is for the datacenter to have an API that basically assigns the environmental impact of the entire datacenter to something. And that something has always bothered me, that we say, oh, it's the server, or it's the, I don't know, the rack or the cluster. But ultimately, what does software consume? Software consumes basically three things. We call it compute, network, and storage, but in more philosophical terms, it's the ability to store, process and transfer data. And that is the resource that software consumes. Software does not consume a datacenter or a server. It consumes these three things. And a server makes those things; it turns energy and a lot of raw materials into digital resources. Then the datacenter in turn provides the shell in which the server can do that function, right? The factory building is the datacenter. The machine that makes the t-shirts is the server. And the t-shirt is what people wear. Right? Chris Adams: Ah, I see. Okay. So that actually helps when I think about, say, cloud computing. Like, when I'm purchasing cloud computing, right, I'm paying for compute. I'm not really that bothered about whether it's an Intel server or something like that. And to a degree, a lot of that is abstracted away from me anyway, and there's good sides and downsides to that. But essentially that seems to be this idea of cloud: you buy compute, and there are, maybe for want of a better term, primitives you build services with. That's essentially some of the language you've been repurposing for people who aren't cloud engineers to understand how modern software gets built these days. Right. Max Schulze: And I think that's also the real innovation of cloud, right? You gotta give them credit for that. They disaggregated these things. 
When AWS was first launched, it was S3 for storage, EC2 for compute, and VPC for networks, right? So they basically said, whatever you need, we will give it to you at scale, in infinite pools of however much you need and want, and you pay for it only by the hour. Before, you had to rent a server, and the server always came with everything. It came with network, it came with storage, and you had to do the disaggregation yourself. But as a developer, fundamentally, sometimes you just want compute. Now we have LLMs. I definitely just want compute. Then you realize, oh, I also need a lot of storage to train an LLM. Then you want some more storage. And then you're like, okay, well, I need a massive network inside that. And you can buy each of these pieces by itself because of cloud. That is really what it is about. Chris Adams: Oh, I see. Okay. And this is why it can be a bit difficult when you're trying to work out the environmental footprint of something, because if we are trying to measure, say, a server, but the resources are actually cloud, and there's all these different ways you can provide that cloud, then obviously it's gonna be complicated when you try to measure this stuff. Max Schulze: Yeah. Think about a gigabyte of storage on S3. There may be hundreds of servers behind it providing redundancy, providing the control layer, doing monitoring, right? In a way, that gigabyte of storage is not a disc inside a server somewhere. It is a system that enables that gigabyte. And thinking about it that way, trying to say the gigabyte needs to come from somewhere, is the much more interesting conversation than going from the server up. Ah. It's misleading otherwise. Chris Adams: Alright. Okay. So I'm gonna try and use an analogy from, say, the energy sector, just to help me understand this, because I think there's quite a few key ideas inside this. 
So in the same way that I'm buying maybe units of electricity, like kilowatt hours, I'm not really buying an entire power station or even a small generator when I'm paying for something. There's all these different ways it can be provided, but really what I care about is the resources. And this is the kind of key thing that you've been explaining to policymakers or people who are trying to understand how they should be thinking about datacenters and what they're good for and what they're bad for, right? Yes. Okay. Alright, cool. So you are in Berlin and it's surprisingly sunny today, which is really nice. We've made it through the kind of depressing German winter, and we've actually crossed paths quite a few times in the last few weeks, because you've been bouncing between where you live in Haarlem, Netherlands, and Brussels and Berlin quite a lot. And I like trains, and I imagine you like trains, but that's not the only reason you are zipping around here. Are there any projects related to digital sustainability that have been taking up your time, that you're allowed to talk about these days? Max Schulze: Yeah, there's a lot. There's too many, actually, which is a bit overwhelming. We are doing a lot of work still on software, also related to AI, and I don't think it's so interesting to go into that. I think everybody listening to this podcast knows that there's an environmental impact. We now have a lot of tools to measure it, so my work is really focused on how do I get policymakers to act. And one project that just recently came out, and now that the elections are over in Germany we can also talk about it: we basically wrote a 200 page monster, call it the German datacenter, not a strategy yet, it's an assessment. And there's a lot of, like, how much power are they gonna use? That's not from us. But what we were able to do for the first time is really explain the layers. 
So there's a lot of misconception, say, that building a datacenter creates jobs. But I think everybody in software knows that, and I think actually all of you should be more offended when datacenters claim that they are creating jobs, because it is always the software that runs there that is actually creating the benefit, right? A datacenter building is just an empty building. And what we've been able to explain is to really say, okay, I build a datacenter, then there is somebody bringing servers, running IT infrastructure, maybe a hoster. That hoster in turn provides services to, let's say, an agency. That agency creates a website. And that's a really complex system of actors that each add value. And what we've shown is that a datacenter, per megawatt, depending on who's building it, can be three to six jobs. And a megawatt is already a very large datacenter; it can be 10,000 servers. If you compare that to the people on top, like if you go to that agency, that can go up to 300 to 600 jobs per megawatt. And the value creation is really in the software and not anywhere else. And we believe that the German government, and all sorts of regions, and this applies to any region around the world, should really think, "okay, I will build this datacenter, but how do I create that ecosystem around it?" You know, Amsterdam is always a good example. You have Adyen, you have booking.com, you have really big tech companies, and you're like, "I'm sure they're using a Dutch datacenter." Of course not. They're running on AWS in Ireland. So you don't get the ecosystem benefit. Your policymakers think they do, but you don't connect the dots, so to say. Chris Adams: Ah, okay. So if I understand this: the federal German government, I think it's the third or fourth largest economy in the world. Yes. They need to figure out what to do with the fact that there's lots and lots of demand for digital infrastructure. 
They're not quite sure what to do with it, and they also know they have binding climate goals. So they're trying to work out how to square that circle. And most countries right now do wanna have some notion of being able to economically grow. So they're trying to understand, okay, what role do these play? And a lot of the time there has been a bit of a misunderstanding between what the datacenter provides and where the jobs actually come from. And so you've essentially done, for the first time, some really quite rigorous and open research into, "okay, how are jobs and economic opportunity created when you do this? And what happens if you have the datacenter in one place, but the jobs, the agencies or the startups, in another place?" Because there seems to be this idea that if you just have a datacenter, you automatically get all the startups and all the jobs and everything in the same place. And that sounds like it might not always be the case without deliberate decisions, right? Max Schulze: Yes. Without really designing it that way. And it becomes even more obvious when you look at hyperscale and cloud providers, where you see these massive companies with massive profits, and let's say they go to a region, they come to Berlin, and they tell Berlin, you know, Amazon in Spain actually also sent a really big press release, like, "we're gonna add 3% to your GDP. We're going to create millions of jobs." And of course every software engineer knows that just building a datacenter for a cloud provider does not do that. And what they're also trying to distract from, which we've shown in the report by going through their financial records, is that they pay property tax, so a local tax, which in Germany is very low. But they of course don't pay any corporate income tax in these regions. 
So the region thinks, "oh, I'm gonna get 10% of the revenue that a company like Microsoft makes." That's not true. And in return, the company asks for energy infrastructure, which is a socialized cost, meaning taxpayers pay for it. They ask for land, which is not always available, or scarce. And then they don't really give much back. And that's really, I'm not saying we shouldn't build datacenters, you know, but you have to be really mindful that you need the job creation. The tax creation is something that comes from above this, on top of the datacenter stack. Yeah. And you need to be deliberate in bringing that all together; everything else is just an illusion in that sense. Chris Adams: Oh, I see. Okay. So this helps me understand why you place so much emphasis on helping people understand this whole stack of resources being created and where some of the value might actually be. 'Cause it's a little bit like, let's imagine you're looking at, say, generating power, and you're opening a power station. Creating a power station by itself isn't necessarily the thing that generates the wealth; it's maybe people being able to use it in some of the higher services further up the stack, as it were. Correct. And that's the kind of framing you're helping people understand, so they can have a more sophisticated way of thinking about the role that datacenters play when they advance their economies, for example. Max Schulze: I love that you're using the energy analogy, because everybody who's hearing this on the podcast will probably be like, "oh yeah, that's obvious, right?" But for digital, to a lot of people it's not so obvious. They think that the power station is the thing, but actually it's the chemical industry next to it; that's where the value is created. Chris Adams: I see. Okay. Alright. That's actually quite helpful. 
So one of the pieces of work you did was actually providing new ways to think about how digital infrastructure ends up being useful for, say, a country. But one thing you spoke about in some of this report was the role that software can play in blunting some of the expected growth in demand for electricity and things like that. And obviously that's gonna have climate implications, for example. Can we talk a little bit about how designing software in a more thoughtful way can blunt some of this expected growth, so we can actually hit some of the goals we have? 'Cause this is something that I know you spend a fair amount of time thinking about and writing about as well. Max Schulze: Yeah, I think it's really difficult. The measurement piece is key, but having transparency and understanding always helps. What gets measured gets fixed. It's very simple. But the step that comes after that, I think we're currently jumping the gun on, because we haven't measured a lot of stuff. We don't have a public database of, say, this SAP system, this Zoom call, is using this much. We have very little data to work with, and we're immediately jumping to solutions like, oh, what if we shift the workloads. But if we're, for example, workload shifting on cloud, unless a server is turned off, the impact is zero. Okay, zero is extreme, but it's very limited, because the cloud provider then has an incentive to fill it with some other workload. We've talked about this before. If everybody sells oil stocks because they're protesting against oil companies, it just means somebody else is gonna buy the oil stock, you know? And it ultimately brings the spot prices down. But that's a different conversation. So I think, let's not jump to that. Let's first get measurement really right. 
And then it raises to me the question: what's the incentive for big software vendors, or companies using software, to actually measure and then also publish the results? Because, let's be honest, without public data we can't do scientific research, and even communities like the Green Software Foundation will have a hard time, you know, making reports or giving good analysis if we don't have publicly available data on certain software applications. Chris Adams: I see. Okay. This does actually ring some bells, 'cause I remember when I was involved in some of the early work on software carbon intensity scores, we found that it's actually very difficult to just get the energy numbers from a lot of services, simply because, a lot of the time, if you're a company, you might not want to share this; you might consider it commercially sensitive information. There's a whole separate project called the Real Time Cloud project within the Green Software Foundation, where there's been some progress putting out, say, region by region figures for the carbon intensity of different places you might run cloud workloads, for example, and this is actually a step forward. But at best we're finding that we could maybe get the figures for the carbon intensity of the energy that's there; we don't actually have access to how much power is being used by a particular instance, for example. We're still struggling with this stuff, and this is one thing we keep bumping up against. So I can see where you're coming from there. So, alright, this is one thing you've been spending a bit of time thinking through. Where do we go from here, then? Max Schulze: Yeah, I think first we need to give ourselves a clap on the back, because if you look at the amount of tools that can now do measurement, commercial tools, open source tools, I think it's amazing, right? It's all there. 
Dashboards, Prometheus things, reporting interfaces, you know, it's all there. Now, the next step, and as software people we like to skip this step because we think, well, everybody's now gonna do it. Well, that's not the reality. Now it's about incentives. And for example, one organization we work with is called Seafit, and it's a conglomerate of government purchasers, IT purchasers, who say, "okay, we want to purchase sustainable software." And to me it's very difficult to say, and I think you have the same experience, here are the 400 things you should put in your contracts to make the software more sustainable. Instead, what we recommend is to simply say, well, please send me an annual report of all the environmental impacts created from my usage of your software, and, a very important phrase we always put in at the end, please also publish it. And I think, right now, that's what we need to focus on. We need to focus on creating that incentive, so that somebody who's buying even, like, Google Workspace or Notion really says, "hey, by the way, before I buy this, I want to see the report," right? I want to see the report for my workplace. And even for all the people listening to this: any service you use, like any API you use commercially, just send them an email and say, "hey, I'm buying your product. I'm paying 50 euros a month, or 500 or 5,000 euros a month. Can I please get that report? Would you mind?" Yeah. And that creates a whole chain reaction of everybody in the company thinking, "oh my God, all our customers are asking for this. We need this. One of our largest accounts wants this figured out." And then they go to the Green Software Foundation, or to all the open source tools. They learn about it, they implement measurement. Then they realize, "oh, our cloud providers are not giving us data." So then they're sending a letter to all the cloud providers saying, "guys, can you please provide us those numbers?" 
Chris Adams: Yeah. Yes. Max Schulze: And this is the chain reaction that requires all of us to focus and act now to trigger. Chris Adams: Okay. So, when I first met you, you were looking at, say, how do you quantify this and how do you build some of these measurement tools? And I know there was a German project called, is it SoftAware? Which was very much, you know, the German take on software, that does try to figure these things out, to come up with some meaningful numbers. And now the thing it looks like you're spending some time thinking about is, okay, how do you get organizations with enough clout to essentially write in the level of disclosure that's needed for us to actually know if we're making progress or not? Right? Yeah. Max Schulze: Correct. A little side anecdote on SoftAware: the report is also a 200 page piece. It's been finished for a year and it's not published yet, because it's still in review, so it's a bit of a pain. But fundamentally what we concluded, and there are other people who, while we were writing it, already built better tools than we have. And again, research-wise, this topic is, I don't wanna say solved, but all the knowledge is out there and it's totally possible. And that's also what we basically said in the report: if you can attach an environmental product declaration to the digital resource, if I can attach it to the gigabyte of S3 storage that is highly redundant or less redundant, so how many physical resources went into it, how much energy went into it, how much water, then any developer building a software application can basically do that calculation themselves. If I use 400 gigabytes of storage, it's just 400 times what I got the environmental product declaration for. And that information is still not there. But it's not missing because we can't measure it. It's missing because people don't want to, like you said, they don't want to have that in public. Chris Adams: Okay. 
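Max's "just 400 times the declaration" arithmetic, and the per-tab "division and forwarding game" he describes a little further on, can be sketched in a few lines of Python. This is purely illustrative: the resource names and per-unit figures below are invented placeholders, not real environmental product declaration data.

```python
# Hypothetical per-unit environmental product declarations (EPDs).
# All figures are invented for illustration, not real EPD data.
EPD = {
    "storage_gb": {"energy_kwh": 0.002, "water_l": 0.01},  # per GB per month
    "memory_gb":  {"energy_kwh": 0.005, "water_l": 0.02},  # per GB per hour
}

def impact(resource: str, units: float) -> dict[str, float]:
    """Usage times the per-unit declaration: 400 GB of storage is
    just 400 x the per-gigabyte figures."""
    return {metric: value * units for metric, value in EPD[resource].items()}

def share(total: dict[str, float], fraction: float) -> dict[str, float]:
    """The 'forwarding' half of the game: assign a fraction of a
    machine's impact to one consumer, e.g. one browser tab."""
    return {metric: value * fraction for metric, value in total.items()}

# 400 GB of storage, then half of a 32 GB machine's memory impact
# assigned to a single tab.
storage_impact = impact("storage_gb", 400)
tab_memory_impact = share(impact("memory_gb", 32), 0.5)
print(storage_impact, tab_memory_impact)
```

As Max points out, the maths is the easy part; a real version of this depends on vendors actually publishing those per-unit declarations, which is exactly the gap being described here.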
So that's quite an interesting insight you shared there. 'Cause when we first started looking at, I don't know, building digital services, there was a whole thing about saying, well, if my webpage is twice the size, it must have twice the carbon footprint. And there's been a whole debate saying, well, actually no, we shouldn't think about it that way; it doesn't scale that way. And it sounds like you're suggesting, yes, you can go down that route where you directly measure every single thing, but if you want to zoom out to actually achieve some systemic level of change, the thing you might actually need is a kind of lower level, per primitive allocation of environmental footprint. Just say, well, if I know the thing I'm purchasing and building with is, say, gigabytes of storage, maybe I should just think of each gigabyte of storage as having this much impact, and therefore I should just reduce that number, rather than worrying that if I halve my numbers it won't be precisely a halving in emissions, because you're looking at a wider systemic level. Max Schulze: First of all, I never talk about emissions, because that's already a proxy. Again, I think if you take the example of the browser, what you just said, there it becomes very obvious. What you really want is HP, Apple, Dell, any laptop they sell, they say, you know, there's 32 gigs of memory: per gigabyte of memory, this is the environmental impact; per CPU cycle, this is the environmental impact. How easy would it be then to say, well, this browser is using 30% of the CPU, half of the memory, and then again assigning it to each tab. It becomes literally just a division and forwarding game mathematically. But the scarcity, the fact that the vendors don't ultimately release it at that level, makes it incredibly painful for anyone to kind of reverse engineer and work backwards. Exactly. 
You get it for the server, for the whole thing. Yeah. But that server also, which configuration was it? How much memory did it have? And that subdivision needs to happen. But again, that's a feature that I think we need to see in the measurement game. But I would say, again, a slap on the back for all of us and everybody listening: the measurement is good enough. For AI we really see it; I think for the first time it is at a scale where everybody's like, it doesn't really matter if we get it 40 or 60% right. It's pretty bad. Yeah. Right. And instead of now saying, oh, let's immediately move to optimizing the models, let's first create an incentive to get all the model makers, and especially those service providers and the APIs, to just give everybody these reports, so that we have facts. That's really important to make policy, but also to have an incentive to get better. Chris Adams: Okay. So, have a data informed discussion, essentially. Alright, so you need data for a data informed discussion, basically. Max Schulze: Yes. Chris Adams: Alright. Max Schulze: To add to that, because you like analogies and I like analogies: it's about a market that is liquid with information. What I mean by that: if I want to buy a stock of a company, I download their 400 page financial report and it gives me a lot of information about how well that company's doing. Now for software, what is the liquidity of information in the market? For environmental impact, it's zero. The only liquidity we have is features. There are so many videos for every product on how many features there are and how to use them. We can't even get the financial records of most software companies, 'cause they're private. So we have real scarcity of information, and therefore competition in software is all about features, not about environmental impact. 
And I'm trying to create information liquidity in the market so that you and I and anybody buying software can make better choices. Chris Adams: Ah, okay. And this helps me understand why, I guess, you pointed to that French open source example of something equivalent to word processing. There's this French equivalent of Google Docs, which is literally called Docs. And their entire thing is that it looks very similar to the kind of tool you might use for note taking and everything like that. But because it's on an entirely open stack, it is possible to see what's happening inside it and understand, okay, well, this is how the impacts scale based on my usage here, for example. Max Schulze: But now one of our friends, Anna, from Green Coding, would say, yeah, you can just run it through my tool and then you see it. But that's still just research information. We need liquidity on the information of, okay, the Ministry of Foreign Affairs in France is using Docs. It has 4,000 documents and 3,000 active users. Now, that's where I want the environmental impact data, right? I don't want a lab report. I don't wanna scale it in the lab. I want the real usage data. Chris Adams: Okay. So that feels like the next direction we might be moving in: sacrificing some precision for maybe higher frequency information about things in production, essentially. So you can start getting a better idea about, okay, when this is in production or deployed for an entire department, for example, how will the changes I make scale, rather than just making an assumption based on a single system that might not be quite as accurate as the changes I'm seeing in the real world? Max Schulze: And you and I have two different bets on this that go in different directions. 
Your bet is very much on sustainability reporting requirements, both CSRD and even financial disclosures. And my bet is that if purchasers ask for it, then it will become public. And those are complementary, but they're bets on the same exact thing: information liquidity on environmental impact information. Chris Adams: Okay. All right. Well, Max, this has been quite fun, actually. I've gotta ask, just before we wrap up: if people are curious and have found some of the stuff you're talking about interesting, where should people be looking if they'd like to learn more? Is there a website you'd point people to, or should they just look up Max Schulze on LinkedIn, for example? Max Schulze: That's always a good idea. If you want angry white men raging about stuff, that's LinkedIn, so you can follow me there. The SDIA is now focused on really helping regional governments develop digital ecosystems. So if you're interested in that, go there. If you're interested more in the macro policy work, especially around software, we have launched a new brand, our think tank, which is called Leitmotiv. And I'm sure we're gonna include the link somewhere in the notes. Of course. Yeah. Yeah. Very nice. And yeah, I urge you to check that out. We are completely independently funded now. No companies behind us. So a lot of what you read is the brutal truth and not some kind of washed lobbying positions. So maybe you enjoy reading it. Chris Adams: Okay then. All right, so we've got Leitmotiv, we've got the SDIA, and then just Max Schulze on LinkedIn. Those are the three places to look for this stuff. Yeah. Alright, Max, it's been lovely chatting to you in person, and I hope you have a lovely weekend and enjoy some of this sunshine now that we've made it through the Berlin winter. Thanks, Max. Thanks, Chris. Hey everyone. Thanks for listening. 
Just a reminder to follow Environment Variables on Apple Podcasts, Spotify, or wherever you get your podcasts.And please do leave a rating and review if you like what we're doing. It helps other people discover the show. And of course, we'd love to have more listeners. To find out more about the Green Software Foundation, please visit greensoftware.foundation. That's greensoftware.foundation in any browser.Thanks again and see you in the next episode.
Apr 17, 2025 • 1h 1min

OCP, Wooden Datacentres and Cleaning up Datacentre Diesel

Karl Rabe, founder of WoodenDataCenter and co-lead of the Open Compute Project’s Data Center Facilities group, dives into the fascinating world of sustainable data centers. He discusses how colocating data centers with renewable energy sources like wind farms can slash carbon emissions. Rabe explores the innovative use of cross-laminated timber in construction, highlighting its benefits. He also emphasizes replacing traditional diesel generators with cleaner alternatives and the crucial role of modular, open-source hardware in achieving sustainability and transparency.
Apr 10, 2025 • 46min

GreenOps with Greenpixie

Host Chris Adams sits down with James Hall, Head of GreenOps at Greenpixie, to explore the evolving discipline of GreenOps—applying operational practices to reduce the environmental impact of cloud computing. They discuss how Greenpixie helps organizations make informed sustainability decisions using certified carbon data, the challenges of scaling cloud carbon measurement, and why transparency and relevance are just as crucial as accuracy. They also discuss using financial cost as a proxy for carbon, the need for standardization through initiatives like FOCUS, and growing interest in water usage metrics.Learn more about our people:Chris Adams: LinkedIn | GitHub | WebsiteJames Hall: LinkedIn Greenpixie: WebsiteFind out more about the GSF:The Green Software Foundation Website Sign up to the Green Software Foundation NewsletterNews:The intersection of FinOps and cloud sustainability [16:01]What is FOCUS? Understand the FinOps Open Cost and Usage Specification [22:15]April 2024 Summit: Google Cloud Next Recap, Multi-cloud Billing with FOCUS, FinOps X Updates [31:31]Resources:Cloud Carbon Footprint [00:46]Greenops - Wikipedia [02:18]Software Carbon Intensity (SCI) Specification [05:12]GHG Protocol [05:20]Energy Scores for AI Models | Hugging Face [44:30]What is GreenOps - Newsletter | Greenpixie [44:42]Making Cloud Sustainability Actionable with FinOps Fueling Sustainability Goals at Mastercard in Every Stage of FinOps If you enjoyed this episode then please either:Follow, rate, and review on Apple PodcastsFollow and rate on SpotifyWatch our videos on The Green Software Foundation YouTube Channel!Connect with us on Twitter, Github and LinkedIn!TRANSCRIPT BELOW:James Hall: We want to get the carbon data in front of the right people so they can put climate impact as part of the decision making process. Because ultimately, data in and of itself is a catalyst for change. Chris Adams: Hello, and welcome to Environment Variables, brought to you by the Green Software Foundation. 
In each episode, we discuss the latest news and events surrounding green software. On our show, you can expect candid conversations with top experts in their field who have a passion for how to reduce the greenhouse gas emissions of software. I'm your host, Chris Adams. Hello and welcome to Environment Variables, where we explore the developing world of sustainable software development. We kicked off this podcast more than two years ago with a discussion about cloud carbon calculators, the open source tool Cloud Carbon Footprint, and Amazon's cloud carbon calculator. And since then, the term GreenOps has become a term of art in cloud computing circles when we talk about reducing the environmental impact of cloud computing. But what is GreenOps in the first place? With me today is James Hall, the head of GreenOps at Greenpixie, the cloud carbon computing startup, to help me shed some light on what this term actually means and what it's like to use GreenOps in the trenches. James, we have spoken about this episode as a bit of an intro, and I'm wondering if I can ask you a little bit about where this term came from in the first place, and how you ended up as the de facto head of GreenOps in your current gig. Because I've never spoken to a head of GreenOps before, so yeah, maybe I should ask you that.
I think it, yeah, actually originally, if you go to Wikipedia, GreenOps, it's actually to do with arthropods and trilobites from a couple hundred million years ago, funnily enough. I'm not sure when it started becoming, you know, green operations. But, yeah, it originally had a connotation of, like, data centers and IT and devices, and I think cloud GreenOps, where Greenpixie specializes, is more of a recent thing. Because, you know, it used to be about, well, it is about how do you get the right data in front of the right people so they can start making better decisions, ultimately. And that's kind of what GreenOps means to me. So Greenpixie are a GreenOps data company. We're not here to make decisions for you. We are not a consultancy. We want to get the carbon data in front of the right people so they can put climate impact as part of the decision making process. Because ultimately, data in and of itself is a catalyst for change. You know, whether you use this data to reduce carbon or you choose to ignore it, you know, that's up to the organization. But it's all about being more informed, ignoring or, you know, changing your strategy around the carbon data. Chris Adams: Cool. Thank you for that, James. You mentioning Wikipedia and GreenOps being all about trilobites and arthropods makes me realize we definitely should add that to the show notes, and that's the thing I'll quickly just do, because I forgot to do the usual intro, folks. Yeah, my name's Chris Adams. I am the technology and policy director at the Green Web Foundation, and I'm also the chair of the policy working group inside the Green Software Foundation. All the things that James and I'll be talking about, we'll do our best to judiciously add show notes, so you too can look up the origins, well, the etymology, of GreenOps and find out all about arthropods and trilobites and more. And probably a lot more cloud computing as well, actually. Okay. Thank you for that, James. 
So you spoke a little, and you did a really nice job of actually introducing what Greenpixie does, 'cause that was something I should have asked you earlier as well. So I have some experience using these tools, like Cloud Carbon Footprint and so on, to estimate the environmental impact of digital services. Right. And a lot of the time these things use billing data. So there are tools out there that already do this stuff. But one thing that I saw that sets Greenpixie apart from some other tools was the certification process, the fact that you folks have, I think, an ISO 14064 certification. Now, not all of us read over ISO standards for fun, so can you maybe explain why that matters and what that actually changes, or even what that certification means? 'Cause it sounds kind of impressive and exciting, but I'm not quite sure, and I know there are other standards floating around, like the Software Carbon Intensity standard, for example. Like, yeah, maybe you could just provide an intro, then see how that might be different, for example.
I think if a company that produces carbon data has an ISO badge, then you can probably be sure that when you put this data in your ESG reports or use it to make decisions, the auditors will also agree with it. 'Cause the auditors on the other side, you know, your assurers from EY and PwC, they'll be using the same set of guidance, basically. So it's kind of like getting ahead of the auditing process, in the same way a security ISO would help the chief security officer who needs to, you know, check a new vendor that they're about to procure from. If you've got the ISO already, it's, "you know they meet our standards for security, it saves me a job having to go and look through every single data processing agreement that they have." Chris Adams: Gotcha. Okay. So there's a few different ways that you can kind of establish trust. And so one of the options is have everything entirely open, like, say, Cloud Carbon Footprint or OpenCost, which have a bunch of stuff in the open. There's also various other approaches, like we maintain a library called CO2.js where we try to share our methodologies, and then one of the other options is certification. That's another source of trust. I've gotta ask, is this common? Are there other tools that have this? 'Cause when I think about some of the big cloud calculators, do you know if they have this? Let's say I'm using one of the big three cloud providers. Do these have, like today, do you know if they actually have the same certification, or is that a thing I should be looking for, or should be asking about, if I'm relying on the numbers that I'm seeing from our providers like this? James Hall: Yeah, they actually don't. Well, technically, Azure's tool did get one in 2020, but you need to get them renewed and reordered as part of the process. So that one's kind of becoming invalid. 
And I'm not sure AWS or Google Cloud have actually tried, to be honest. But it's quite a funny thought that, you know, it's arguable that, because of this ISO, the data we give you on GCP and AWS is more accurate, or at least more reliable, than the data that comes directly out of the cloud providers. Chris Adams: Okay. Alright. Let's make sure we don't get sued. So I'm just gonna stop there before we go any further. But that's like one of the things that it provides. Essentially it's an external auditor who's looked through this stuff. So rather than being entirely open, that's one of the other mechanisms that you have. Okay, cool. So maybe we can talk a little bit more about open source. 'Cause I actually first found out about Greenpixie a few years ago when the Green Software Foundation sent me to Egypt, for COP 27, to try and talk to people about green software. And I won't lie, I mostly got blank looks from most people. You know, people tend to talk about sustainability of tech or sustainability via tech, and most of the time I see people conflating the two rather than actually realizing, no, we're talking about the sustainability of the technology, not just what it's good for, for example. And I think it was one of your colleagues, Rory, who told me a bit about how Greenpixie, when you first started out, started looking at some tools like Cloud Carbon Footprint as maybe a starting point, but you've ended up having to make various changes to overcome various technical challenges when you scale the use up to larger clients and things like that. Could you maybe talk a little bit about some of the challenges you end up facing when you're trying to implement GreenOps like this? Because it's not something that I have direct experience of myself. 
And it's also a thing where I think a lot of people do reach for some open source tools and they're not quite sure why you might use one over the other, or what kind of problems they have to deal with when you start processing those levels of billing and usage data and stuff like that. James Hall: I think with cloud sustainability methodologies, the two main issues are things like performance and the data volume, and then also the maintenance of it. 'Cause just the very nature of cloud is, you know, huge data sets that change rapidly. You know, they get updated on the hour, and then you've also got the cloud providers always releasing new services, new instance types, things like that. So, I mean, like your average enterprise with like a hundred million spend or something? Yeah. Those line items of usage data, if you go down to the hour, will be billions of rows and terabytes of data. And that is not trivial to process. You know, a lot of the tooling at the moment, including Cloud Carbon Footprint, will try to, you know, use a bunch of SQL queries to truncate it, you know, make it go up to monthly. So you kind of cut down the rows by, you know, a factor of 24 times 30 or whatever that is. It's about 720, something like that. Yeah. So, and they'll remove things like, you know, there are certain fields in the usage data that are so unique that when you start removing those and truncating it, you're really reducing the size of the files, but you are really losing a lot of that granularity. 'Cause ultimately this billing data is to be used by engineers and FinOps people. They use all these fields. 
So when you start removing fields because you can't handle the data, you're losing a lot of the familiarity of the data and a lot of the usability for the people who need to use it to make decisions. So one of the big challenges is, how do you make a processor that can easily handle billions of line items without, you know, falling over? And with CCF, one of the issues was really the performance when you start trying to apply it to big data sets. And then on the other side is the maintenance. You know, arguably it's probably not that difficult to make a methodology at a point in time, but you know, over the six months it takes you to create it, it's way out of date. You know, they've released a hundred new instance types across the three providers, there's a new type of storage, there are brand new services, there's new AI models out there. And so now Greenpixie's main job is, how do we make sure we have more coverage of all the SKUs that come out, and we can deliver the data faster, and customers have more choices of how to ingest it? So if you give customers enough choice, and you give it to them quick enough, and it's, you know, covering all of their services. The lack of those three things is really what's stopping people from doing GreenOps, I think. Chris Adams: Ah, okay, so one of the things you mentioned was just the volume, the fact that you've got, you know, hours multiplied by, like, thousands of different computers. That's a lot of data. And then there's the metrics issue: if you wanna provide a simple metric, then you end up losing a lot of data. So that's one of the things you spoke about. And the other one was just the idea that there's a natural cost associated with having to maintain these models. 
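To make the truncation trade-off James describes concrete, here is a small illustrative sketch in Python. The field names are hypothetical stand-ins, not Greenpixie's or any provider's actual schema; the point is just the roughly 24 x 30 = 720-fold row reduction, and how fields left out of the grouping key disappear.

```python
# Illustrative sketch (hypothetical field names): rolling hourly billing
# rows up to monthly. A month of hourly data is about 24 * 30 = 720 rows
# per line item; truncating to monthly cuts volume by that factor, but any
# field left out of the grouping key (resource IDs, tags) is lost to the
# engineers who rely on it.
from collections import defaultdict
from datetime import datetime, timedelta

# One line item, one month of hourly usage rows.
start = datetime(2025, 1, 1)
hourly_rows = [
    {"usage_start": start + timedelta(hours=h),
     "service": "EC2",
     "resource_id": f"i-{h % 3}",   # granular field engineers use
     "cost_usd": 0.10}
    for h in range(720)
]

# Monthly truncation: group only by (month, service). The resource_id
# field vanishes because it is not part of the grouping key.
monthly = defaultdict(float)
for row in hourly_rows:
    key = (row["usage_start"].strftime("%Y-%m"), row["service"])
    monthly[key] += row["cost_usd"]

print(len(hourly_rows), "rows ->", len(monthly), "row")  # 720 rows -> 1 row
```

Running it shows 720 hourly rows collapsing into a single monthly row, with the resource identifier gone from the output, which is the granularity loss James is pointing at.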
And as far as I'm aware, there aren't, I mean, are there any kind of open sources of models, so that you can say, well, this is what the figures probably would be for an Amazon EC2, you know, 6XL instance, for example? That's the stuff you're talking about when you say the models are hard to keep up to date, and you have to do that internally inside the organization. Is that it? James Hall: Yes, we've got a team dedicated to doing that. But ultimately, there will always be assumptions in there, 'cause some of these chip sets you actually can't even get your hands on. So, you know, say Amazon release a new instance type that uses an Intel Xeon 7850C that is not commercially available. So how do you get your hands on an Intel Xeon 7850B that is commercially available, and you're like, okay, these things are similar in terms of performance and hardware, so we're using this as the proxy for the M5 large or whatever it is? And then once you've got the power consumption of those instance types, then you can start saying, okay, this is how we're mapping instances to real life hardware. And then that's when you've gotta start being really transparent about the assumptions, because ultimately there's no right answer. All you can do is tell people, this is how we do it. Do you like it? And you know, over the four years we've been doing this, you know, there's been a lot of trial and error. Actually, right at the start, one of the questions was, what are my credentials? How did I end up as head of GreenOps? I wouldn't have said four years ago I have any credentials to be, you know, a head of GreenOps. So there was a while when I was the only head of GreenOps in the world, according to Sales Navigator. Why me? But I think it's like, you know, they say if you do 10,000 hours of anything, you kind of become good at it. 
And I wouldn't say I'm a master by any means, but I've made more mistakes and probably tried more things than anybody else over the four years. So, you know, just from the war stories, I've seen what works. I've seen what doesn't work. And I think that's the kind of experience people wanna trust, and why Greenpixie made me the head of GreenOps. Chris Adams: Okay. All right. Thanks for that, James. So maybe this is actually a nice segue to talk about a common starting point that lots of people do actually have. So over the last few years, we've also seen people move from not just talking about DevOps, but talking about FinOps. This idea that you might apply some financial thinking to how you purchase and consume, say, cloud services, for example. And this tends to, as far as I understand, kinda nudge people towards things like serverless or certain kinds of ways of buying it, in a way which is, you know, very much influenced by, I guess, the financial sector. And you said before that there's some overlap, but it's not total overlap; you can't just basically take a bunch of FinOps practices and think it's gonna actually help here. Can we explore that a bit and maybe talk a little bit about what folks get wrong when they try to map this straight across as if it's the same thing? Please. James Hall: Yeah, so one of the big issues is cost proxies, actually. Yeah, a lot of FinOps is, how do you optimize, from a cost perspective, what already exists? You know, you've already emitted it. How do you now make it cheaper?
The first low hanging fruit that a finance guy trying to reduce their cloud spend would go for is things like, you know, buying the instances up front. So you've paid for the full year and now you've been given a million hours of compute. That might cut your bill in half, but if anything it would drive your usage up. You know, you've got a million hours, you are gonna use them. Chris Adams: So you have to commit to then spending it. You're like, "oh, great. I have the cost, but now I definitely need to use these." Right? James Hall: Yeah, exactly. And like, yeah, you say commitments. Like, I promise AWS I'm gonna spend $2 million, so I'm gonna do whatever it takes to spend that $2 million. If I don't spend $2 million, I'll actually have to pay the difference. So if I only do a million in compute, I'm gonna have to pay a million and get nothing for it. So I'm gonna do as much compute as humanly possible to get the most bang for my buck. And I think that's where a lot of the issues are with using costs. Like, if you tell someone something's cheap, they're not gonna use less, they're gonna be like, "this looks like a great deal." I'm guilty of it myself. I'll buy clothes I don't need 'cause they're on a clearance sale. You know? And that's kind of how cloud operates. But when you get a good methodology that really looks at the usage and the nuances between chip sets and storage tiers, you know, there is a big overlap: going from a 2X large to a large may halve your bill, and it will halve your carbon. And those are the kinds of things you need to be looking out for. You need a really nuanced methodology that really looks at the usage more than just trying to use costs.
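The commitment dynamic James outlines can be sketched with made-up numbers. The rates and the $2 million figure below are illustrative, not real AWS pricing:

```python
# Hypothetical numbers illustrating the commitment dynamic described above:
# a spend commitment halves the unit rate, but any shortfall is paid anyway,
# so the rational move is to consume the whole commitment.
commitment_usd = 2_000_000      # promised annual spend
on_demand_rate = 0.10           # $/compute-hour, pay as you go
committed_rate = 0.05           # $/compute-hour under the commitment

def annual_bill(hours_used: float) -> float:
    """Bill under the commitment: usage at the discounted rate,
    topped up to the committed amount if usage falls short."""
    usage_cost = hours_used * committed_rate
    return max(usage_cost, commitment_usd)

# Using only half the committed capacity still costs the full $2M...
assert annual_bill(20_000_000) == 2_000_000   # $1M of usage, billed $2M
# ...so every extra hour up to the commitment is effectively free,
# which pushes usage (and emissions) up rather than down.
assert annual_bill(40_000_000) == 2_000_000   # full commitment consumed
```

The point of the sketch: the bill is flat until the commitment is exhausted, so the financial optimization creates an incentive to consume more compute, not less, which is exactly why cost savings and carbon savings diverge here.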
And you said a little bit like there are some places where it does help, like literally just having the size of the machine is one of the things you might actually do. Now I've gotta ask, you spoke before about like region shifting and stuff, something you mentioned before.Is there any incentive to do anything like that when you are looking at buying stuff in this way? Or is there any kind of, what's the word I'm after, opinion that FinOps or GreenOps has around things like that because as far as I can tell, there isn't, there is very rarely a financial incentive to do anything like that.If anything, it costs, usually costs more to use, maybe say, run something in, say Switzerland for example, compared to running an AWS East, for example. I mean, is that something you've seen, any signs of that where people kind of nudge people towards the greener choice rather than just showing like a green logo on a dashboard for example?James Hall: Well, I mean, this is where GreenOps comes into its own really, because I could tell everyone to move to France or Switzerland, but when you come to each individual cloud environment, they will have policies and approved regions and data sovereignty things, and this is why all you can do is give them the data and then let the enterprise make the decision. But ultimately, like we are working with a retailer who had a failover for storage and compute, but they had it all failing over to one of the really dirty regions, like I think they were based in the UK and they failed over to Germany, but they did have Sweden as one of the options for failover, and they just weren't using it.There's no particular reason they weren't using it, but they had just chosen Germany at one point. So why not just make that failover option Sweden? You know, if it's within the limits of your policies and what you're allowed to do. 
But the region switching isn't completely trivial, unfortunately, in the cloud. So you know, you wouldn't lift and shift your entire environment to another place, because there are performance and cost implications. But again, it's like, how do you add sustainability impact to the trade-off decision? You know, if increasing your cost 10% is worth a 90% carbon reduction for you, great. Please do it, if you know the hours of work are worth it for you. But if cost is the priority, where is the middle ground where you can be like, okay, these two regions are the same, they have the same latency, but this one's 20% less carbon. That is the reason I'm gonna move over there. So it's all about, you can do the cost benefit analysis quite easily, and many people do. But how do you enable them to do a carbon benefit analysis as well? And then once they've got all the data in front of them, just start making more informed decisions. And that's why I think the data is more important than, you know, necessarily telling them what the processes are, giving them the "here's the Ultimate Guide to GreenOps." You know, data's just a catalyst for decisions; you just need to give them trustworthy data. And then how many use cases does trustworthy data have? You know, how long is a piece of string? I've seen many, but every time there's a new customer, there's new use cases. Chris Adams: Okay, cool. Thank you for that. So, one thing that we spoke about in the pre-call was the fact that sustainability is becoming somewhat more mainstream. And now, within the FinOps Foundation, the people who are doing stuff for FinOps are starting to kind of wake up to this and trying to figure out how to incorporate some of this into the way they might operate a team or a cloud or anything like that. 
I believe you told me about a thing called FOCUS, which is something like a standardization project across FinOps, and now there's a sustainability working group, particularly inside this FOCUS group. For people who are not familiar with this, could you tell me what FOCUS is and what this sustainability working group is working on? You know, 'cause working groups are supposed to work on stuff, right? James Hall: Yeah, so exactly as you said, FOCUS is a standardization of billing data. So you know, when you get your AWS bill, your Azure bill, they have similar data in them. But they will have completely different column names, completely different granularities, different column sizes. And so if you're trying to make a master report where you can look at all of your cloud and all of your SaaS bills, you need to do all sorts of data transformations to try and make the columns look the same. You know, maybe AWS has a column that goes one step more granular than Azure, or you're trying to, you know, do a bill on all your compute, but Azure calls it Virtual Machines and AWS calls it EC2. So you either need to go and categorize them all yourself to make a, you know, a master category that lets you group by all these different things, or, you know, thankfully FOCUS have gone and done that themselves. And it started off as, like, a Python script you could run on your own data set to do the transformation for you, but slowly more cloud providers are adopting the FOCUS framework, which means, you know, when you're exporting your billing data, you can ask AWS: give me the original, or give me a FOCUS one. So they start giving you the data in a way where it's like, I can easily combine all my data sets. And the reason this is super interesting for carbon is because, you know, carbon is a currency in many ways, in the fact that, Chris Adams: there's a price on it in Europe. There's a price on it in the UK. 
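As a rough illustration of the kind of column normalization FOCUS does, here is a hedged sketch in Python. The provider column names are simplified stand-ins (real AWS CUR and Azure exports have many more columns), and the mapping table is a toy, not the actual FOCUS converter:

```python
# A minimal sketch of the normalization FOCUS performs: rename each
# provider's billing columns to a shared schema, and map provider-specific
# service names onto a common category so "all my compute" is one group-by.
FOCUS_COLUMN_MAP = {
    "aws":   {"lineItem/UnblendedCost": "BilledCost",
              "product/ProductName":    "ServiceName"},
    "azure": {"costInBillingCurrency":  "BilledCost",
              "MeterCategory":          "ServiceName"},
}

# Providers disagree on labels for the same primitive (EC2 vs Virtual Machines).
SERVICE_CATEGORY = {"EC2": "Compute", "Virtual Machines": "Compute"}

def to_focus(provider: str, row: dict) -> dict:
    """Rename provider-specific billing columns to the shared schema
    and attach a common service category."""
    mapping = FOCUS_COLUMN_MAP[provider]
    mapped = {mapping.get(col, col): val for col, val in row.items()}
    mapped["ServiceCategory"] = SERVICE_CATEGORY.get(mapped.get("ServiceName"), "Other")
    return mapped

aws_row = {"lineItem/UnblendedCost": 12.5, "product/ProductName": "EC2"}
azure_row = {"costInBillingCurrency": 9.0, "MeterCategory": "Virtual Machines"}
print(to_focus("aws", aws_row)["ServiceCategory"],
      to_focus("azure", azure_row)["ServiceCategory"])  # Compute Compute
```

Once both bills land in the same schema, a single report can group compute spend across providers, and, as James goes on to say, the same normalization idea is what makes attaching comparable carbon figures plausible.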
Yeah. James Hall: There's a price on it, but also, like, the way Azure will present you their carbon data could be, you know, the equivalent of yen, and AWS could be the equivalent of dollars. They're all saying CO2e, so you might think they're equivalent, but actually they're almost completely different currencies. So this effort of standardization is, how do we bring it back? Maybe, like, don't give us the CO2e, but how do we go a few steps before that point and, like, how do we start getting similar numbers? So when we wanna make a master report for all the cloud providers, it's apples to apples, not apples to oranges. You know, how do we standardize the data sets to make the cross cloud reporting more meaningful for FinOps people? Chris Adams: Ah, I see. Okay. So I didn't realize that the FOCUS stuff is actually listing, I guess, what, let's call them, primitives, like, you know, compute and storage. Like, they all have different names for that stuff, but FOCUS has a kind of shared idea for what the concept of cloud compute, a virtual machine, might be, and likewise for storage. So that's the thing you're trying to attach a carbon value to in these cases, so you can make some meaningful judgment, or so you can present that information to people. James Hall: Yeah, it's about making the reports the same, but also, how do you make the source of the numbers more similar? 'Cause currently, Azure may say a hundred tons in their dashboard. AWS may say one ton in their dashboard. You know, the spend and the real carbon could be identical, but it's just the formula behind it is so vastly different that you're coming out with two different numbers. Chris Adams: I see. I think I know what you're referring to at this point here. Some places might share a number which is what we refer to as a location based figure. 
So that's, like, what was kind of measured on the ground, based on the power intensity of the grid in a particular part of the world. And then a market based figure might be quite a bit lower, 'cause you said, well, we've purchased all this green energy, so therefore we are gonna kind of deduct that from what the figure should be. And that's how we'd have a figure of like one versus 100. And if you compare these two together, they're gonna look totally different. Like you said, it's not apples with apples. It's something totally different. Okay. That is helpful. James Hall: It gets a lot more confusing than that, 'cause it's not just market and location based. Like, you could have two location based numbers, but Azure are using the grid carbon intensity annual average from 2020, because that's what they've got approved, and AWS may be using, you know, an Our World in Data 2023 number, and those are just two different sources for grid intensity. And then what categories are they including? Are they including Scope 3 categories? How many of the Scope 2 categories are they including? So when you've got like a hundred different inputs that go into a CO2 number, unless all 100 are the same, you do not have a meaningful comparison between the two. Even location/market based is just one aspect of what goes into the CO2 number. And then where do they get the kilowatt hour numbers from? Is it a literal telemetry device? Or are they using a spend based proxy on their side? Because it's not completely alien for cloud providers to ultimately rely on spend at the end of the day. So does Azure use spend, or does AWS use spend? What type of spend are they using? 
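A worked example of why the "one ton versus a hundred tons" gap can appear from the same electricity. All numbers here are illustrative, not real provider figures:

```python
# Illustrative only: both figures start from the same electricity, but the
# market-based one subtracts energy matched by renewable purchases, while
# the location-based one reflects what the local grid actually emitted.
kwh = 1_000_000                  # electricity consumed over the period
grid_intensity = 0.4             # kgCO2e per kWh for the local grid (made up)
renewable_matched = 0.99         # share claimed as covered by green purchases

location_based = kwh * grid_intensity                          # grid reality
market_based = kwh * (1 - renewable_matched) * grid_intensity  # after claims

print(round(location_based / 1000), "t vs", round(market_based / 1000, 1), "t")
# 400 t vs 4.0 t: same electricity, two very different headline numbers
```

This is the comparison trap Chris describes: unless you know which convention (and which grid intensity source, scope coverage, and kWh source) sits behind each dashboard, the two totals are different currencies wearing the same CO2e label.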
And that's where you need the transparency as well, because if you don't understand where the numbers come from, it could be the most accurate number in the world, but if they don't tell you everything that went into it, how are you meant to know? Chris Adams: I see. Okay. That's really interesting. 'Cause the Green Web Foundation, the organization I'm part of, there's a UK government group called the Government Digital Sustainability Alliance. And they've been doing these really fascinating lunch and learns, and one thing that showed up was when the UK government was basically saying, look, this is the carbon footprint, you know, on a kind of per department level. Like, this is what the Ministry of Justice is, or this is what, say, the Ministry of Defense might be, for example. And that helps explain why you had a bunch of people saying the carbon footprint of all these data centers is really high. And then there were people saying, well, compared to this, cloud looks great, 'cause the figures for cloud are way lower. But the thing people had to caveat that with, they basically said, well, we know that this makes cloud look way more efficient here, and it looks like it's much lower carbon, but because we've only got this final kind of market based figure, we know that it's not a like for like comparison; until we have that information, this is the best we actually have. And this is an organization which actually has legally binding targets. They have to reduce emissions by a certain figure, by a certain date. I can see why you would need this transparency, because it seems very difficult to see how you could meaningfully track your progress towards a target if you don't have access to that. Right? James Hall: Yeah. Well, I always like to use the currency conversion analogy. 
If you had a dashboard where AWS is all in dollars and Azure, or your on-premise, is in yen, there are 149 yen in 1 dollar. But if you didn't know this one's yen and this one's dollars, you'd be like, "this one's 149 times cheaper. Why aren't we going all in on this one?" But actually it's just different currencies, and they are the same at the end of the day. Under the hood, they're the same. But, you know, the way they've turned it into an accounting exercise has kind of muddied the water, which is why I love electricity metrics more. You know, they're almost like the non-fungible token of, you know, data centers and cloud, 'cause you can use electricity to calculate location-based, you can use it to calculate market-based, you can use it to calculate water and cooling metrics and things like that. So if you can get the electricity, then you're well on your way to meaningful comparisons. Chris Adams: And that's the one that everyone guards very jealously a lot of the time, right? James Hall: Exactly. Yeah. Well, that's directly related to your cost of running the business, and that is the proprietary information. Chris Adams: I see. Okay. Alright, so we've done a bit of a deep dive into the GHG protocol, scope 3, supply chain emissions and things like that. If I may, you referenced this idea of war stories before. Right. And it's surprisingly hard to find people with real-world stories about making meaningful changes to cloud emissions in the world. Do you have any stories that you've come across in the last four years that you think are particularly worth sharing, or that might catch people's attention, for example? Like there's gotta be something that you found that you are allowed to talk about, right?
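The "one versus 100" gap between location- and market-based figures discussed above can be sketched as a toy calculation. Everything here is hypothetical (the function names, the 400 gCO2e/kWh intensity, the purchase volumes); real grid-intensity factors vary by source, year, and region:

```python
# Toy illustration of why location- and market-based CO2 figures diverge.
# All numbers are made up for the example.

def location_based_kg(kwh: float, grid_intensity_g_per_kwh: float) -> float:
    """Emissions using the average intensity of the local grid."""
    return kwh * grid_intensity_g_per_kwh / 1000

def market_based_kg(kwh: float, grid_intensity_g_per_kwh: float,
                    green_purchased_kwh: float) -> float:
    """Emissions after deducting contractually purchased green energy."""
    residual_kwh = max(kwh - green_purchased_kwh, 0)
    return residual_kwh * grid_intensity_g_per_kwh / 1000

usage_kwh = 10_000
intensity = 400  # gCO2e/kWh, hypothetical annual average

print(location_based_kg(usage_kwh, intensity))       # 4000.0 kg
print(market_based_kg(usage_kwh, intensity, 9_900))  # 40.0 kg -- the "1 vs 100"
```

Both numbers describe the same electricity use; only the accounting differs, which is the currency-conversion problem in miniature.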
But they're very advanced in their GreenOps goals. They have quite ambitious net zero goals and they take their IT sustainability very seriously. When we first spoke to them, ultimately the name of the game was to get the cloud measurement up to the standard of their on-premise, 'cause their on-premise was very advanced: daily electricity metrics with pre-approved CO2 carbon coefficients that you multiply the electricity with. But they were having no luck with cloud, essentially, you know, they spend a lot in the cloud. And honestly, rather than going for just the double wins, which is kind of what most people wanna do, where it's like, I'm gonna use this as a mechanism to save more money, they honestly wanted to do no more harm and actually start making decisions purely for the sustainability benefits. And we kind of went in there with the FinOps team, worked on their FinOps reporting, combined it with their FinOps recommendations and their accountability tool of choice. But then they started having more use cases around how they use our carbon data, not just our electricity data from the cloud, because we have a big list of hourly carbon coefficients. They wanna use that data to start choosing where they put their on-premise data centers as well, really making the sustainability impact a huge factor in where they place their regions, which I think is a very interesting one, 'cause we had only really focused on how we help people in their public cloud. But they wanted to align their on-premise reporting with their cloud reporting and ultimately start even making decisions: okay, I know I need to put a data center in this country. Do I go AWS, Azure, or on-prem for this one? And what is the sustainability impact of all three? And, you know, how do I weigh that against the cost as well?
And it's kind of like the gold standard of making sustainability a big part of the trade-off decision, 'cause they would not go somewhere, even if it saved them 50% of their cost, if it doubled their carbon. They're way beyond that point. So they're a super interesting one. And even in the public sector as well, the departments we are working with are relatively new to FinOps and they didn't really have a proper accountability structure for their cloud bill. But when you start adding carbon data to it, you are getting a lot more eyes onto your bills and your usage. And ultimately we helped them create more of a FinOps function just with the carbon data, 'cause people find carbon data typically more interesting than spend data. But if you put them on the same dashboard, now it's all about how you market efficient usage. And I think that's one of the main use cases of GreenOps: to get more eyes on your usage, 'cause the more ideas you've got piling in, the more use cases you find. Chris Adams: Okay. Alright, so you spoke about carbon as one of the main things that people are caring about, right? And we're starting to develop more of an awareness that maybe some data centers might themselves be exposed to climate risks, because, I know, they were built on a floodplain, for example. And you don't want a data center on a floodplain in the middle of a flood, for example. But there's also the flip side: that's too much water, but there are cases where people worry about not enough water, for example. Is that something that you've seen people talk about more of? Because there does seem to be a growing awareness about the water footprint of digital infrastructure as well now.
Is that something you're seeing people track or even try to manage right now? James Hall: Well, we find that water metrics are very popular in the US, more so than the CO2 metrics, and I think it's because the people there feel the pain of lack of water. You know, you've got the Flint water crisis. In the UK, we've got an energy crisis stopping people from building homes. So what you really wanna do is enable the person who's trying to use this data to drive efficiency to tell as many different stories as possible. You know, the more metrics and the more choice they have of what to present to the engineers and what to present to leadership, the better outcomes they're gonna get. Water is a key one because data centers and electricity production use tons of water. And the last thing you wanna do is go to a water-scarce area and put a load of servers in there that are gonna guzzle up loads of water. One, because if that water runs out, your whole data center's gonna collapse, so you're exposing yourself to ESG risk. And also, you know, it doesn't seem like the right thing to do. There are people trying to live there who need to use that water to live. But you've got data centers sucking that water out. So can't you use this data to, again, drive different decisions? It could invoke an emotional response that helps people drive different decisions or build more efficiently. And if you're saving cost at the end of that as well, then everyone's happy. Chris Adams: So maybe this is one thing we can drill into before we move on to the next question and wrap up. People have had incentives to track cost and cash for obvious reasons. Carbon, as you're seeing, more and more laws actually have opinions about carbon footprint and being able to report it, so people are getting a bit more aware of it. Like we've spoken about things like location-based figures and market-based figures.
And we have previous episodes where we've explored and actually helped people define those terms. But I feel comfortable using relatively technical terminology now because I think there is a growing sophistication, at least in certain pockets, for example. Water still seems to be a really new one, and it seems to be very difficult to find access to meaningful numbers. Even just the idea of water in the first place: when you hear figures about water being used, that might not be the same as water going away so it can't be used. It might be returned in a way that is maybe more difficult to use, or sometimes it's cleaner, sometimes it's dirtier, for example. But it seems to be poorly understood despite being quite an emotional topic. What's your experience been like when people try to engage with this, or when you try to even find some of the numbers to present to people in dashboards and things? James Hall: Yeah. So, surprisingly, all the cloud providers are able to produce factors. I think it's actually a requirement that when you have a data center, you know what the power usage effectiveness is, so what the overhead electricity is, and you know what the water usage effectiveness is. So you know what your cooling system is, how much water it uses, how much it withdraws, and then how much it actually consumes. So the difference between withdrawal and consumption is: withdrawal is you take clean water out and you're able to put clean water back relatively quickly. Consumption is you have either poisoned the water, you know, diluted it with some kind of coolant that's not fit for human consumption, or you've now evaporated it. And there is some confusion sometimes around "it's evaporated, but it'll rain. It'll rain back down."
But, you know, a lake's evaporation and redeposition process is a delicate balance. It might, you know, evaporate 10,000 liters a day and rain 10,000 liters a day after, like, a week of it going into the clouds and coming back down the mountain nearby. If you then have a data center next to it that accelerates the evaporation by 30,000 liters a day, you really upset the delicate balance that's in there. And, you know, you talk about whether these things are sustainable. Financial sustainability is: do you have enough money and income to last a long time, or will your burn rate run out next month? And it's the same with, you know, sustainability. I think fresh water is a limiting resource in the same way a company's bank balance is their limiting resource. There's a limited amount of electricity, there's a limited amount of water out there. I think it was the CEO of Nvidia, I saw a video of him on LinkedIn that said, right now the limit to your cloud environment is how much money you can spend on it. But soon it will be how much electricity there is. You know, you could spend a trillion dollars, but if there's no more room for electricity, no more electricity to be produced, then you can't build any more data centers or solar farms. And then water's the other side of that. I think water's even worse because we need water to even live. And you know what happens when there's no more water because the data centers have it. I think it invokes a much more emotional response. When you have good data that's backed by good sources, you can tell an excellent story of why you need to start reducing. Chris Adams: Okay, well hopefully we can see more of those numbers, because it seems like something that is quite difficult to get access to at the moment, water in particular.
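James's earlier point that electricity is the metric you can derive everything else from applies to water too: given kWh and a Water Usage Effectiveness (WUE) factor, you can estimate withdrawal and consumption separately. A minimal sketch, with invented WUE values (real ones are facility-specific):

```python
# Sketch of deriving water metrics from electricity use.
# WUE factors below are invented for illustration.

def water_litres(kwh: float, wue_l_per_kwh: float) -> float:
    """Water associated with electricity use, in litres."""
    return kwh * wue_l_per_kwh

usage_kwh = 10_000
withdrawal = water_litres(usage_kwh, 1.75)  # litres taken from the source
consumption = water_litres(usage_kwh, 0.5)  # litres evaporated / not returned

# What goes back to the source (possibly degraded, per the
# withdrawal-vs-consumption distinction in the conversation):
returned = withdrawal - consumption
print(withdrawal, consumption, returned)  # 17500.0 5000.0 12500.0
```

Tracking withdrawal and consumption as separate numbers is exactly what keeps "uses 17,500 litres" from being conflated with "removes 17,500 litres from the watershed."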
Alright, so we're coming to time now, and one thing we spoke about in the prep call was the GHG protocol. We did a bit of nerding into this, and you spoke a little bit about: yes, accuracy is good, but you can't only focus on accuracy if you want someone to actually use any of the tools or you want people to adopt stuff. And you said that in the GHG protocol, which is like the gold standard for people working out the, you know, carbon footprint of things, there were these different pillars inside of it that matter. And if you just look at accuracy, that's not gonna be enough. So can you maybe expand on that for people who maybe aren't as familiar with the GHG protocol as you? Because I think there's something there that's worth exploring. James Hall: Yeah. So just as a reminder for those out there, the pillars are accuracy, yes, completeness, consistency, transparency, and relevance. A lot of people worry a lot about the accuracy, but, you know, just to give an example: if you had the most amazingly accurate number for your entire cloud environment, you know, 1,352 tons 0.16 grams, but you are one engineer under one application, running a few resources, the total carbon number is completely useless to you, to be honest. Like, how do you use that number to make a decision for your tiny, you know, maybe five tons? So really you've got to balance all of these things. You know, the transparency is important because you need to build trust in the data. People need to understand where it comes from. The relevance is, you know, again, are you filtering on just the resources that are important to me? And the consistency touches on: AWS is one ton versus Azure is 100 tons. You can't decide which cloud provider to go into based on these numbers because, you know, they're marking their own homework.
They've got a hundred different ways to calculate these things. And then the completeness is around: if you're only doing compute, but 90% is storage, you are missing out on loads of information. You know, you could have super accurate compute numbers for Azure, but if you've got completely different numbers for AWS and you dunno where they come from, you've not got a good GreenOps data set to be able to drive decisions or use as a catalyst. So you really need to prioritize all five of these pillars in equal measure and treat them all as a priority rather than just go for full accuracy. Chris Adams: Brilliant. We'll make sure to share a link to that in the show notes for anyone else who wants to dive into the world of pillars of sustainability reporting, I suppose. Alright. Okay. Well, James, I think that takes us to time. So just before we wrap up, there's gonna be the usual things like where people can find you, but are there any particular projects that are catching your eye right now that you are excited about, or that you'd like to direct people's attention to? 'Cause we'll share a link to the company you work for, obviously, and possibly yourself on LinkedIn or whatever it is. But is there anything else that you've seen in the last couple of weeks that you find particularly exciting in the world of GreenOps or the wider sustainable software field? James Hall: Yeah, I mean, a lot of work being done around AI sustainability is particularly interesting. I recommend people go and look at some of the Hugging Face information around which models are more electrically efficient. And from a Greenpixie side, we've got a newsletter now for people wanting to learn more about GreenOps, and in fact, we're building out a GreenOps training and certification that I'd be very interested to get a lot of people's feedback on. Chris Adams: Cool. Alright, well thank you one more time.
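The comparability problem behind those five pillars can be made concrete: as James notes, even two "location-based" numbers aren't comparable unless every input behind them matches. A small illustrative sketch (the field names are an invented schema, not any provider's real methodology):

```python
# Two CO2 numbers are only meaningfully comparable when the methodology
# behind them matches. The fields below are illustrative only.

from dataclasses import dataclass

@dataclass(frozen=True)
class Co2Methodology:
    accounting: str        # "location" or "market"
    intensity_source: str  # e.g. which grid-intensity dataset and year
    scopes: frozenset      # which scope categories are included
    energy_basis: str      # "telemetry" or "spend"

def comparable(a: Co2Methodology, b: Co2Methodology) -> bool:
    # Every input must match, not just location vs market.
    return a == b

provider_a = Co2Methodology("location", "grid-avg-2020",
                            frozenset({"scope2"}), "spend")
provider_b = Co2Methodology("location", "owid-2023",
                            frozenset({"scope2", "scope3"}), "telemetry")

print(comparable(provider_a, provider_b))  # False: both "location-based", still apples vs yen
print(comparable(provider_a, provider_a))  # True
```

This is why transparency is a pillar alongside accuracy: without the methodology record, you can't even run the comparability check.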
If people wanna find you on LinkedIn, they would just look up James Hall Greenpixie, presumably, right? Or something like that. James Hall: Yeah, and go to our website as well. Chris Adams: Well, James, thank you so much for taking me along on this deep dive into the world of GreenOps, cloud carbon reporting and all the rest. Hope you have a lovely day, and yeah, take care of yourself, mate. Cheers. James Hall: Thanks so much, Chris. Hey everyone, thanks for listening. Just a reminder to follow Environment Variables on Apple Podcasts, Spotify, or wherever you get your podcasts. And please do leave a rating and review if you like what we're doing. It helps other people discover the show, and of course, we'd love to have more listeners. Chris Adams: To find out more about the Green Software Foundation, please visit greensoftware.foundation. That's greensoftware.foundation in any browser. Thanks again, and see you in the next episode.
Apr 3, 2025 • 44min

The Week in Green Software: Data Centers, AI and the Nuclear Question

Christopher Liljenstolpe, Senior Director for Data Center Architecture and Sustainability at Cisco, shares his expertise on the energy demands of AI-driven data centers. He discusses the potential role of nuclear power in sustainable tech and the advantages of small modular reactors. The conversation also touches on the importance of efficient design for AI infrastructure and the unforeseen role of internet infrastructure during the pandemic. Chris highlights how collaboration between hardware and software sectors can drive innovation in green technology.
Mar 27, 2025 • 12min

Backstage: Green Software Patterns

In this episode, Chris Skipper takes us backstage into the Green Software Patterns Project, an open-source initiative designed to help software practitioners reduce emissions by applying vendor-neutral best practices. Guests Franziska Warncke and Liya Mathew, project leads for the initiative, discuss how organizations like AVEVA and MasterCard have successfully integrated these patterns to enhance software sustainability. They also explore the rigorous review process for new patterns, upcoming advancements such as persona-based approaches, and how developers and researchers can contribute. Learn more about our people:Chris Skipper: LinkedIn | WebsiteFranziska Warncke: LinkedInLiya Mathew: LinkedInFind out more about the GSF:The Green Software Foundation Website Sign up to the Green Software Foundation NewsletterResources:Green Software Patterns | GSF [00:23]GitHub - Green Software Patterns | GSF [ 05:42] If you enjoyed this episode then please either:Follow, rate, and review on Apple PodcastsFollow and rate on SpotifyWatch our videos on The Green Software Foundation YouTube Channel!Connect with us on Twitter, Github and LinkedIn!TRANSCRIPT BELOW:Chris Skipper: Welcome to Environment Variables, where we bring you the latest news from the world of sustainable software development. I am the producer of the show, Chris Skipper, and today we're excited to bring you another episode of Backstage, where we uncover the stories, challenges, and innovations driving the future of green software.In this episode, we're diving into the Green Software Patterns Project, an open source initiative designed to curate and share best practices for reducing software emissions.The project provides a structured approach for software practitioners to discover, contribute, and apply vendor-neutral green software patterns that can make a tangible impact on sustainability. 
Joining us today are Franziska Warncke and Liya Mathew, the project leads for the Green Software Patterns Initiative. They'll walk us through how the project works, its role in advancing sustainable software development, and what the future holds for the Green Software Patterns. Before we get started, a quick reminder that everything we discuss in this episode will be linked in the show notes below. So without further ado, let's dive into our first question about the Green Software Patterns project. My first question is for Liya. The project is designed to help software practitioners reduce emissions in their applications. What are some real-world examples of how these patterns have been successfully applied to lower carbon footprints? Liya Mathew: Thanks for the question, and yes, I am pretty sure that there are a lot of organizations as well as individuals who have greatly benefited from this project. A key factor behind the success of this project is the impact that these small actions can have in the long run. For example, AVEVA has been an excellent case of an organization that embraced these patterns. They created their own scoring system based on the patterns, which helps them measure and improve their software sustainability. Similarly, MasterCard has also adopted and used these patterns effectively. What's truly inspiring is that both AVEVA and MasterCard were willing to share their learnings with the GSF and the open source community as well. Their contributions will help others learn and benefit from their experiences, fostering a collaborative environment where everyone can work towards more sustainable software. Chris Skipper: Green software patterns must balance general applicability with technical specificity.
How do you ensure that these patterns remain actionable and practical across different industries, technologies and software architectures? Liya Mathew: One of the core and most useful features of patterns is the ability to correlate with the Software Carbon Intensity specification. Think of it as a bridge that connects learning and measurement. When we look through the existing catalog of patterns, one essential thing that stands out is their adaptability. Many of these patterns not only align with sustainability, but also coincide with security and reliability best practices. The beauty of this approach is that we don't need to completely rewrite our software architecture to make it more sustainable. Small actions like caching static data or providing a dark mode can make a significant difference. These are simple yet effective steps that can lead us a long way towards sustainability. Also, we are nearing the graduation of Patterns V1. This milestone marks a significant achievement, and we are already looking ahead to the next exciting phase: Patterns V2. In Patterns V2, we are focusing on persona-based and behavioral patterns, which will bring even more tailored and impactful solutions to our community. These new patterns will help address specific needs and behaviors, making our tools even more adaptable and effective. Chris Skipper: The review and approval process for new patterns involves multiple stages, including subject matter expert validation and team consensus. Could you walk us through the workflow for submitting and reviewing patterns? Liya Mathew: Sure. The review and approval process for new patterns involves multiple stages, ensuring that each pattern meets a standard before integration. Initially, when a new pattern is submitted, it undergoes an initial review by our initial reviewers.
During this stage, reviewers check if the pattern aligns with the GSF's mission of reducing software emissions, follows the GSF Pattern template, and adheres to proper formatting rules. They also ensure that there is enough detail for the subject matter expert to evaluate the pattern. If any issue arises, the reviewer provides clear and constructive feedback directly in the pull request, and the submitter updates the pattern accordingly. Once the pattern passes the initial review, it is assigned to an appropriate SME for deeper technical review, which should take no more than a week, barring any lengthy feedback cycles. The SME checks for duplicate patterns, validates the content, and assesses the efficiency and accuracy of the pattern in reducing software emissions. They also ensure that the pattern's level of depth is appropriate. If any areas are missing or incomplete, the SME provides feedback in the pull request. If the pattern meets all the criteria, the SME then removes the SME review label, adds a team consensus label, and assigns the pull request back to the initial reviewer. Then the Principles and Patterns Working Group has two weeks to comment or object to the pattern, requiring a team consensus before the PR can be approved and merged into the development branch. This thorough process ensures that each pattern is well vetted and aligned with our goals. Chris Skipper: For listeners who want to start using green software patterns in their projects, what's the best way to get involved, access the catalog, or submit a new pattern? Liya Mathew: All the contributions are made via GitHub pull requests. You can start by submitting a pull request on our repository. Additionally, we would love to connect with everyone interested in contributing. Feel free to reach out to us on LinkedIn or any social media handles and express your interest in joining our project's weekly calls. Also, check if your organization is a member of the Green Software Foundation.
We warmly welcome contributions in any capacity. As mentioned earlier, we are setting our sights on a very ambitious goal for this project, and your involvement would be invaluable. Chris Skipper: Thanks to Liya for those great answers. Next, we had some questions for Franziska. The Green Software Patterns project provides a structured open source database of curated software patterns that help reduce software emissions. Could you give us an overview of how the project started and its core mission? Franziska Warncke: Great question. The Green Software Patterns project emerged from a growing recognition of the environmental impact of software and the urgent need for sustainable software engineering practices. As we've seen the tech industry expand, it became clear that while hardware efficiency has been a focal point for sustainability, software optimization was often overlooked. A group of dedicated professionals began investigating existing documentation, including resources like the AWS Well-Architected Framework, and this exploration laid the groundwork for the project. This allowed us to create a structured approach to curating the patterns that can help reduce software emissions. We developed a template that outlines how each pattern should be presented, ensuring clarity and consistency. Additionally, we categorize these patterns into three main areas: cloud, web, and AI. Chris Skipper: Building an open source knowledge base and ensuring it remains useful requires careful curation and validation. What are some of the biggest challenges your team has faced in developing and maintaining the green software patterns database? Franziska Warncke: Building and maintaining an open source knowledge base like the Green Software Patterns database comes with its own set of challenges. One of the biggest hurdles we've encountered is resource constraints.
As an open source project, we often operate with limited time and personnel, which makes it really, really difficult to prioritize certain tasks over others. Despite this challenge, we are committed to continuous improvement, collaboration, and community engagement to ensure that the Green Software Patterns database remains a valuable resource for developers looking to adopt more sustainable practices. Chris Skipper: Looking ahead, what are some upcoming initiatives for the project? Are there any plans to expand the pattern library or introduce new methodologies for evaluating and implementing patterns? Franziska Warncke: Yes, we have some exciting initiatives on the horizon. So one of our main focuses is to restructure the patterns catalog to adopt a persona-based approach. This means we want to create tailored patterns for various roles within the software industry, like developers, project managers, UX designers, and system architects. By doing this, we aim to make the patterns more relevant and accessible to a broader audience. We are also working on improving the visualization of the patterns. We recognize that user-friendly visuals are crucial for helping people understand and adopt these patterns in their own projects, which was really missing before. In addition to that, we plan to categorize the patterns based on different aspects, such as persona type, adoptability, and effectiveness. This structured approach will help users quickly find the patterns that are most relevant to their roles and their needs, making the entire experience much more streamlined. Moreover, we are actively seeking new contributors to join us. And we believe that the widest set of voices and perspectives will enrich our knowledge base and ensure that our patterns reflect a wide range of experiences. So, if anyone is interested, we'd love to hear from you. Chris Skipper: The Green Software Patterns Project is open source and community-driven.
How can developers, organizations, and researchers contribute to expanding the catalog and improving the quality of the patterns?Franziska Warncke: Yeah, the Green Software Patterns Project is indeed open source and community driven, and we welcome contributions from developers, organizations, and researchers to help expand our catalog and improve the quality of the patterns. We need people to review the existing patterns critically and provide feedback.This includes helping us categorize them for a specific persona, ensuring that each pattern is tailored to each of various roles in the software industry. Additionally, contributors can assist by adding more information and context to the patterns, making them more comprehensive and useful. Visuals are another key area where we need help.Creating clear and engaging visuals that illustrate how to implement these patterns can significantly enhance their usability. Therefore, we are looking for experts who can contribute their skills in design and visualization to make the patterns more accessible. So if you're interested, then we would love to have you on board.Thank you.Chris Skipper: Thanks to Franziska for those wonderful answers. So we've reached the end of the special backstage episode on the Green Software Patterns Project at the GSF. I hope you enjoyed the podcast. To listen to more podcasts about green software, please visit podcast.greensoftware.foundation. And we'll see you on the next episode.Bye for now.​ 
Mar 20, 2025 • 50min

The Week in Green Software: Sustainable AI Progress

For this 100th episode of Environment Variables, guest host Anne Currie is joined by Holly Cummins, senior principal engineer at Red Hat, to discuss the intersection of AI, efficiency, and sustainable software practices. They explore the concept of "Lightswitch Ops"—designing systems that can easily be turned off and on to reduce waste—and the importance of eliminating zombie servers. They cover AI’s growing energy demands, the role of optimization in software sustainability, and Microsoft's new shift in cloud investments. They also touch on AI regulation and the evolving strategies for balancing performance, cost, and environmental impact in tech. Learn more about our people:Chris Adams: LinkedIn | GitHub | WebsiteHolly Cummins: LinkedIn | GitHub | WebsiteFind out more about the GSF:The Green Software Foundation Website Sign up to the Green Software Foundation NewsletterNews:AI Action Summit: Two major AI initiatives launched | Computer Weekly [40:20]Microsoft reportedly cancels US data center leases amid oversupply concerns [44:31]Events:Data-driven grid decarbonization - Webinar | March 19, 2025The First Eco-Label for Sustainable Software - Frankfurt am Main | March 27, 2025 Resources:LightSwitchOps Why Cloud Zombies Are Destroying the Planet and How You Can Stop Them | Holly CumminsSimon Willison’s Weblog [32:56]The GoalIf you enjoyed this episode then please either:Follow, rate, and review on Apple PodcastsFollow and rate on SpotifyWatch our videos on The Green Software Foundation YouTube Channel!Connect with us on Twitter, Github and LinkedIn!TRANSCRIPT BELOW:Holly Cummins: Demand for AI is growing, demand for AI will grow indefinitely. But of course, that's not sustainable. Again, you know, it's not sustainable in terms of financially and so at some point there will be that correction. Chris Adams: Hello, and welcome to Environment Variables, brought to you by the Green Software Foundation. 
In each episode, we discuss the latest news and events surrounding green software. On our show, you can expect candid conversations with top experts in their field who have a passion for how to reduce the greenhouse gas emissions of software. I'm your host, Chris Adams. Anne Currie: So hello and welcome to Environment Variables, where we bring you the latest news and updates from the world of sustainable software. Now, today you're not hearing the dulcet tones of your usual host, Chris Adams. I am a guest host on this, a frequent guest host, Anne Currie. And my guest today is somebody I've known for quite a few years and I'm really looking forward to chatting to: Holly. So do you want to introduce yourself, Holly? Holly Cummins: So I'm Holly Cummins. I work for Red Hat. My day job is that I'm a senior principal engineer and I'm helping to develop Quarkus, which is Java middleware. And I'm looking at the ecosystem of Quarkus, which sounds really sustainability oriented, but actually the day job aspect is I'm more looking at the contributors and, you know, the extensions and that kind of thing. But one of the other things that I do end up looking a lot at is the ecosystem aspect of Quarkus in terms of sustainability, because Quarkus is an extremely efficient Java runtime. And so when I joined the team, one of the things we asked, well, one of the things I asked was: we know this is really efficient. Does that translate into an environmental, you know, benefit? Is it actually benefiting the ecosystem? You know, can we quantify it? And so we did that work and we were able to validate our intuition that it did have a much lower carbon footprint, which was nice. But some of what we did actually surprised us as well, which was also good, because it's always good to be challenged in your assumptions.
And so now part of what I'm doing as well is sort of broadening that focus from measuring what we've done in the past to thinking about, well, what does a sustainable middleware architecture look like? What kind of things do we need to be providing? Anne Currie: Thank you very much indeed. That's a really good overview of what I primarily want to be talking about today. We will be talking about a couple of articles as usual on AI, but really I want to be focused on what you're doing in your day job because I think it's really interesting and incredibly relevant. So, as I said, my name is Anne Currie. I am the CEO of a learning and development company called Strategically Green. We do workshops and training around building green software and changing your systems to align with renewables. But I'm also one of the authors of O'Reilly's new book, Building Green Software, and Holly was probably the biggest single reviewer/contributor to that book, and it was in her best interest to do so because I make tons and tons of reference to a concept that you came up with. I'm very interested in the backstory to this concept, but perhaps you can tell me a little bit more about it. This is something I've not said to you before, but this comes up in review feedback for the book more than any other concept in the book. Lightswitch Ops. People saying, "Oh, we've started to do Lightswitch Ops." If anybody says "I've started to do" anything, it's always Lightswitch Ops. So tell us, what is Lightswitch Ops? Holly Cummins: So Lightswitch Ops, it's really about architecting your systems so that they can tolerate being turned off and on, which sounds, you know, it sounds sort of obvious, but historically that's not how our systems have worked. 
And so the first step is architect your system so that they can tolerate being turned off and on.And then the next part is once you have that, actually turn them off and on. And, it sort of, it came about because I'm working on product development now, and I started my career as a performance engineer, but in between those two, I was a client facing consultant, which was incredibly interesting.And it was, I mean, there was, so many things that were interesting, but one of the things that I sort of kept seeing was, you know, you sort of work with clients and some of them you're like, "Oh wow, you're, you know, you're really at the top of your game" and some you think, "why are you doing this way when this is clearly, you know, counterproductive" or that kind of thing.And one of the things that I was really shocked by was how much waste there was just everywhere. And I would see things like organizations where they would be running a batch job and the batch job would only run at the weekends, but the systems that supported it would be up 24/7. Or sometimes we see the opposite as well, where it's a test system for manual testing and people are only in the office, you know, nine to five only in one geo and the systems are up 24 hours.And the reason for this, again, it's sort of, you know, comes back to that initial thing, it's partly that we just don't think about it and, you know, that we're all a little bit lazy, but it's also that many of us have had quite negative experiences of if you turn your computer off, it will never be the same when it comes back up.I mean, I still have this with my laptop, actually, you know, I'm really reluctant to turn it off. But now we have, with laptops, we do have the model where you can close the lid and it will go to sleep and you know that it's using very little energy, but then when you bring it back up in the morning, it's the same as it was without having to have the energy penalty of keeping it on overnight. 
And I think, when you sort of look at the model of how we treat our lights in our house, nobody has ever sort of left a room and said, "I could turn the light off, but if I turn the light off, will the light ever come back on in the same form again?" Right? Like we just don't do that. We have a great deal of confidence that it's reliable to turn a light off and on and that it's low friction to do it. And so we need to get to that point with our computer systems. And you can sort of roll with the analogy a bit more as well, which is in our houses, it tends to be quite a manual thing of turning the lights off and on. You know, I turn the light on when I need it. In institutional buildings, it's usually not a manual process to turn the lights off and on. Instead, what we end up with is some kind of automation. So, like, often there's a motion sensor. So, you know, I used to have it that if I would stay in our office late at night, at some point if you sat too still because you were coding and deep in thought, the lights around you would go off and then you'd have to, like, wave your arms to make the lights go back on. And it's that, you know, it's this sort of idea of like we can detect the traffic, we can detect the activity, and not waste the energy. And again, we can do exactly this with our computer systems. So we can have it so that it's really easy to turn them off and on. And then we can go one step further and we can automate it and we can say, let's script turning things off at 5pm because we're only in one geo. And you know, if we turn them off at 5pm, then we're enforcing quite a strict work life balance. So... 
Or we can say, okay, well, let's just look at the traffic and if there's no traffic to this, let's turn it off. Anne Currie: Yeah, it is an interestingly simple concept because it's, in some ways, a light bulb moment of, you know, why don't people turn things off? Now, Holly is an unbelievably good public speaker. One of the best public speakers out there at the moment. And we first met because you came and gave talks in some tracks I was hosting, on a variety of topics. Some on high performance code and code efficiency, some on being green. One of the stories you told was about your Lightswitch moment, the realization that actually this was a thing that needed to happen. And I thought it was fascinating. I've been in the tech industry for a long time, so I've worked with Java a lot over the years. And one of the issues with Java in the old days was always that it was very hard to turn things off and turn them back on again. And that was fine in the old world, but you talked about how that was no longer fine. And that was an issue with the cloud, because using the cloud well, turning things on and off, doing things like auto scaling, is utterly key to the idea of the cloud. And therefore it had to become part of Quarkus, part of the future of Java. Am I right in that understanding? Holly Cummins: Yeah, absolutely. And the cloud sort of plays into both parts of the story, actually. So definitely the things that we need to be cloud native, like being able to support turning off and on again, are very well aligned to what you need to support Lightswitch Ops. 
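The automation described here, turning systems off outside office hours or when there is no traffic, comes down to a small policy decision that a scheduler can run. Below is a minimal, hypothetical sketch of that logic; the function name, the nine-to-five window, and the zero-traffic rule are invented for illustration, not taken from any real LightSwitchOps tooling:

```python
from datetime import datetime, time

# Hypothetical LightSwitchOps policy: keep production up always; for
# everything else, stay up only during office hours on weekdays, and
# switch off immediately if there has been no traffic at all.
OFFICE_START = time(9, 0)
OFFICE_END = time(17, 0)

def should_be_running(now: datetime, requests_last_hour: int,
                      production: bool = False) -> bool:
    """Decide whether a system should be up right now."""
    if production:
        return True  # never auto-stop production systems
    if requests_last_hour == 0:
        return False  # nothing in, nothing out: switch it off
    is_office_hours = OFFICE_START <= now.time() <= OFFICE_END
    is_weekday = now.weekday() < 5  # Monday=0 .. Friday=4
    return is_office_hours and is_weekday
```

In practice something like cron or a cloud scheduler would call a function along these lines periodically and then stop or start instances through the provider's API.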
And so the, you know, there with those two, we're pulling in the same direction.The needs of the cloud and the needs of sustainability are both driving us to make systems that, I just saw yesterday, sorry this is a minor digression, but I was looking something up, and we used to talk a lot about the Twelve-Factor App, and you know, at the time we started talking about Twelve-Factor Apps, those characteristics were not at all universal. And then someone came up with the term, the One-Factor App, which was the application that could just tolerate being turned off and on.And sometimes even that was like too much of a stretch. And so there's the state aspect to it, but then there's also the performance aspect of it and the timeliness aspect of it. And that's really what Quarkus has been looking at that if you want to have any kind of auto scaling or any kind of serverless architecture or anything like that, the way Java has historically worked, which is that it eats a lot of memory and it takes a long time to start up, just isn't going to work.And the sort of the thing that's interesting about that is quite often when we talk about optimizing things or becoming more efficient or becoming greener, it's all about the trade offs of like, you know, "oh, I could have the thing I really want, or I could save the world. I guess I should save the world." But sometimes what we can do is we can just find things that we were paying for, that we didn't even want anymore. And that's, I think, what Quarkus was able to do. Because a lot of the reason that Java has a big memory footprint and a lot of the reason that Java is slow to start up is it was designed for a different kind of ops.The cloud didn't exist. CI/CD didn't exist. DevOps didn't exist. And so the way you built your application was you knew you would get a release maybe once a year and deployment was like a really big deal. 
And you know, you'd all go out and you'd have a party after you successfully deployed because it was so challenging. And so you wanted to make sure that everything you did was to avoid having to do a deployment and to avoid having to talk to the ops team because they were scary. But of course, even though we had this model where releases, or the big releases, happened very rarely, the world still moved on, you know, people still had defects, and so what you ended up with was something that was really much more optimized towards patching. So can we take the system and, without actually turning it off and on, because that's almost impossible, can we patch it? So everything was about trying to change the engine of the plane while the plane was flying, which is really clever engineering. If you can support that, you know, well done you. It's so dynamic. And so everything was optimized so that, you know, you could change your dependencies and things would keep working. And, you know, you could even change some fairly important characteristics of your dependencies and everything would sort of adjust and it would ripple back through the system. But because that dynamism was baked into every aspect of the architecture, it meant that everything just had a little bit of drag, and everything had a little bit of slowdown that came from that indirection. And then now you look at it in the cloud and you think, well, wait a minute. I don't need that. I don't need that indirection. I don't need to be able to patch because I have a CI/CD pipeline, and if I'm going into my production systems and SSHing in to change my binaries, something has gone horribly wrong with my process. And you know, I have all sorts of problems. So really what Quarkus was able to do was get rid of a whole bunch of reflection, get rid of a whole bunch of indirection, and do more upfront at build time. 
And then that gives you much leaner behavior at runtime, which is what you want in a cloud environment. Anne Currie: Yeah. And what I love about this and love about the story of Quarkus is, it's aligned with something: non functional requirements. It's an unbelievably boring name for something which is a real pain point for companies. But it's also, in many ways, the most important thing and the most difficult thing that we do. It's like being secure, being cost effective, being resilient. A lot of people say to me, well, you know, actually all you're doing with green is adding another non functional requirement. We know those are terrible. But I can say, no, we need to not make it another non functional requirement. It's just another motivator for doing the first three well, you know. Also scaling is about resilience. It's about cost saving, and it's about being green. And being able to pave rather than patch, I think, was the term. It's more secure, you know. Actually patching is much less secure than repaving, taking everything down and bringing it back up. All the modern thinking about being more secure, being faster, being cheaper, being more resilient is aligned or needs to be aligned with being green, and it can be, and it should be, and it shouldn't just be about doing less. Holly Cummins: Absolutely. And, you know, especially for the security aspect, when you look at something like tree shaking, that gives you more performance by getting rid of the code that you weren't using. Of course, it makes you more secure as well because you get rid of all these code paths and all of these entry points and vulnerabilities that had no benefit to you, but were still a vulnerability. 
Tell us a little bit about that because that not only is cost saving, it's a really big security improvement.So tell us about zombie, the precursor to Lightswitch Ops.Holly Cummins: Yeah, zombie servers are again, one of those things that I sort of, I noticed it when I was working with clients, but I also noticed it a lot in our own development practices that what we would do was we would have a project and we would fire up a server in great excitement and you know, we'd register something on the cloud or whatever.And then we'd get distracted and then, or then we, you know, sometimes we would develop it but fail to go to production. Sometimes we'd get distracted and not even develop it. And I looked and I think some of these costs became more visible and more obvious when we move to the cloud, because it used to be that when you would provision a server, once it was provisioned, you'd gone through all of the pain of provisioning it and it would just sit there and you would keep it in case you needed it.But with the cloud, all of a sudden, keeping it until you needed it had a really measurable cost. And I looked and I realized, you know, I was spending, well, I wasn't personally spending, I was costing my company thousands of pounds a month on these cloud servers that I'd provisioned and forgotten about.And then I looked at how Kubernetes, the sort of the Kubernetes servers were being used and some of the profiles of the Kubernetes servers. And I realized that, again, there's, each company would have many clusters. And I was thinking, are they really using all of those clusters all of the time?And so I started to look into it and then I realized that there had been a lot of research done on it and it was shocking. So again, you know, the sort of the, I have to say I didn't coin the term zombie servers. 
I talk about it a lot, but, there was a company called the Antithesis Institute.And what they did, although actually, see, now I'm struggling with the name of it because I always thought they were called the Antithesis Institute. And I think it's actually a one letter variant of that, which is much less obvious as a word, but much more distinctive. But I've, every time I talked about them, I mistyped it.And now I can't remember which one is the correct one, but in any case, it's something like the Antithesis Institute. And they did these surveys and they found that, it was something like a third of the servers that they looked at were doing no work at all. Or rather no, no useful work. So they're still consuming energy, but there's no work being done.And when they say no useful work as well, that sounds like a kind of low bar. Because when I think about my day job, quite a lot of it is doing work that isn't useful. But they had, you know, it wasn't like these servers were serving cat pictures or that kind of thing. You know, these servers were doing nothing at all.There was no traffic in, there was no traffic out. So you can really, you know, that's just right for automation to say, "well, wait a minute, if nothing's going in and nothing's coming out, we can shut this thing down." And then there was about a further third that had a utilization that was less than 5%.So again, you know, this thing, it's talking to the outside world every now and then, but barely. So again, you know, it's just right for a sort of a consolidation. But the, I mean, the interesting thing about zombies is as soon as you talk about it, usually, you know, someone in the audience, they'll turn a little bit green and they'll go, "Oh, I've just remembered that server that I provisioned."And sometimes, you know, I'm the one giving the talk and I'm like, Oh, while preparing this talk, I just realized I forgot a server, because it's so easy to do. 
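The thresholds quoted above, no traffic at all versus under 5 percent utilization, lend themselves to a trivial automated sweep. Here is a hedged sketch of that classification; the field names, labels, and inventory data are all invented for illustration:

```python
# Hypothetical zombie-hunting pass, along the lines of the survey cited
# above: no traffic at all suggests shutdown, under 5% average
# utilization suggests consolidation, everything else is left alone.
def classify_server(bytes_in: int, bytes_out: int, avg_cpu_pct: float) -> str:
    if bytes_in == 0 and bytes_out == 0:
        return "zombie: shut down"
    if avg_cpu_pct < 5.0:
        return "comatose: consolidate"
    return "active"

# Made-up inventory illustrating the three cases.
inventory = [
    {"name": "batch-runner", "in": 0, "out": 0, "cpu": 1.2},
    {"name": "test-env", "in": 900, "out": 400, "cpu": 3.0},
    {"name": "api-prod", "in": 10**9, "out": 10**9, "cpu": 62.0},
]
report = {s["name"]: classify_server(s["in"], s["out"], s["cpu"]) for s in inventory}
```

The real metrics would come from a monitoring system, but the decision itself really is this simple, which is part of why a zombie sweep is such an easy first step.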
And the way we're measured as well, and the way we measure our own productivity, is that we give a lot more value to creating than to cleaning up. Anne Currie: Yeah. And in some ways that makes sense because, you know, creating is about growth and cleaning up, you know, is about degrowth. It's like you want to tell the story of growth. But I've heard a couple of really interesting tales about zombie servers since you started talking about it; you may not have invented it, but you popularized it. One was from VMware, a cost saving story. It's a story I tell all the time, about when they were setting up a new data center in Singapore. They decided to do a review of all their machines to see what had to go across. And they realized that 66 percent of their machines did not need to be reproduced in the new data center. And that was VMware. People who are really good at running data centers. So imagine what that's like. But moving data centers is a time when it often gets spotted. But I will share a differently disturbing story from a company that wished to remain nameless. Although I don't think they need to, because I think it's just an absolutely bog standard thing. They were doing a kind of thriftathon style thing of reviewing their data center to see if there was stuff that they could save money on, and they found a machine that was running at 95, 100 percent CPU, and they thought, Oh my God, it's been hacked. It's been hacked. Somebody's mining Bitcoin on this. You know, or maybe it's attacking us. Who knows? And so they went and they did some searching around internally, and they found out that it was somebody who turned on a load test, and then forgot to turn it off three years previously. 
And I would say that obviously that came up from the cost, but it also came up from the fact that machine could have been hacked. You know, it could have been mining Bitcoin. It could have been attacking them. It could have been doing anything. They hadn't noticed because it was a machine that no one was looking at. And I thought those were two excellent examples of the cost and the massive security hole that comes from machines that nobody is looking at anymore. So, you know, non functional requirements, they're really important. And Holly Cummins: Yeah. Anne Currie: doing better on them is also green. And also, non functional requirements are really closely tied together. Holly Cummins: Yeah. I mean, oh, I love both of those stories. And I've heard the VMware one before, but I hadn't heard the one about the hundred percent, the load test. That is fantastic. One of the reasons I like talking about zombies, and I think one of the reasons people like hearing about it, I mean, is partly the saving the world. But also I think when we look at greenness and sustainability, some of it is not a very cheerful topic, but the zombie servers, almost always when you discover the cases of them, they are hilarious. I mean, they're awful, but they're hilarious. And you know, it's just this sort of stuff of, "how did this happen? How did we allow this to happen?" Sometimes it's so easy to do better. And the examples of doing bad are just something that we can all relate to. But at the same time, you know, you sort of think, oh, that shouldn't have happened. How did that happen? Anne Currie: But there's another thing I really like about zombie servers, and I think you've pointed this out yourself, and I plagiarized from your ideas like crazy in Building Green Software, which is one of the reasons why I got you to be a reviewer, so you could complain about it if you wanted to early on. 
Holly Cummins: It also means I would agree with you a lot. Yes. Oh, this is very sensible. Very sensible. Yes. Anne Currie: One of the things that constantly comes up when I'm talking to people about this and when we're writing the book and when we're going out to conferences, is people need a way in. And it's often, you know, that people think the way into building green software is to rewrite everything in C, and then they go, "well, I can't do that. So that's the end. That's the only way in. And I'm not going to be able to do it. So I can't do anything at all." Operations and zombie servers is a really good way in, because you can just do it. Instead of having a hackathon, you can just do a thriftathon, get everybody to hunt for things running that don't need to be running, and instantly halve your, you know, it's not uncommon for people to find ways to halve their carbon emissions and halve their hosting costs simultaneously in quite a short period of time, and it'd be the first thing they do. So I quite like it because it's the first thing they do. What do you think about that? Is it the low hanging fruit? Holly Cummins: Yeah, absolutely, I think, yeah, it's the low hanging fruit, it's easy, it's kind of entertaining because when you find the problems you can laugh at yourself, and there's, again, there's no downside and several upsides, you know, so it's this double win of I got rid of something I wasn't even using, I have more space in my closet, and I don't have to pay for it. Anne Currie: Yeah, I just read a book that I really should have read years and years ago, and I don't know why I didn't, because people have been telling me to read it for years, which was The Goal. It's not about tech, but it is about tech. It's kind of the book that was the precursor to The Phoenix Project, which I think a lot of people read. And it's all about TPS, the Toyota Production System. 
It's a kind of Americanized version of it, of how the Toyota Production System should be brought to America. And it was written in the 80s and it's all about work in progress and cleaning your environment and getting rid of stuff that gets in your way and just obscures everything, so you can't see what's going on. Effectively, it was a precursor to lean, which I think is really very well aligned. Green and lean, really well aligned. And it's something that we don't think about, that cleaning up waste just makes your life much better in ways that are hard to imagine until you've done it. And cleaning zombie servers up just makes your systems more secure, cheaper, more resilient, more everything. It's a really good thing to do. Holly Cummins: Yeah. And there's sort of another way that those align as well, which I think is interesting because I think it's not necessarily intuitive. Which is, sometimes when we talk about zombie servers and server waste, people's first response is, this is terrible. The way I'm going to solve it is I'm going to put barriers in place so that getting a server is harder. And that seems really intuitive, right? Because it's like, Oh yes, we need to solve it. But of course, it has the exact opposite effect. And again it seems so counterintuitive, because it seems like if you have a choice between shutting the barn door before the horses left and shutting the barn door after the horses left, you should shut the barn door before the horses left. But what happens is that if those barriers are in place, once people have a server, if they had to sweat blood to get that server, they are never giving it up. It doesn't matter how many thriftathons you do, they are going to cling to that server because it was so painful to get. So what you need to do is you need to just create these really sort of low friction systems where it's easy come, easy go. So it's really easy to get the hardware you need. 
And so you're really willing to give it up, and that kind of self service model, that kind of low friction, high automation model, is really well aligned again with lean. It's really well aligned with DevOps. It's really well aligned with cloud native. And so it has a whole bunch of benefits for us as users as well. If it's easier for me to get a server, that means I'm more likely to surrender it, but it also means I didn't have to suffer to get it, which is just a win for me personally. Anne Currie: It is. And there's something in the little bit at the end of The Goal which I thought was, my goodness, the most amazing, a bit of a lightswitch moment for me. This is still from about 10 years ago, but it's talking about ideas that basically underpin the cloud, underpin modern computing, underpin factories and also warehouses. And because I worked for a long time in companies that had warehouses, you kind of see that there are enormous analogies. And it was talking about how a lot of the good modern practice in this has been known since the 50s. And even in places like Japan, where it's really well known, I mean, the Toyota Production System is so well managed, almost everybody knows it, and every company in Japan wants to be operating in that way. Still, the penetration of companies that actually achieve it is very low, it's only like 20%. I thought, it's interesting, why is that? And then I realised that you'd been kind of hinting why it was throughout. And if you look on the Toyota website, they're quite clear about it. They say the Toyota Production System is all about trial and error. You can't read a book that tells you what we did and then say, "oh well if I do that, then I will achieve the result." They say it's all about a culture of trial and error. 
And then you build something which will be influenced by what we do, and influenced by what other people do, and influenced by a lot of these ideas. But fundamentally, it has to be unique to you because anything complicated is context-specific. Therefore, you are going to have to learn from it. But one of the key things for trial and error is not making it so hard to try something, and so painful if you make an error, that you never do any trial and error. And I think that's very aligned with what you were saying about if you make it too hard, then nobody does any trial and error. Holly Cummins: Yeah. Absolutely. Anne Currie: I wrote a new version of it, called The Cloud Native Attitude, which was all about, you know, what are people doing? You know, what's the UK enterprise version of the TPS system, and what are the fundamentals and what are people actually doing? And what I realized was that everybody was doing things that were quite different, that were specific to them, that used some of the same building blocks and were quite often in the cloud because that reduced their bottlenecks over getting hardware. Because that's always a common bottleneck for everybody. So they wanted to reduce the bottleneck there of getting access to hardware. But what they were actually doing was built, trial and error wise, depending on their own specific context. And every company is different and has a different context. And, yeah, that is why failure can't be a four letter word. Holly Cummins: Yeah. Technically, it's a seven letter word if you say failure, but... Anne Currie: And it should be treated that way. Yeah.  
I'm very aware that actually our brief for this was to talk about three articles on AI. Holly Cummins: I have to say, I did have a bit of a panic when I was reviewing the articles because they were very deep into the sort of the intricacies of, you know, AI policy and AI governance, which is not my specialty area. Anne Currie: No, neither is it mine. But when I was reading it, I thought quite a lot about what we've just talked about. It is a new area. As far as AI is concerned, I love AI. I have no problem with AI. I think it's fantastic. It's amazing what it can produce. And if you are not playing around on the free version of ChatGPT, then you are not keeping on top of things, because it changes all the time. And it's very like managing somebody. You get out of it what you put in. If you ask it a couple of cursory questions, you'll get a couple of cursory answers. If you, you know, leaning back on Toyota again, you almost need to five-whys it. You need to go, no, but why? Go a little bit deeper. Now go a little bit deeper. Now go a little bit deeper. And then you'll notice that the answers get better and better, like a person, better and better. So it really is worth playing around with it. Holly Cummins: Just on that, I was just reading an article from Simon Willison this morning and he was talking about, sort of, you know, a similar idea that you have to put a lot into it. He was talking about it for coding assistants, that, you know, to get good outputs, it's not trivial. And a lot of people will sort of try it and then be disappointed by their first result and go, "Oh, well, it's terrible" and dismiss it. But he was saying that one of the mistakes that people make is to anthropomorphize it. 
And so when they see it making mistakes that a human would never make, they go, "well, this is terrible" and they don't think about it in terms of, well, this has some weaknesses and this has some strengths, and they're not the same weaknesses and strengths as a person would have. And so you can't just see this one thing that a human would never do and then dismiss it. You need to sort of adapt how you use it for its strengths and weaknesses, which I thought was really interesting. You know, it's so tempting to anthropomorphize it because it is so human-ish in its outputs, because it's trained on human inputs, but it does not have the same strengths and weaknesses as a person. Anne Currie: Well, I would say the thing is, it can be used in lots of different ways. There are ways you can use it where, actually, it can react like a person, and therefore does need to be challenged. I mean, if you ask it to do creative things, it's quite human like. And it will come up with things, and it will blag, and, you know, you just have to treat it that way for certain creative things. You have to go, "is that true? Can you double check that? I appreciate your enthusiasm there, but it might not be right. Can you just double check that?" In the same way that you would with a very enthusiastic graduate. And you wouldn't have fired them because they said something that seemed plausible, well, unless you'd said, do not tell me anything that seems plausible, and then you don't double check. Because to a certain extent, they're always enthused. And that's where ideas come from. Stretching things, saying, well, you know, I don't know if this is happening, but this could happen. You have to be a little bit out there to generate new ideas and have new thoughts. 
I heard a very interesting podcast yesterday where one of the Reeds, I can never remember if it was Reed Hastings or Reid Hoffman, was talking about AI energy use. And he was saying, we're not stupid, you know; basically, there are two things that we know are coming. One is AI and one is climate change. We're not going to try and create an AI industry that requires the fossil fuel industry, because that would be crazy talk. You know, we do all need to remember that climate change is coming, and if you are building an AI system that relies on fossil fuels, then you are an idiot, because the big players are not. You know, I love looking at Our World in Data and looking at what is growing in the world. And a chart that's really interesting to look at, if you ever feel depressed about climate change, is the global growth in solar generated power. It's going up like it's not even exponential. It's, you know, it looks vertically asymptotic. You know, it's super exponential. It's going faster than exponential; nothing else is developing that way. Except maybe AI, but AI from a lower point. And then you've got things with AI, you've got stuff like DeepSeek, that's coming out of left field and saying, "do you know? You just didn't need to write this so inefficiently. You could, you know, you could do this on a lot less, and it'd be a lot cheaper, and you could do things on the edge that you didn't know that you could do." So, yeah, I'm not too worried about AI. I think that DeepSeek surprised me. Holly Cummins: Yeah, I agree. I think we have been seeing this, you know, sort of enormous rise in energy consumption, but that's not sustainable, and it's not sustainable in terms of climate, but it's also not sustainable financially. 
And so financial corrections tend to come before the climate corrections. And so what we're seeing now is architectures that are designed to reduce the energy costs, because they need to reduce the actual financial costs. So we get things like DeepSeek, where there's a sort of fundamental efficiency in the architecture of the model. But then we're also seeing other things as well. Like, up until maybe a year ago, the way it worked was that the bigger the model, the better the results, just absolutely. And now we're starting to see cases where the model gets bigger and the results get worse. And you see this with RAG systems as well, where when you do your RAG experiment and you feed in just two pages of data, it works fantastically well, and then you go, "okay, I'm going to proceed." And then you feed in like 2,000 pages of data, and your RAG suddenly isn't really working, and it's not really giving you correct responses anymore. And so I think we're seeing an architectural shift away from the really big monolithic models to more orchestrated models. Which is kind of bad in a way, right? Because it means we as engineers have to do more work. We can't just have one big monolith and say, "solve everything." But on the other hand, what do engineers love? We love engineering. So it means that there's opportunities for us. So a pattern that we're seeing a lot now is that you have your sort of orchestrator model that takes the query in and triages it. And it says, "is this something that should go out to the web? Because, actually, that's the best place for this news topic. Or is this something that should go to my RAG model? Is this something..." And so it'll choose the right model. Those models are smaller, and so they have a much more limited scope. But, within that scope, they can give you much higher quality answers than the huge supermodel, and they cost much less to run.
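The orchestration pattern Holly describes can be sketched roughly like this. This is a hypothetical toy router, not any specific product's API: the category names and keyword rules are invented for illustration, and a real system would use a small classifier model for the triage step rather than keywords.

```python
# Hypothetical triage step for an orchestrator: pick the smallest
# specialist that can handle the query, instead of one huge model.
def triage(query: str) -> str:
    q = query.lower()
    if any(word in q for word in ("today", "latest", "news")):
        return "web_search"        # fresh information: go out to the web
    if any(word in q for word in ("our policy", "internal", "handbook")):
        return "rag_internal_docs" # organization-local RAG over private docs
    return "small_general_model"   # default: cheapest capable model

print(triage("What is the latest news on EU AI regulation?"))       # web_search
print(triage("What does our internal handbook say about travel?"))  # rag_internal_docs
print(triage("Explain what a data center rack is"))                 # small_general_model
```

The design choice is the one Holly outlines: each downstream model has a narrower scope, so it can be smaller, cheaper to run, and more accurate within that scope.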
So you end up with a system, again, it's about the double win, where you have a system which maybe took a little bit more work to architect, but gives you better answers for a lower cost.
Anne Currie: That is really interesting, and more aligned as well with how power is being developed, potentially. You know, you really want to be doing more stuff at the edge, and you want people to be doing stuff at home on their own devices, rather than just always having to go to, as you say, supermodels. Supermodels are bad. We all disapprove of supermodels.
Holly Cummins: Yeah. And that aligns with some of the privacy concerns as well, which is, you know, people want to be doing it at home, and certainly organizations want to be keeping their data in house. And so that means that they need the more organization-local model to be keeping their dirty secrets in house.
Anne Currie: Well, it is true. I mean, it is very hard to keep things secure, and sometimes you just do want to keep some of your data in house. You don't necessarily even want to stick it on Amazon if you can avoid it. But yes, that's been a really interesting discussion, and we have completely gone off topic, and we've hardly talked at all about the AI regulation. I think we both agree that with AI regulation, it's quite soon to be doing it. It's interesting. I can see why the Americans have a tendency to take a completely different approach to the EU. If you look at their laws, and I did do some lecturing in AI ethics and legalities, American laws do tend to be like, well, if something goes wrong, you know, get your pantsuit on and fix it. EU laws tend to be about, don't even do it. You know, as you said before, close the stable door before the horse has bolted.
And the American law is about bringing it back. But in some ways, that exemplifies why America grows much faster than Europe does.
Holly Cummins: Yeah. When I was looking at some of the announcements that did come out of the AI summit, I have really mixed feelings, because I generally feel that regulation is good, but I also agree with you that it can have a stifling effect on growth. But one thing that I think is fairly clearly positive, and that did seem to be emphasized in the announcements as well, is the open source aspect. So, I mean, we have sort of open source models now, but they're not as open source as open source software, in terms of how reproducible they are, how accessible they are for people to see the innards of. But I was thinking a little bit, again, about the way the AI summit is making these sorts of bodies that have the public-private partnerships, which isn't anything new, but we're sort of seeing quite a few governments coming together. So the current AI announcement, I think, had nine governments and dozens of companies. And it reminded me a little bit of the birth of radio, when we had this resource, which was the airwaves, the frequencies, that nobody had cared about. And then all of a sudden it was quite valuable, and there was potentially the sort of wild west of, okay, who can take this and exploit it commercially? And then government stepped in and said, "actually, no, this is a resource that belongs to all of us, and so it needs to be managed," deciding who has access to it and who can just grab it. And I feel a bit like, even though in a technical sense the data all around us isn't all of ours (a lot of it is copyrighted and that kind of thing),
if you look at the aggregate of all of the data that humanity has produced, that is a collective asset. And so it should be that how it gets used is for a collective benefit, and that regulation, making sure that it's not just one or two organizations that have the technical potential to leverage that data, is a collectively good thing.
Anne Currie: Especially at the moment. We don't want everything to be happening in the US, because maybe the US is not the friendly partner that we always thought it would be. It's, diversity...
Holly Cummins: Diversity is good. Diversity of geographic interests.
Anne Currie: Indeed. Yeah, it is. But it is early days. I'm not an anti-AI person by any stretch. In fact, I love AI. I think it really is an amazing thing. And we just need to align it with the interests of the rest of humanity.
Holly Cummins: Yes.
Anne Currie: But it is interesting. They're saying that in terms of being green, the big players are not idiots. They know that things need to be aligned. But in terms of data, they certainly will be acting in their best interests. So, yeah, very interesting. We are now coming to time. We've done quite a lot. There won't be much to edit out from what we've talked about today. I think it's great, it's very good. But,
Holly Cummins: Shall we talk about the Microsoft article, though?
'Cause that, I thought that was really interesting.
Anne Currie: Oh yeah, go for it. Yes.
Holly Cummins: Yeah, so one of the other articles that we have said that Microsoft was reducing its investment in data centers, which I was quite shocked to read, because it's the exact opposite of all of the news articles that we normally see, including one I saw this morning that said that the big three are looking at increasing their investment in nuclear. But I thought it was sort of interesting, because I think we always tend to extrapolate from the current state and extrapolate it indefinitely forward. So we say demand for AI is growing, demand for AI will grow indefinitely. But of course, that's not sustainable. Again, it's not sustainable financially, and so at some point there will be that correction. And it seems like Microsoft has perhaps looked at how much they've invested in data centers and said, "oh, perhaps this was a little bit much, perhaps let's roll back that investment just a little bit, because now we have an overcapacity on data centers."
Anne Currie: Well, I wonder how much of an effect DeepSeek had on that. The thing is, with Azure, well, I say this is a public story, so I can tell it, because I have it in the book: the story of, during the pandemic, the Microsoft Teams folks looking at what they were doing and saying, "could this be more efficient?" And the answer was yes, because they'd put really no effort in whatsoever to make what they were doing efficient. Really basic efficiency stuff they hadn't done. And so there was tons of waste in that system. And the thing is, when you gallop ahead to do things, you do end up with a lot of waste. DeepSeek was a great example of, you know, this AI thing, we can do it on much cheaper chips and far fewer machines. And you don't have to do it that way.
So I'm hoping that this means that Microsoft have decided to start investing in efficiency. It's a shame, because they used to have an amazing team who were fantastic at this kind of stuff. So, Holly spoke at a conference I did last year about code efficiency, Quarkus being a really good example of a more efficient platform for running Java on. The first person I had on at that conference used to work for Azure. And he was probably the world's expert in actual practical code efficiency. He got made redundant. Yeah. Because Microsoft at the time were not interested in efficiency. So "who cares? Pfft, go on, out." But he's now working at NVIDIA doing all the efficiency stuff there, because some people are paying attention. Well, I think the lesson there is that maybe Microsoft were not paying that much attention to efficiency, to the idea that actually you don't need 10 data centers. It can be a very difficult change to make things really efficient, but quite often there's a lot of low hanging fruit in efficiency.
Holly Cummins: Absolutely. And you need to remember to do it as well. Because I think it probably is a reasonable and correct flow to say, innovate first, optimize second. So you don't have to be looking at that efficiency as you're innovating, because that stifles the innovation, and you might be optimizing something that never becomes anything. But you have to then remember, once you've got it out there, to go back and say, "oh, look at all of this low hanging fruit. Look how much waste there is here. Let's sort it out now that we've proven it's a success."
Anne Currie: Yeah. Yes. It's like, "don't prematurely optimize" does not mean "never optimize."
Holly Cummins: Yes. Yes.
Anne Currie: So my strong suspicion is that Microsoft are kind of waking up to that a little bit.
The thing is, if you have limitless money and you just throw a whole load of money at things, then it is hard to go and optimize. As you say, it's a bit like that whole thing of going in and turning off those zombie machines. You have to go and do it. You have to choose to do it. If you have limitless money, you never do it, because it's a bit boring. It's not as exciting as a new thing. So yeah, limitless money has its downsides as well as its ups.
Holly Cummins: Yes. Who knew?
Anne Currie: Yeah. So I think we are at the end of our time. Is there anything else you want to say before we go? It was an excellent hour.
Holly Cummins: Nope. This has been absolutely fantastic chatting to you, Anne.
Anne Currie: Excellent. It's been very good talking to you, as always. And so my final thing is, if anybody who's listening to this podcast has not read Building Green Software from O'Reilly, you absolutely should, because a lot of what we just talked about was covered in the book. Reviewed by Holly.
Holly Cummins: I can recommend the book.
Anne Currie: I think your name is somewhere on the book cover, some nice thing you said about it. So thank you very much indeed. And just a reminder to everybody: everything we've talked about, all the links, are in the show notes at the bottom of the episode. And I will see you again soon on the Environment Variables podcast. Goodbye.
Chris Adams: Hey everyone, thanks for listening. Just a reminder to follow Environment Variables on Apple Podcasts, Spotify, or wherever you get your podcasts. And please do leave a rating and review if you like what we're doing. It helps other people discover the show, and of course, we'd love to have more listeners. To find out more about the Green Software Foundation, please visit greensoftware.foundation. That's greensoftware.foundation in any browser. Thanks again, and see you in the next episode.
Mar 6, 2025 • 57min

AI Energy Measurement for Beginners

Host Chris Adams is joined by Charles Tripp and Dawn Nafus to explore the complexities of measuring AI's environmental impact from a novice’s starting point. They discuss their research paper, A Beginner's Guide to Power and Energy Measurement and Estimation for Computing and Machine Learning, breaking down key insights on how energy efficiency in AI systems is often misunderstood. They discuss practical strategies for optimizing energy use, the challenges of accurate measurement, and the broader implications of AI’s energy demands. They also highlight initiatives like Hugging Face’s Energy Score Alliance, discuss how transparency and better metrics can drive more sustainable AI development and how they both have a commonality with eagle(s)! Learn more about our people:Chris Adams: LinkedIn | GitHub | WebsiteDawn Nafus: LinkedInCharles Tripp: LinkedInFind out more about the GSF:The Green Software Foundation Website Sign up to the Green Software Foundation NewsletterNews:The paper discussed: A Beginner's Guide to Power and Energy Measurement and Estimation for Computing and Machine Learning [01:21] Measuring the Energy Consumption and Efficiency of Deep Neural Networks: An Empirical Analysis and Design Recommendations [13:26]From Efficiency Gains to Rebound Effects: The Problem of Jevons' Paradox in AI's Polarized Environmental Debate | Luccioni et al [45:46]Will new models like DeepSeek reduce the direct environmental footprint of AI? | Chris Adams [46:06]Frugal AI Challenge [49:02] Within Bounds: Limiting AI's environmental impact [50:26]Events:NREL Partner Forum Agenda | 12-13 May 2025Resources:Report: Thinking about using AI? 
- Green Web Foundation | Green Web Foundation [04:06]Responsible AI | Intel [05:18] AIEnergyScore (AI Energy Score) | Hugging Face [46:39]AI Energy Score [46:57]AI Energy Score - Submission Portal - a Hugging Face Space by AIEnergyScore [48:23]AI Energy Score - GitHub [48:43] Digitalisation and the Rebound Effect - by Vlad Coroama (ICT4S School 2021) [51:11]The BUTTER Zone: An Empirical Study of Training Dynamics in Fully Connected Neural NetworksBUTTER-E - Energy Consumption Data for the BUTTER Empirical Deep Learning Dataset [51:44]OEDI: BUTTER - Empirical Deep Learning Dataset [51:49]GitHub - NREL/BUTTER-Better-Understanding-of-Training-Topologies-through-Empirical-ResultsBayesian State-Space Modeling Framework for Understanding and Predicting Golden Eagle Movements Using Telemetry Data (Conference) | OSTI.GOV [52:26]Stochastic agent-based model for predicting turbine-scale raptor movements during updraft-subsidized directional flights - ScienceDirect [52:46]Stochastic Soaring Raptor Simulator [53:58]NREL HPC Eagle Jobs Data [55:02]Hype, Sustainability, and the Price of the Bigger-is-Better Paradigm in AI AIAAIC | The independent, open, public interest resource detailing incidents and controversies driven by and relating to AI, algorithms and automationIf you enjoyed this episode then please either:Follow, rate, and review on Apple PodcastsFollow and rate on SpotifyWatch our videos on The Green Software Foundation YouTube Channel!Connect with us on Twitter, Github and LinkedIn!TRANSCRIPT BELOW:Charles Tripp: But now it's starting to be like, well, we can't build that data center because we can't get the energy to it that we need to do the things we want to do with it. We haven't taken that incremental cost into account over time; we just kind of ignored it. And now we hit, like, the barrier, right? Chris Adams: Hello, and welcome to Environment Variables, brought to you by the Green Software Foundation. 
In each episode, we discuss the latest news and events surrounding green software. On our show, you can expect candid conversations with top experts in their field who have a passion for how to reduce the greenhouse gas emissions of software. I'm your host, Chris Adams.
Welcome to Environment Variables, where we bring you the latest news and updates from the world of sustainable software development. I'm your host, Chris Adams. If you follow a strict media diet, switch off the Wi-Fi in your house, and throw your phone into the ocean, you might be able to avoid the constant stream of stories about AI in the tech industry. For the rest of us, though, it's basically unavoidable. So having an understanding of the environmental impact of AI is increasingly important if you want to be a responsible practitioner navigating the world of AI, generative AI, machine learning models, DeepSeek, and the rest. Earlier this year, I had a paper shared with me with the intriguing title A Beginner's Guide to Power and Energy Measurement and Estimation for Computing and Machine Learning, and it turned out to be one of the most useful resources I've since come across for making sense of the environmental footprint of AI. So I was over the moon when I found out that two of the authors were both willing and able to come on to discuss this subject today. Joining me today are Dawn Nafus and Charles Tripp, who worked on the paper and did all this research. And instead of me introducing them, well, they're right here, so I might as well let them do the honors themselves. I'm just going to work in alphabetical order. Charles, I think you're slightly ahead of Dawn, so can I give you the room to introduce yourself?
Charles Tripp: Sure. I'm a machine learning researcher and Stanford algorithms researcher, and I've been programming pretty much my whole life, since I was a little kid, and I love computers.
I researched machine learning, and reinforcement learning in particular, at Stanford, started my own company, but kind of got burnt out on it. And then I went to the National Renewable Energy Lab, where I applied machine learning techniques to energy efficiency and renewable energy problems. And while I was there, I started to realize that computing energy efficiency was an increasingly important area of study on its own. So I had the opportunity to lead an effort there to create a program of research around that topic. And it was through that work that I started working on this paper and made these connections with Dawn. I worked there for six years and just recently changed jobs to be a machine learning engineer at Zazzle. I'm continuing to do this research. And, yeah.
Chris Adams: Brilliant. Thank you, Charles. Okay, so that's NREL, as some people refer to it.
Charles Tripp: That's right. It's one of the national labs.
Chris Adams: Okay. Brilliant. And Dawn, I guess I should give you the space to introduce yourself. And welcome back again, actually.
Dawn Nafus: Thank you. Great to be here. My name is Dawn Nafus. I'm a principal engineer now in Intel Labs. I also run the Socio-Technical Systems Lab. And I sit on Intel's Responsible AI Advisory Council, where we look after what kinds of machine learning tools and products we want to put out the door.
Chris Adams: Brilliant, thank you, Dawn. And if you're new to this podcast, I mentioned my name was Chris Adams at the beginning. I work at the Green Web Foundation; I'm the director of technology and policy there. I'm one of the authors of a report all about the environmental impact of AI last year, so I have some background on this. I also work as the policy chair in the Green Software Foundation Policy Working Group. So that's another thing that I do.
And we'll do our best to make sure that we link to every single paper and project on this, so if there are any particular things you find interesting, please do look for the show notes. Okay, Dawn, shall we start? I think you're both sitting comfortably, right? Shall I begin? Okay, good. So, Dawn, I'm really glad you had a chance to both work on this paper and let me know about it in the first place. And I can tell, when I read through it, there was quite an effort to do all the research for this. So can I ask, what was the motivation for doing this in the first place? And are there any particular people you feel really should read it?
Dawn Nafus: Yeah, absolutely. We primarily wrote this for ourselves, in a way, and I'll explain what I mean by that. So, oddly, it actually started life in my role in Responsible AI, where I had recently advocated that Intel should adopt a Protect the Environment principle alongside our suite of other Responsible AI principles, right? Bias and inclusion, transparency, human oversight, all the rest of it. And the first thing that comes up when you advocate for a principle, and they did actually implement it, is "what are you going to do about it?" And so we had a lot of conversation about exactly that, and really started to hone in on energy transparency, in part because, from a governance perspective, that's an easy thing to at least conceptualize, right? You can get a number.
Chris Adams: Mmm.
Dawn Nafus: You know, it's the place where people's heads first go to. And of course it's the biggest part of, or a very large part of, the problem in the first place. Something that you can actually control at a development level. But once we started poking at it, it was, "what do we actually mean by measuring? And for what? And for whom?"
So as an example, if we measured, say, the last training run, that'll give you a nice guesstimate for your next training run, but that's not a carbon footprint, right? A footprint is everything that you've done before that, which folks might not have kept track of, right? So we were really starting to wrestle with this. And then in parallel, in Labs, we were doing some socio-technical work on carbon awareness. And there too, we had to start with measuring, right? You had to start somewhere. And so that's exactly what the team did. And they found, interestingly, or painfully, depending on your point of view, look, this stuff ain't so simple, right? If what you're doing is running a giant training run, you stick CodeCarbon in, or whatever it is, and sure, you can get absolutely a reasonable number. If you're trying to do something a little bit more granular, a little bit trickier, it turns out you actually have to know what you're looking at inside a data center, and frankly, we didn't, as machine learning people primarily. And so we hit a lot of barriers, and what we wanted to do was to say, okay, there are plenty of other people who are going to find the same stuff we did, and they shouldn't have to find out the hard way. So that was the motivation.
Chris Adams: Well, I'm glad that you did, because this was actually the thing that we found as well when we were looking into this. It looks simple on the outside, and then it feels a bit like a fractal of complexity, and there are various layers that you need to be thinking about. And this is one thing I really appreciated in the paper, that that was broken out like that. So you can at least have a model to think about it.
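Dawn's distinction between a single-run estimate and a cumulative footprint can be put into a small sketch. This is a hypothetical helper, not anything from the paper: it simply logs each run's measured energy (however you obtained it, for example from a tool like CodeCarbon), so the footprint is the sum of everything run so far, not a guesstimate from the last run.

```python
# Hypothetical run log illustrating Dawn's point: a footprint is the
# cumulative total of every run, not the measurement of the last one.
from dataclasses import dataclass, field

@dataclass
class RunLog:
    runs: list = field(default_factory=list)  # (name, kwh) pairs

    def record(self, name: str, kwh: float) -> None:
        """Record the measured energy of one training run."""
        self.runs.append((name, kwh))

    def last_run_estimate(self) -> float:
        """A 'nice guesstimate' for the next run: just the last one."""
        return self.runs[-1][1] if self.runs else 0.0

    def cumulative_kwh(self) -> float:
        """The footprint view: everything done so far."""
        return sum(kwh for _, kwh in self.runs)

log = RunLog()
for i, kwh in enumerate([120.0, 95.0, 310.0]):  # e.g. three runs of a sweep
    log.record(f"run-{i}", kwh)

print(log.last_run_estimate())  # 310.0: what the next run might cost
print(log.cumulative_kwh())     # 525.0: what you have actually spent
```

The point of the toy is the gap between the two numbers: if nobody kept the log, only the last-run figure survives.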
And Charles, maybe this is actually one thing I can hand over to you, because I spoke about this kind of hierarchy of things you might do. Like, there's stuff you might do at a data facility level, right the way down to a node level, for example. Can you take me through some of the ideas there? Because I know, for people who haven't read the paper yet, that seemed to be one of the key ideas behind this, that there are different places where you might make an intervention. And this is actually a key thing to take away if you're trying to interrogate this for the first time.
Charles Tripp: Yeah, I think it's both interventions and measurement, or, I should say, it's really more estimation, at any level. And it also depends on your goals and perspective. So, like, if you are operating a data center, you're probably concerned with the entire data center, right? Like the cooling systems, the idle power draw, converting power to different levels, transformer efficiency, things like that. Maybe even the transmission line losses and all of these things. And you may not really care too much about, like, the code level, right? So the types of measurements you might take there, or estimates you might make, are going to be different. They're going to be at the system level. Like, how much is my cooling system using in different operating conditions, different environmental conditions? From a user's perspective, you might care a lot more about, like, how much energy, how much carbon is this job using? And that's going to depend on those data center variables. But there's also a degree of, well, the data center is going to be running whether or not I run my job, right? So I really care about my job's impact more.
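Charles's job-level framing can be sketched as back-of-the-envelope arithmetic. This is my own illustration, not a formula from the paper: it scales a node-level power measurement up to a facility-level carbon estimate using a power usage effectiveness (PUE) factor for the facility overheads he mentions (cooling, power conversion) and a grid carbon intensity figure. All the numbers are made up.

```python
def job_footprint_g_co2e(node_kw: float, hours: float,
                         pue: float, grid_g_per_kwh: float) -> float:
    """Scale a node-level measurement up the hierarchy.
    node_kw: average node power drawn by the job;
    pue: facility overhead factor (cooling, power conversion, etc.);
    grid_g_per_kwh: carbon intensity of the local grid."""
    node_kwh = node_kw * hours      # energy seen at the node level
    facility_kwh = node_kwh * pue   # add facility-level overheads
    return facility_kwh * grid_g_per_kwh

# Invented example: a 0.5 kW node for 10 hours, PUE 1.5, 400 gCO2e/kWh.
print(job_footprint_g_co2e(0.5, 10, 1.5, 400))  # 3000.0 grams
```

Note this is exactly the attribution question Charles raises: the facility would draw much of that overhead power whether or not the job ran, so a number like this is an estimate whose meaning depends on which level you care about.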
And then I might be caring about much shorter term, more local estimates, like ones that might come from measuring the power of the nodes that I'm running on, which was what we did at NREL, or much higher frequency, but less accurate, measurements that come from the hardware itself. Most modern computing hardware has a way to get these hardware estimates of the current power consumption, and you could log those. And there are also difficulties once you start doing that: the measurement itself can cause energy consumption, right? And also potentially interfere with your software and cause it to run more slowly and potentially use more energy. So, like, there's difficulties there at that level. Yeah, but there's a whole suite of tools that are appropriate for different uses and purposes, right? Like, measuring the power at the wall going into the data center may be useful at the data center or multiple data center level. It still doesn't tell you the whole story, right? Like, the losses in the transmission lines, and where did that power come from, are still not accounted for. But it also doesn't give you a sense for what happens if I take interventions at the user level. It's very hard to see that from that high level, right? Because there's many things running on the system, different conditions there. From the user's point of view, they might only care about, like, this one key piece of my software that's running, you know, like the kernel of this deep learning network. How much energy is that taking? How much additional energy is that taking? And that's a very different thing, for which very different measurements are appropriate, and interventions, right? Like optimizing a little piece of code, versus maybe we need to change the way our cooling system works on the whole data center, or the way that we schedule jobs.
Yeah, and the paper goes through many of these levels of granularity.
Chris Adams: Yeah, so this is one thing that really stuck out at me, because it started at the kind of facility level, which is looking at an entire building, where you mentioned things like, say, power coming into the entire facility. And then you went down to looking at, within that facility, there might be one or more data centers, then you're going down to things like a rack level, and then you're going down to kind of a node level, and then you're even going all the way down to, like, a particularly tight loop or the equivalent of that. And when you're looking at things like this, there are questions about, if you make something particularly efficient at, say, the bottom level, the node level, that might not have an impact higher up, for example, because that capacity might just be reallocated to someone else. Or it might just be that there's a certain kind of minimum amount of power draw that you aren't able to have much of an impact on. I mean, these are some of the things I was surprised by, or not surprised by, but I really appreciated breaking these out, because one thing that was, I guess, counterintuitive when I was looking at this was that things you might do at one level can actually hinder steps further down, for example, and vice versa.
Charles Tripp: Yeah, that's right. I mean, I think two important sort of findings are, yeah, like battle scars that we got from doing these measurements. One data set we produced is called BUTTER-E, which is a really large scale measurement of the energy consumption of training and testing neural networks, and how the architecture impacts it. And we were trying to get reasonable measurements while doing this.
One of the difficulties is that comparing measurements between runs on different systems, even if they're identically configured, can be tricky, because different systems, based on manufacturing variances, the heat, like how warm that system is at the time, anything that might be happening in the background or over the network, anything that might be just a little different about its environment, can have real, measurable impacts on the energy consumed. So, when comparing energy consumption between runs on different nodes, even with identical configurations, we had to account for biases. Like, oh, this node draws a little bit more power than this one at idle, and we have to adjust for that in order to make a clear comparison of what the difference was. And this problem gets bigger when you have different system configurations, or even the same configuration but running in a totally different data center. So that was one tricky finding. And I think there are two other little ones I can mention; maybe we could go into more detail later. Another one, like you mentioned, is the overall system utilization, and how that's impacted by a particular piece of software, a particular job running, is going to vary a lot based on what the other users of the system are doing and how that system is scheduled. So you can definitely get into situations where, yeah, I reduced my energy consumption, but that energy is just going to be used by the total system some other time, especially if the energy consumption savings I get are from shortening the amount of time I'm using a resource, and then someone else uses it. But it does mean that the computing is being done more efficiently, right? Like, if everyone does that, then more computing can be done within the same amount of energy. But it's hard to quantify that. Like, what is my impact?
It's hard to say, right?
Chris Adams: I see, yeah. And Dawn, go on, I can see you nodding, so I want you to come in now.
Dawn Nafus: If I can jump in a bit, I think that speaks to one of the things we're trying to bring out, maybe not literally, but make possible. Those things could actually be better aligned in a certain way, right? Like, for example, when there is idle time, there are things that data center operators can do to reduce that, right? You can bring things into lower power states, all the rest of it. So, in a way, the developer can't control it, but if they don't actually know that's going on, and it's just like, well, it's there anyway, there's nothing for me to do, that's also a problem, right? So you've got two different kinds of actors looking at it from very different perspectives. And the clearer we can get about roles and responsibilities, the more you can start to do things like reduce your power when things are idling. Yes, you do have that problem of somebody else jumping in. But Charles, I think as your work shows, there's still some idling going on, even though you wouldn't think so. Maybe you could talk a little bit about that.
Charles Tripp: Yeah, so one really interesting thing that I didn't expect going into doing these measurements and this type of analysis was, well, first, I thought, "oh great, we can just measure the power on each node, run things, and compare them." And we ran into problems immediately.
Like, you couldn't compare the energy consumption from two identically configured systems directly, especially if you're collecting a lot of data, because one is just going to use, like, slightly more than the other because of the different variables I mentioned. And then when you compare them, you're like, well, that run used way more energy, but it's not because of anything about how the job was configured. It's just that system used a little bit more. So if I switched them, I'd get the opposite result. So that was one thing. But then, as we got into it, we were trying to figure out, okay, well, now that we've figured out a way to account for these variations, let's see what the impact is of running different software with different configurations, especially, like, neural networks with different configurations, on energy consumption. And our initial hypothesis was that it was based mainly on the size of the neural network, you know, like how many parameters, basically how many calculations, these sorts of things. And if you look at the research, a lot of the research out there about making neural networks, and largely algorithms in general, more efficient focuses on how many operations, how many flops does this take, you know? And look, we reduced it by a huge amount, so that means we get the same energy consumption reductions. We kind of thought that was probably true for the most part. But as we took measurements, we found that had almost no connection to how much energy was consumed. And the reason was that the amount of energy consumed had way more to do with how much data was moved around on the computer. So how much data was loaded from the network? How much data was loaded from disk? How much data was loaded from disk into memory, into GPU RAM for using the GPU, into the different caching levels, and even the registers?
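(An aside: the per-node baseline adjustment Charles describes could be sketched roughly like the snippet below. The wattage and joule figures are purely illustrative, and this is a simplification of what a real measurement pipeline would do, not the study's actual method.)

```python
# Sketch: comparing energy between runs on two nodes by first
# subtracting each node's own measured idle baseline.
# All numbers are made up for illustration.

idle_watts = {"node-a": 210.0, "node-b": 218.0}  # per-node idle draw

def adjusted_joules(node, total_joules, seconds):
    """Energy attributable to the job, minus the node's idle baseline."""
    return total_joules - idle_watts[node] * seconds

run_a = adjusted_joules("node-a", total_joules=95_000, seconds=300)
run_b = adjusted_joules("node-b", total_joules=97_500, seconds=300)

# The raw totals differ by 2.5 kJ, but after the baseline adjustment
# the two runs look nearly identical: the gap was mostly the node,
# not the job.
print(run_a, run_b)  # 32000.0 32100.0
```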
So if we computed, like, how much data got moved in and out of, like, level two cache on the CPU, we could see that had a huge correlation, like almost direct correlation, with energy consumption. Not the number of calculations. Now, you could get in a situation where, like, basically no data is leaving cache and I'm doing a ton of computing on that data. In that case, probably the number of calculations does matter, but in most cases, especially in deep learning, it has almost no connection; it's the amount of data moved. So then we thought, okay, well, it's the amount of data moved. It's the data moving. The data has a certain cost. But then we looked deeper, and we saw that actually the amount of data moved is not really what's causing the energy to be consumed. It's the stalls while the system is waiting to load the data. It's waiting for the data to come from, you know, system memory into level three cache. It needs to do some calculations on that data, so it's sitting there waiting. It's that idle power draw. It could be for, like, a millisecond or even a nanosecond or something, right? But it adds up if you have, you know, billions of accesses. Each of those little stalls is drawing some power, and it adds up to be quite a significant amount of power. So we found that the driver of the energy consumption, the primary driver by far in what we were studying in deep learning, was the idle power draw while waiting for data to move around the system. And this was, like, really surprising, because we started with number of calculations, which turns out to be almost irrelevant, right? And then we're like, well, is it the amount of data moved around? It's actually not quite the amount of data moved around, but that does, like, cause the stalls whenever I need to access the data. It's really that idle power draw. And I think that's probably true for a lot of software.

Chris Adams: Yes.
I think that does sound about right. I'm just gonna check if I follow that, because I think there were a few quite key ideas there, but if you aren't familiar with how computers are designed, they might not land, so I'll try to paraphrase. So we've had this idea that the main thing is, like, the number of calculations being done. That's what we thought was the key idea.

Charles Tripp: How much work, you know.

Chris Adams: Yeah, exactly. And what we know is that inside a computer you have multiple layers of, let's call them, say, caches, multiple layers where you might store data so it's easy and fast to access, but that starts quite small and then gets larger and larger, and a little bit slower, at each level. So you might have, like you said, L2 cache, for example, and that's going to be much, much faster but smaller than, say, the RAM on your system. And if you go a bit further down, you've got, like, a disk, which is going to be way larger, and that's going to be somewhat slower still. So moving between these stages so that you can process data, that was actually one of the things that you were looking at. And it turned out that, well, there is some correlation there, but one of the key drivers actually is the chips being kept in a kind of ready state, waiting for that stuff to come in. They can't really be asleep, because they know the data is going to have to come in and they have to process it. They have to be almost anticipating at all these levels. And that's one of the big drivers of the resource use and the energy use.

Charles Tripp: I mean, so, like, what we saw was, we actually estimated how much energy it took, like, per byte to move data from, like, system RAM to level three cache, to level two, to level one, to a register, at each level. And in some cases, it was so small we couldn't even really estimate it.
But in most cases, we were able to get an estimate for that. But a much larger cost was initiating the transfer, and even bigger than that was just the idle power draw during the time that the program executed, and how long it executed for. And by combining those, we were able to estimate that most of that power consumption, like 99 percent in most cases, was from that idle time, even those little micro-stalls waiting for the data to move around. And that's because moving the data, while it does take some energy, doesn't take that much in comparison to the amount of energy of, like, keeping the RAM on with the data just, like, alive in the RAM, or keeping the CPU active, right? Like, CPUs can go into lower power states, but generally at least part of the system has to shut down, so doing it at a very fine-grained scale is not really feasible. Many systems can change power state at a faster rate than you might imagine, but it's still a lot slower than, like, a per-instruction, per-byte level of, "I need to load this data. Okay, shut down the system and wait," right? Not a second, like a few nanoseconds. It's just not practical to do that. So it's keeping everything on during that time that's sucking up most of the power. So one strategy, a simple strategy, but difficult to implement in some cases, is to initiate that load, that transfer, earlier. So if you can prefetch the data into the higher levels of memory before you hit the stall where you're waiting to actually use it, you can probably significantly reduce this power consumption due to that idle wait. But it's difficult to figure out how to properly do that prefetching.

Chris Adams: Ah, I see. Thanks, Charles.
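(As a rough illustration of the prefetching idea Charles mentions, here's a minimal Python sketch that overlaps loading the next batch with processing the current one. `load_batch` and `process` are hypothetical stand-ins; real prefetching into CPU caches or GPU memory happens at a much lower level, but the overlap pattern is the same.)

```python
# Sketch: hiding data-movement stalls by prefetching the next batch
# on a background thread while the current batch is being processed.
from concurrent.futures import ThreadPoolExecutor

def load_batch(i):
    # Stand-in for a slow fetch from disk, network, or host memory.
    return list(range(i * 4, i * 4 + 4))

def process(batch):
    # Stand-in for the actual computation on a batch.
    return sum(batch)

results = []
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(load_batch, 0)        # start the first load
    for i in range(1, 5):
        batch = future.result()                # wait only if the load lagged
        future = pool.submit(load_batch, i)    # prefetch the next batch...
        results.append(process(batch))         # ...while computing this one
    results.append(process(future.result()))   # last batch
print(results)  # [6, 22, 38, 54, 70]
```

The compute on batch *i* now overlaps the load of batch *i+1*, so the processor spends less time stalled waiting for data.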
So it sounds like we might approach this and there might be some things which feel kind of intuitive, but it turns out there's quite a few counterintuitive things. And like, Dawn, I can see you nodding away sagely here, and I suspect there's a few things that you might have to add on this. Can I give you a bit of space, Dawn, to kind of talk about some of this too? Because I know this is something that you've shared with me before, that yeah, there are maybe some rules of thumb you might use, but it's never that simple, basically, or you realise actually that there's quite a bit more to it than that, for example.

Dawn Nafus: Exactly. Well, I think what I really learned out of this effort is that measurement can actually recalibrate your rules of thumb, right? So you don't actually have to be measuring all the time for all reasons, but even just the simple, I mean, not so simple, story that Charles told, like, okay, you know, so I spent a lot of time talking with developers and trying to understand how they work, and at a developer perception level, right? What do they feel like? What's palpable to them, right? Send the stuff off, go have a cup of coffee, whatever it is, right? So they're not seeing all that, you know, and when I talk to them, most of them aren't thinking about the kinds of things that were just raised, right? Like, how much data are you looking at at a time? You can actually set and tweak that. And that's the kind of thing, you know, folks develop an idea about, and they don't think too hard about it usually, right? So, with measuring, you can start to actually recalibrate the things you do see, right?
I think this also gets back to, you know, why is it counterintuitive that it's some of these mechanisms of how you actually are training, as opposed to how many flops you're doing, how many parameters. Why is that counterintuitive? Well, at a certain level, you know, the number of flops does actually matter, right? If we do actually have gigantic, you know, what I'm gonna call foundation-model-type-size stuff, and I'm gonna build out an entire data center for it, it does matter. But as you get, you know, down and down and more specific, it's a different ball game. And there are these tricks of scale that are sort of throughout this stuff, right? Like the fact that, yes, you can make a credible claim that a foundation model will always be more energy intensive than, you know, something so small you can run it on a laptop, right? That's always going to be true, right? No measurement necessary, right? You keep going down and down, and you're like, okay, let's get more specific. You can get to, and actually this is where our frustration really started, if you try to go to the extreme, right, try to chase every single electron through a data center, you're not going to do it. It feels like physics, it feels objective, it feels true, but at minimum you start to hit the observer effect, right? Which is what we did. My colleague Nicole Beckage was trying to measure at an epoch level, right, sort of essentially a mini round of training. And what she found was that, you know, she was trying to sample so often that she was pulling energy out of the processing, and it just messed up the numbers, right? So you can try to get down, you know, into what feels like more accuracy, and then all of a sudden you're in a different ballpark.
So these tricks of, like, aggregation and scale, and what can you say credibly at what level, I think are fascinating, but you kind of have to get a feel for it, in the same way that you can get a feel for, "yep, if I'm sending my job off, I know I have at least, you know, however many hours or however many days," right?

Charles Tripp: There's also so much variation that's out of your control, right? Like, one run to another, one system to another, even different times when you ran on the same system, can cause measurable, and in some cases significant, variations in the energy consumption. So it's more, I think, about understanding what's causing the energy consumption. I think that's the more valuable thing to do. But it's easy to be like, "I already understand it." And I think there's, like, a historical bias towards number of operations, because in old computers, without much caching or anything like this, right? Like, I restore old computers, and, like, an old 386 or IBM XT, right? Like, when it's running, it has registers in the CPU and then it has main memory. And almost everything, basically how many operations I'm doing, is going to closely correlate with how fast the thing runs and probably how much energy it uses, because most of the energy consumption on those systems is just basically constant, no matter what I'm doing, right? It doesn't, like, idle down the processor while it's not working, right? So there's a historical bias that's built up over time, focused on the number of operations. And it's also at the programmer level. Like, I'm thinking about, what is the computer doing?
Chris Adams: What do I have control over?

Charles Tripp: But it's only through actually measuring it that you gain a clearer picture of, like, what is actually using energy. And I think if you get that picture, then you'll gain more of an understanding of how can I make this software, or the data center, or anything in between, like job allocation, more energy efficient. But it's only through actually measuring that we can get that clear picture. Because if we guess, especially using kind of our biases from how we learned to use computers, how we learned about how computers work, we're actually very likely to get an incorrect understanding, an incorrect picture, of what's driving the energy consumption. It's much less intuitive than people think.

Chris Adams: Ah, okay, there's a couple of things I'd like to comment on, and then, Dawn, I might give you a bit of space on this. So we were just talking about flops as a thing that people are used to looking at, and, like, it's literally written into the AI Act: things above a certain number of flops are considered, you know, foundational models, for example. So, you know, that's a really good example of what this actually might be. And I guess the other thing that I wanted to kind of touch on is that I work in the kind of web land, and, like, I mean, the Green Web Foundation is a clue in our organization's name. We've had exactly the same thing, where we've been struggling to understand the impact of, say, moving data around, and how much credence you should give to that versus things happening inside a browser, for example. It looks like you've got some similar kinds of issues and things to be wrestling with here.
But Dawn, I wanted to give you a bit of space, because both of you alluded to this idea of having an understanding of what you can and what you can't control, and how you might have a bias for doing one thing and then miss something much larger elsewhere, for example. Can I maybe give you a bit of space to talk about this idea of, okay, well, which things should you be focusing on, and also understanding what's within your sphere of influence? What can you control? What can't you control, for example?

Dawn Nafus: Exactly. I think in a sense you've captured the main point, which is, you know, that measurements are most helpful when they are relevant to the thing you can control, right? So as a very simple example, you know, there are plenty of AI developers who have a choice in what data centers they can use. There are plenty who don't, right? You know, when Charles worked at NREL, right, the supercomputer was there. That was it. You're not moving, right? So, if you can move, you know, that overall data center efficiency number really matters, because you can say, alright, "I'm putting my stuff here and not there." If you can't move, like, there's no need to mess with it. It is what it is, right? At the same time, and this gets us into this interesting problem, again, a tension between what you might look at from a policy perspective versus what a developer might look at. We had a kind of, you know, can I say, come to Jesus? We had a little moment, where, is that okay on a podcast? I think I can. Where there was this question of, are we giving people a bum steer by focusing on, you know, granular developer-level stuff, right? When so much actually is in how you run the data center, right? So again, you talk about tricks of scale.
On the one hand, you know, the amount of energy that you might be directly saving, by the time all of those things move through the grid and you're talking about, you know, energy coming off of the transmission cables, right, in aggregate might not actually be directly that big. It might be, but it might not be. And then you flip that around and you think about what aggregate demand looks like, and the fact that so much of AI demand is, you know, what's putting pressure on our electricity grid, right? Then that's the most effective thing you could do, is actually get these, you know, very specific individual jobs down and down, right? So, again, it's all about what you can control, but whatever perspective you take is just going to flip your, you know, your understanding of the issue around.

Chris Adams: So this was actually one thing I quite appreciated from the paper. There were a few things saying, and it does touch on this idea, that yeah, you might be focusing on the thing that you feel that you're able to control, but just because you're able to make one part of this very efficient, that doesn't necessarily translate into a saving higher up in the system, simply because if higher up in the system isn't set up to actually take advantage of that, then you might never achieve some of these savings. It's a little bit like when you're working in cloud, for example, people tell you to do all these things to kind of optimize your cloud savings.
But if people are not turning data centers off, at best you might be slowing the growth of infrastructure rollout in future, and these are much, much harder things to kind of claim responsibility for, or say, "yeah, if it weren't for me doing those things, we wouldn't have had that happen." This is one of the things I appreciated the paper making some allusions to. And to be honest, when I was reading this, I was like, wow, there was obviously some stuff for beginners, but there's actually quite a lot here which is quite meaty for people who are thinking about it at a much larger, systemic level. So there's definitely things experts could take away from this as well. So, I just want to check, are there any particular takeaways the two of you would like to draw people's attention to beyond what we've been discussing so far? Because I quite enjoyed the paper, and there's a few kind of nice ideas from this. Charles, if I just give you a bit of space to kind of come in.

Charles Tripp: Yeah. I've got kind of two topics that I think build on what we talked about before, but could be really useful for people to be aware of. So one is, sort of one of the outcomes of our study of the impact of different architectures, data sets, and hyperparameter settings on deep neural network energy consumption was that the most energy efficient networks, and largely that correlates with the most time efficient as well, but not always, the most efficient ones were not the smallest ones, and they were not the biggest ones, right? The biggest ones just required so much data movement. They were slow. The smallest ones took a lot more iterations, right? It took a lot more for them to learn the same thing.
And the most efficient ones were the ones where the working sets, where the amount of data that was moved around, matched the different cache sizes. So as you made the network bigger, it got more efficient, because it learned faster. Then, when it got so big that the data between layers, the communication between layers, for example, started to spill out of a cache level, it became much less energy efficient, because of that data movement stall happening. So we found that there is, like, an optimum point there. And for most algorithms this is probably true: if the working set is sized appropriately for the memory hierarchy, you gain the most efficiency, right? Because generally, as I can use more data at a time, I can get my software to work better, right, more efficiently. But there's a point where it falls out of the cache, and that becomes less efficient. Exactly what point is going to depend on the software. But I think focusing on that working set size and how it matches to the hardware is a really key piece for almost anyone looking to optimize software for energy efficiency. Think about: how much data am I moving around, and how does that map to the cache? So that's, like, a practical thing.

Chris Adams: Can I stop you? Because I find that quite interesting, in that a lot of the time as developers we're kind of taught to abstract away from the underlying hardware, and that seems to be going the other way. That's saying, "no, you do need to be thinking about this. You can't. There's, you know, there's no magic trick."

Charles Tripp: Right? And so, like, for neural networks, that could mean sizing my layers so that those working sets match the cache hierarchy, which is something that no one even considers. It's not even close in most architectures. Like, no one has even thought about this.
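(A back-of-the-envelope sketch of that working-set check might look like this. The cache size, the float32 assumption, and the way the working set is counted, weights plus input and output activations, are illustrative assumptions, not a recipe from the paper.)

```python
# Sketch: checking whether a layer's per-step working set fits a
# given cache level. All sizes here are illustrative.
L2_BYTES = 2 * 1024 * 1024   # e.g. a 2 MiB per-core L2 cache
BYTES_PER_PARAM = 4          # assuming float32

def layer_working_set(n_in, n_out, batch):
    """Rough bytes touched per step: weights plus in/out activations."""
    weights = n_in * n_out
    activations = batch * (n_in + n_out)
    return (weights + activations) * BYTES_PER_PARAM

# A 512x512 layer at batch size 32 fits within this L2 budget...
print(layer_working_set(512, 512, 32) <= L2_BYTES)    # True
# ...while a 2048x2048 layer spills far outside it, so every step
# stalls on traffic to lower, slower levels of the hierarchy.
print(layer_working_set(2048, 2048, 32) <= L2_BYTES)  # False
```

The point of the sketch is the comparison, not the exact byte counts: once the working set spills past a cache level, each step pays the stall cost Charles describes.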
The other thing is on your point about data center operations and kind of the different perspectives. One thing that we started to think about as we were doing some of this work was, it might make sense to allocate time, or, in the case of, like, a commercial data center or commercial cloud operator, even, like, charge fees, based at least partly on the energy rather than the time, so as to incentivize them to use less energy, right? Like, make things more energy efficient. Those can be correlated, but not always, right? And another piece of that same puzzle that I want to touch on is, from a lot of data center operators' perspective, they want to show their systems fully utilized, right? Like, there's demand for the system, so we should build an even bigger and better system. When it comes to energy consumption, that's probably not the best way to go, because that means those systems are sitting there probably doing inefficient things, maybe even idling a lot of the time, right? Like, a user allocated the node, but it's just sitting there doing nothing, right? It may be more useful, instead of thinking about, like, how much is the system always being utilized, to think about how much computation, or how many jobs, or whatever your utilization metric is, do I get per unit energy, right? Or per unit carbon, right? And you may also think about, like, how much energy savings can I get by doing things like shutting down nodes when they're unlikely to be utilized, and more about having a dynamic capacity, right? Like, at full tilt I can do however many flops, right? But I can also scale that down to reduce my idle power draw by, you know, 50 percent in low demand conditions. And if you have that dynamic capacity, you may actually be able to get even more throughput, but with less energy, because when there's no demand, I'm scaling down my data center, right?
And then when there's demand, I'm scaling it up. But these are things that require cultural changes in data center operations to happen.

Chris Adams: I'm glad you mentioned this here, because, Dawn, I know that you had some notes about this. It sounds like, in order to do that, you probably need different metrics exposed, or different kinds of transparency to what we have right now. Probably more, actually. Dawn, can I give you a bit of space to talk about this? Because this is one thing that you told me about before, and it's something that is touched on in the paper quite a few times, actually.

Dawn Nafus: Yeah, I mean, I think we can notice a real gap between the kinds of things that Charles brings his attention to and the kinds of things that show up in policy environments, in responsible AI circles, right, where I'm a bit closer. We can be a bit vague. And I think we are at the stage where, at least my read on the situation is that, regardless of where you sit in the debates, and there are rip-roaring debates about what to do about the AI energy situation, I think transparency is probably the one thing we can get the most consensus on. But then, just back to that, what the heck does that mean? And I think we need a few more beats than are currently given to what work those measurements are actually doing. You know, some of the feedback we've gotten is, you know, "well, can't you just come up with a standard?" Like, what's the right standard? It's like, well, no, actually. If data centers aren't standard, and there are many different ways to build a model, then, yes, you can have a standard as a way of having a conversation across a number of different parties to do a very specific thing. Like, for example, Charles's example, you know, suggested that if we're charging on a per-energy basis, that changes a whole lot, right?
But what you can't do is to say, this is the standard that is the right way to do it, and then that meets the requirement, because that's, you know, what we found is that clearly the world is far more, you know, complicated and specific than that.So, I, you know, I would really encourage the responsible AI community to start to get very specific very quickly, which I don't yet see happening, but I think it's just on the horizon. Chris Adams: Okay. Well I'm glad you mentioned about maybe taking this a little bit wider 'cause we've dived quite spent a lot of time talking about this paper, but there's other things happening in the world of AI actually, and I wanna give you folks a bit of space to kind of talk about anything that like, or things that you are, that you would like to kind of direct some attention to or you've seen that really you found particularly interesting.Charles, can I give you some space first and then give Dawn the same, to like say it to like I know, either shout out or point to some particular things that, if they've found this conversation interesting so far, what they might want to be looking at. More data.Charles Tripp: Yeah. I mean, I think, both in like computer program, computer science at large and especially in machine learning, we've kind of had an attitude, especially within deep learning within machine learning, an attitude of throwing more compute at the problem, right? And more data. The more data that we put through a model and the bigger, the more complicated the model is, the more capable it can be.But this brute force approach is one of the main things that's driving this increasing computing energy consumption. Right? And I think that it is high time that we start taking a look at making the algorithms we use more energy efficient instead of just throwing more compute. 
It's easy to throw more compute at it, which is why it's been done. And also because there hasn't been a significant, like, material incremental cost. Like, "oh, you know, now we need twice as many GPUs. No big deal." But now we're starting to hit constraints, because we haven't thought about that incremental energy cost. We haven't had to, as an industry at large, right? But now it's starting to be, like, well, we can't build that data center, because we can't get the energy to it that we need to do the things we want to do with it, because we haven't taken that incremental cost into account over time. We just kind of ignored it, and now we've hit, like, a barrier, right? And so I think thinking about the energy costs, and probably this means investing in finding more efficient algorithms, more efficient approaches, as well as more efficient ways to run data centers and run jobs, that's gonna become increasingly important, even as our compute capacity continues to increase. The energy costs are likely to increase along with that as we use more and more, and we need to create more generation capacity, right? Like, it's expensive at the point where we're really driving that energy production, and that's going to be an increasingly important cost, as well as now starting to be a constraint on what kind of computing we can do. So I think investing in more efficient approaches is going to be really key in the future.
Chris Adams: There's one thing that I think Dawn might come in on, actually. It seems that you're talking about having more of a focus on surfacing the fact that resource efficiency is actually going to be something that we probably need to value, because as I understand it, so far it's not particularly visible in benchmarks or anything like that right now. And if you have benchmarks deciding what counts as a good model or a good use of this, until that's included, you're not going to have anything like this. Is that the kind of stuff you're suggesting we should probably have? Like, some more recognition of the energy efficiency of something, making that the thing that you draw attention to, or include in counting something as good or not, essentially?

Dawn Nafus: You know, I have a particular view of efficiency. I suspect many of your listeners might, as well. You know, I think it's notable that at the moment when we're seeing the, you know, the model of the month, apparently, or the set of models, DeepSeek, come onto the scene, immediately we're starting to see, for the first time, you know, a Jevons paradox showing up in the public discourse. So this is the paradox that when you make things more efficient, you can also end up stimulating so much demand...

Chris Adams: Absolute use grows even though it gets individually more efficient.

Dawn Nafus: Yeah, exactly. Again, this is like this topsy-turvy world that we're in.
And so, you know, now the Jevons paradox is front-page news. You know, my view is that, yes, again, we need to be particular about what sorts of efficiencies we are looking for, and where, and not, you know, sort of willy-nilly, which I'm not saying you're doing, Charles, but what we don't want to do is create an environment where if you can just say it's more efficient, then somehow, you know, we're all good, right? Which is, you know, what some of the social science of Energy Star has actually suggested is going on. With that said, right, I am a big fan of the Hugging Face Energy Star initiative. That looks incredibly promising. So this is, you know, leaderboards: when people put their models up on Hugging Face, there's some energy measurement that happens, some carbon measurement, and then, you know, leaderboards are created and all the rest of it. And I think there are a few things it's really good at, right, I can imagine issues as well, but you're, A, you know, creating a way to give some people credit for actually looking. B, you're creating a way of distinguishing between two models very clearly, right? So in that context, do you have to be perfect about how many kilowatts or watts or whatever it is? No, actually, right? You know, you're looking at more or less comparable models. But C, it also interjects this kind of path dependence. Like, who is the next person who uses it, right? That really matters. If you're setting up something early on, yes, they'll do something a little bit different. They might not just run inference on it. But you're changing how models evolve over time, and kind of steering it towards even, you know, having energy presence at all. So that's pretty cool to my mind. So I'm looking forward to...

Chris Adams: Cool. We'll share a link to the Hugging Face.
I think it was initially called the Energy Star Alliance, and then I think they've been told they need to change the name to the Energy Score Alliance, because Energy Star turned out to be a trademark. But we can definitely add a link to that in the show notes, because this is officially visible now. It's something people have been working on since late last year, and we'll share a link to the actual GitHub repo, to the code to run this, because it works for both closed source models and open source models. So it does give some of that visibility. Also in France, there is the Frugal LLM challenge, which sounds similar to what you're talking about: this idea of paying a bit more attention to the energy efficiency aspect of all this. And I'm glad you mentioned the DeepSeek thing as well, because suddenly everyone in the world is an armchair expert on William Stanley Jevons and his paradox. Everybody knows! Yeah. Dawn Nafus: Actually, if I could just add one small thing, since you mentioned the Frugal effort in France: there's a whole computer science community, sort of at arm's length from the AI development community, that's really into just asking, "look, what is the purpose of the thing that I'm building, period?" And that whole world, frugal computing, computing within limits, all of that, is really about how do we get something that somebody is going to actually value, as opposed to getting to the next score on a benchmark leaderboard somewhere. 
so I think that's kind of also lurking in the background here.Chris Adams: I'm glad you mentioned this. What we'll do is add links to both of those, and you immediately make me think of something. We're technologists, mostly, the three of us talking about this, and I work in a civil society organization. Just this week, there was a big announcement, a kind of set of demands from civil society about AI, being shared at the AI Action Summit, this big summit where all the great and good are meeting in Paris, as you alluded to, next week, to talk about what we should do about this. It's literally called Within Bounds, and we'll share a link to that. And it does talk about this: if we're going to be using things like AI, we need to have a discussion about what they're for. And it's the first thing I've seen that actually talks about putting some concrete limits on the amount of energy for this, because we've seen that when this is a constraint, it doesn't stop engineers. It doesn't stop innovation. People are able to build new things. We should also share a link to the interview we did with Vlad Coroamă all about Jevons paradox, I think late last year. That's a really nice deep dive for people who want to sound knowledgeable in these conversations on LinkedIn or social media right now; it's a really useful one as well. Okay, so we spoke a little bit about these ones here. Charles, are there any particular projects you'd like to name-check before we start to wrap up? Because I think we're coming up to the hour now, actually.Charles Tripp: Nothing in particular, but I did mention earlier that we published this BUTTER-E dataset and a paper along with it, as well as a larger one without energy measurements called BUTTER. Those are available online. 
You can just search for it and you'll find it right away. If that's of interest to anyone hearing this, there's a lot of measurement and analysis in there, including all the details of the analysis I mentioned, where we had this journey from number of compute cycles to amount of stall in terms of what drives energy consumption. Chris Adams: Ah, it's visible so people can see it. Oh, that's really cool, I didn't realize that. Also, while you're still here, Charles, while I have access to you: before we did this interview, you mentioned there's a whole discussion about wind turbines killing birds, and you were telling me this awesome story about how you were able to model the paths of golden eagles to essentially avoid these bird strikes happening. Is that in the public domain? Can we link to that? That sounded super cool. Charles Tripp: There are several papers. I'll have to dig up the links, but there are several papers we published, and some software also, to create these models. But yeah, I worked on a project where we took eagle biologists, computational fluid dynamics experts, and machine learning experts, and we got together and created some models based off of real data, real telemetry tracking golden eagle flight paths in many locations, including at wind sites, and matched that up with the atmospheric conditions, the flow field, like orographic updrafts, which is where the wind hits a mountain or a hill and some of it blows up. Right. And golden eagles take advantage of this, as well as thermal updrafts caused by heating at the ground causing the air to rise, to fly. Golden eagles don't really like flapping. They like gliding. 
And because of that, golden eagles and other soaring birds have flight paths that are fairly easy to predict, right? Like, you may not know, oh, are they going to take a left turn here or a right turn there, but generally they're going to fly in the places where there are strong updrafts. And using actual data and knowledge from the eagle biologists, and simulations of the flow patterns, we were able to create a model that informs where wind turbines should be sited and also how they operate, right? Like, under what conditions, what wind conditions in particular and what time of year, which also affects the eagles' behavior, should I perhaps reduce my usage of certain turbines to reduce bird strikes? And in fact, we showed that it could be done without significantly, or even at all, impacting the energy production of a wind site. You could significantly reduce the chances of colliding with a bird.Chris Adams: And it's probably good for the birds too, as well, isn't it? Yeah. Alright, we definitely need to find some links for that. That's going to be absolute catnip for the nerdy listeners who are into this. Dawn, can I give you the last word? We'll add links to you and Charles online, but if there's anything you'd like to draw people's attention to before we wrap up, what would you plug here? Dawn Nafus: I actually did want to just give a shout out to the National Renewable Energy Lab, period. One of the things that's amazing about them, speaking of eagles, a different eagle, is that they have a supercomputer called Eagle. I believe they've got another one now. It is lovingly instrumented with all sorts of energy measurements; basically anything you can think to measure, I think you can do in there. There's another dataset from another one of our co-authors, Hilary Egan, that has some jobs data. 
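The orographic-updraft idea Charles describes can be sketched as a toy calculation. The formula (wind speed times the sine of the slope, scaled by how directly the wind faces the slope) is a standard first-order approximation for slope-deflected updrafts; the curtailment threshold and all the numbers below are hypothetical and not taken from NREL's actual model:

```python
import math

def orographic_updraft(wind_speed, slope_deg, wind_dir_deg, aspect_deg):
    """Toy estimate of vertical air speed (m/s) where wind hits a slope.

    Wind blowing directly into a slope is deflected upward; wind blowing
    along or away from it produces no updraft (clamped to zero).
    """
    alignment = math.cos(math.radians(wind_dir_deg - aspect_deg))
    return max(0.0, wind_speed * math.sin(math.radians(slope_deg)) * alignment)

def curtail_turbine(updraft_ms, threshold_ms=0.8):
    """Flag a turbine for curtailment when updrafts near it are strong
    enough to attract soaring birds (threshold is invented)."""
    return updraft_ms >= threshold_ms

# 10 m/s wind nearly head-on into a 20-degree slope: a strong updraft,
# so this toy model would flag the turbine during these conditions.
w = orographic_updraft(wind_speed=10.0, slope_deg=20.0,
                       wind_dir_deg=270.0, aspect_deg=250.0)
print(f"updraft ≈ {w:.2f} m/s, curtail: {curtail_turbine(w)}")
```

A real siting model layers telemetry, thermal updrafts, and seasonal behavior on top of terrain-flow estimates like this one, but the core intuition, predict where the lift is and assume the birds will be there, is the same.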
You can dig in and explore what a real-world data center job situation looks like. So I just want to give all the credit in the world to the National Renewable Energy Lab and the stuff they do on the computing side. It's just phenomenal.Chris Adams: Yes, I would echo that very much. I'm a big fan of NREL and their output. It's really a national treasure. Folks, thank you so much for taking me through all of this work, and diving in as deeply as we did, and referring to things that soar as well, actually, Charles. I hope we can do this again sometime soon, but otherwise, have a lovely day, and thank you once again for joining us. Lovely seeing you two again.Charles Tripp: Good seeing you.Chris Adams: Okay, ciao! Hey everyone, thanks for listening. Just a reminder to follow Environment Variables on Apple Podcasts, Spotify, or wherever you get your podcasts. And please do leave a rating and review if you like what we're doing. It helps other people discover the show, and of course, we'd love to have more listeners. To find out more about the Green Software Foundation, please visit greensoftware.foundation. That's greensoftware.foundation in any browser. Thanks again, and see you in the next episode.