The Myth of Data Center Inefficiency
While data centers have historically been singled out as large energy users, recent studies show that enormous progress has been made. Eric Masanet talks through the latest innovations in this space.
Announcer: Welcome to “Not Your Father’s Data Center Podcast” brought to you by Compass Datacenters. We build for what’s next. Now, here’s your host, Raymond Hawkins.
Raymond: Welcome everybody to another edition of “Not Your Father’s Data Center.” I’m your host, Raymond Hawkins. And today, we’ll be joined by Dr. Eric Masanet. Eric joins us with an impressive resume, holding degrees from the University of Wisconsin, masters from Northwestern, and a Ph.D. from that little school you might’ve heard of Cal Berkeley. On today’s podcast, we’ll be talking about energy use in the data center industry and how it impacts global energy use. You guys will be surprised to learn that this question is asked by some really smart people who’ve done a lot of incredible study on it and released some academic papers on how the data center is impacting energy use on our globe today. And look forward to digging into that subject with Dr. Masanet. Eric, I think you’re joining us today from Santa Barbara, is that right? I think that’s a recent move for you.
Dr. Masanet: That’s right. Yeah, UC, Santa Barbara.
Raymond: All right. Lovely out there. And again, I think I’m struggling to understand how you could leave Chicago in February for UC, Santa Barbara, but there’s no explaining tastes, so good move on your part.
Dr. Masanet: Yeah.
Raymond: Well, today, I think what we want to compare notes on today is to talk about energy in the data center. I think in our space largely gets misunderstood, is it a net positive for society, is it a net negative, is it hurting the planet, is it helping the planet? There’s lots of good things that come out of computing, lots of modeling, and lots of understanding, but also lots of energy use and trying to balance this perception of is the data center space and the technology, should we support it? Is it a net good thing for mankind and the planet or a net negative? And I think, Eric, you and your team have honed in on the question specifically around energy use. Is that too big a summary or is that pretty accurate?
Dr. Masanet: No, that’s pretty accurate. But if we think about the energy impacts of digital services, so streaming versus, you know, driving to the video store in the old days or what most of us are doing now, remote working as opposed to driving into the office. To figure out whether that’s a net positive from an energy perspective or any other environmental impact, one has to look at the whole system. And we focus specifically in this paper on the data center, a key component of the digital system because there is a lot of misunderstanding about just how much energy data centers use and where that energy use is trending.
Raymond: So, and I appreciate you doing it, Eric, alluding to the study. Do you mind doing a two or three-minute, and I’ll try to make sure that I can track with it, summary of the paper and of the research and the research you’ve been doing for quite a while, but in the paper that you most recently published as well?
Dr. Masanet: Sure. Yeah, I’d be happy to do so. Well, the paper was written to fulfill really three goals. The first is there is a lot of misperception out there about the energy use of data centers. Depending on what you read, you may think that data centers are gobbling up all the world’s energy and the situation will get even worse in the future. Or you may read that data centers are really efficient. And we wanted to basically try to recalibrate the public’s understanding and also policymakers by putting out what we thought were the most rigorous best estimates of global data center energy use, kind of to set the record straight. So, that was one goal of the paper. I wrote this paper with two…well, four colleagues, two of which I’ve been working with for a long time, Dr. Arman Shehabi and Dr. John Koomey. We go way back to a study in 2007 where we were asked to calculate the energy use of U.S data centers for the U.S. Congress. And ever since then, we’ve been working to maintain a data center energy model that we update with recent data, and occasionally, we’ll publish a study using the model to weigh in on where energy use stands for data centers. So, really the first goal was to try to put out what we thought were the best available numbers on global data center energy use.
The second key goal was we wanted to ring the alarm bell somewhat. We’ve been enjoying a lot of efficiency gains over the last decade in data centers, which we can talk about more, but really those efficiency gains can’t last forever because demand for data center services is poised to grow rapidly. It has been growing rapidly and it’ll continue to grow rapidly with the emergence of some key trends like artificial intelligence or 5G or edge computing. And then the third motivation was really, there aren’t enough of us, frankly, in the research community who study data center energy use, who publish numbers, and we wanted to motivate more research with this paper. So, we released all of our datasets, our entire model, we’re opening it up for critique because we want more people to be working on this issue because we need to get a better handle on where energy use is going moving forward. And there just aren’t enough analysts out there who specialize in data centers.
Raymond: Eric, in the paper you guys talk about…so, first of all, love the idea of it being open, love the idea of having other smart guys look at smart guys’ work and check, I think we all get better, iron sharpening iron. So, I think that’s a great move and you know, it makes me think of open-source computing, right? Let’s have all the best minds look at it and let’s make sure that we’re thinking about this the right way because I think it’s an important question, right, digitization continues to increase. So, this question isn’t going away, right? People aren’t going to put away their smartphones and be happy to go back to their flip phones or not be connected at all, right? So, the digitization of the world is underway and probably not reversible. So, how do we manage these resources in a way that is smart, wise, and good for the planet? And you talk in the paper about the proliferation sort of beginning in 2010 and the growth of data centers, footprints growing, I think you mentioned in the paper four or five, maybe even sixfold, but that although the volume of compute resources is going up, that it’s not a perfect straight line correlation of energy use going up and some of the reasons why. Could you dig into a little of that because I think that’s important? Yes, there’s more compute cycles being used, but the energy cycle isn’t matching at one for one and the why behind that.
Dr. Masanet: Yeah. No, you’re exactly right. Demand for compute instances or data center services has gone up sixfold, at least sixfold, but the energy use associated with providing that level of compute hasn’t risen nearly as fast. In fact, we found that the energy trend isn’t quite flat, but it’s pretty close to flat. So, in the paper, we put down some numbers: compute instances have grown by over sixfold over the period 2010 to 2018, and over that same time period, energy use we think only rose by about 6%. That shows that there are these enormous efficiency effects, meaning the data center industry is getting better and better at providing core services with very little additional energy. And we found that there are really three major reasons for this trend. The first is that the IT devices that are, you know, the workhorses of the data center, servers, storage devices, network switches, those devices have gotten a lot more efficient. And we see this in our everyday lives, right? Our cell phones can last a lot longer on a charge than they used to 10 years ago and we get a lot more service. It’s just a general trend due to technological advances in processing technologies, memory technology, storage technologies. So, that’s one explanatory factor that’s quite large. We just have much more efficient IT equipment now than we had 10 years ago.
The second major trend is that especially large data centers are virtualizing their servers to a greater degree than in the past. What virtualization means for servers, in particular, is that servers when they’re idle, they still use power. And so, if we can utilize servers at a much greater capacity level, we’re spreading out that idle energy use across many more workloads and the net effect is that each workload comes with less energy. So, the second major trend was virtualization of servers, which has increased quite drastically over the last decade. And the third trend is that a lot of workloads have been shifting to much larger cloud and hyperscale data centers, which are run at much greater cooling efficiencies. So, some of the biggest data centers in the world where there’s a lot of compute happening have PUEs of 1.1 or in that ballpark. So, it’s really those three factors, more efficient IT equipment, greater capacity utilization of that equipment, particularly through virtualization of servers, and then shifting workloads to the cloud and to hyperscale, which have much greater cooling and power provision efficiencies. We found that those three effects largely explain this near plateau in energy use over the last decade.
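As a rough illustration of the second and third trends described above (this sketch is not part of the conversation or of the paper's model), the short Python example below shows how spreading a server's idle power across more workloads and lowering PUE both reduce the power attributable to each workload. PUE is the ratio of total facility energy to IT equipment energy. The function name and all wattages, workload counts, and the PUE of 2.0 for the small site are hypothetical values chosen for easy arithmetic; only the PUE of 1.1 comes from the discussion.

```python
def watts_per_workload(idle_watts, active_watts_each, workloads, pue):
    """Facility power attributable to one workload on a single server.

    idle_watts        -- power the server draws even when doing nothing (hypothetical)
    active_watts_each -- extra power each running workload adds (hypothetical)
    workloads         -- number of workloads sharing the server
    pue               -- total facility energy divided by IT equipment energy
    """
    it_watts = idle_watts + active_watts_each * workloads  # power drawn by the server itself
    return pue * it_watts / workloads                      # add facility overhead, then share it out


# Lightly used server in a small site with an assumed PUE of 2.0.
small_site = watts_per_workload(100.0, 10.0, workloads=2, pue=2.0)

# Heavily virtualized server in a hyperscale site with a PUE of 1.1.
hyperscale = watts_per_workload(100.0, 10.0, workloads=10, pue=1.1)

print(f"small site: {small_site:.1f} W per workload")
print(f"hyperscale: {hyperscale:.1f} W per workload")
```

With these made-up inputs the per-workload figure falls sharply in the second case, which is the direction of the effect described in the conversation; the size of the gap depends entirely on the assumed numbers.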
Raymond: So, love those three trends and I think it helps explain to a layperson like me, just because our compute cycles are growing and the amount of capacity we have is going up globally, just thinking about the global compute utilization doesn’t mean that energy’s going up at the same pace and great big reasons why we’re really efficient, our tools are better, for example, the phones are great, right? I can run my iPhone a lot longer than I could run my, you know, my phone from 10 years ago. The ability of the compute equipment to run at a higher optimum level via virtualization and then running in the most efficient locations, putting that workload instead of in a less efficient, smaller data center, having it in a very large efficient facility. When we think of people like Microsoft’s Azure platform or Amazon’s AWS or Google’s GCP, when you run workload there, it’s running in an extremely efficient place. That’s a fair description of each one of those?
Dr. Masanet: Yeah, absolutely. And I would say that the nuance about your phone lasting a lot longer than it did 10 years ago, you’re also getting a heck of a lot more service out of that same phone. So, it’s a perfect visual of getting more computational service for less energy as time goes on.
Raymond: Right, right, right. That’s an easy one and we all have them in our pocket, then that’s an easy one to get our arms around and understand. So, as we think about moving forward, you know, this is we turn around and we’ve looked back, you know, a decade and we said, “Okay, these trends have happened and this is why energy isn’t going up at the same pace as compute cycle.” What were your insights as you looked at the data about the way that power curve and meaning literally, you know, energy use and the capabilities and the compute as those two curves moving forward. Do you think they’re going to continue to look like they have over the last 10 years? What did you guys learn in that regard?
Dr. Masanet: Yeah. It’s a really great question. So, this is a good point…this is a good moment for me to point out that we really struggle in the analyst community with having data on data center operations, on the energy use of servers, storage devices and so forth. And so, one of the backstories to this study is it took us an awful long time to put together the datasets that we could use with confidence in order to weigh in on global data center energy use. And the reason for that is most data center operators don’t report their energy use. Server manufacturers may report some component level and energy use data, but in a very limited way and in a very small public dataset. So, we had to rely on a lot of inference from the data we had, a lot of talking to industry experts about the trends we were seeing. And this is one of the reasons why we want to promote more research in this area, and part of that will be hopefully getting companies and data center operators, meaning, you know, operators, but by companies, I mean device manufacturers, server manufacturers, to open up a bit, sharing more data about the energy use and other characteristics of the equipment. But from what we could tell looking at the scientific literature, the empirical data we could find, the datasets that are being reported.
You know, the energy use of servers is really tricky to pin down because on the one hand, we have these nice component level trends we’re seeing, so processors are getting more efficient. My colleague John Koomey has an effect that is named after him, it’s called Koomey’s law and it’s analogous to Moore’s law, where John has been studying the amount of compute that we’re getting out of servers compared to the amount of energy we’re putting in for a number of years and has found that so-called Koomey’s law, which he didn’t name himself, but others named, that the computations per unit of energy has been doubling around every 2.6 years or so. But that’s been slowing down a bit as, you know, processors are getting close to their physical limits in terms of you know, the hardware and so forth. So, the short answer is looking at the data, looking at the trends in energy use of servers in particular, we’re finding that they’re certainly getting more efficient, but those efficiency gains are beginning to slow down. The other major effect is that servers that are being deployed are using more memory, more storage, and so forth. And so, we still have room to enjoy a lot of efficiency gains we think, but that pace is slowing down a bit. Meanwhile, demand is going up quite rapidly and we expect the rapidly rising demand trend for data center services, whether it be streaming or artificial intelligence or collaborative tools. As the world gets more connected, as we have greater data speeds, as we depend on the internet more and more for our daily lives, there’s no doubt in our minds that demand will increase.
We’re finding that from the data we’re seeing and in our conversations with hardware manufacturers that absolutely there’s still a lot of room for the devices to continue becoming more efficient. That’s slowing down a bit as we’re running up against, you know, the physical limits of the hardware. At the same time, demand is increasing even more rapidly. So, our conclusion was that there still is room for the industry to maintain this near plateau in energy use to meet what we said would be the next doubling of demand for compute instances. Beyond that though, given how fast we think demand will rise, it’s very likely that energy use will start ticking up again because efficiency won’t be able to keep pace with rising demand. That was our major conclusion. And so, we wanted to ring the alarm bell, so to speak, saying yes, we’ve been enjoying a lot of efficiency gains in the past and yes, we can continue to enjoy them in the near term, but at some point this decade we’re going to have to reckon with demand increases outpacing the ability of efficiency to keep up. And what do we do about that as a society? And who are the stakeholders who can help us manage that potentially large source of growing energy use?
Raymond: So, Eric, I want to get into this alarm bell idea for a second. But before we do that, I want to go back to Koomey’s law. So, Moore’s law I’m familiar with, I think a lot of folks in the tech business are very familiar with that. I think essentially saying that our ability to improve the compute cycle by 100%, doubling compute cycle somewhere between 12 and 18 months really is centered around the founder at Intel around the ability to put processors on silicon. I think that’s Moore’s law. Can you give me one more take on Koomey’s law on how to make sure…I think I understood what you said, but I’m asking you to say it one more time and then I’m going to see if I can paraphrase.
Dr. Masanet: Sure. So, Koomey’s law is based on observations of the computational power of servers and their energy use, and it’s been adjusted once. So, the very first study that Koomey did, after which, you know, this term Koomey’s law came out, showed that the computations we were getting from servers for every unit of energy that we put into them would double roughly every two years. Most recently when John went back and looked at more recent data, he found that the computations per unit of energy doubled roughly every 2.6 years, which suggests that the ability of servers to provide computations in an efficient way is slowing down slightly.
Raymond: Okay. So, I want to make sure I’m gonna use numbers here, even though they’re made up. So, what I think Koomey’s law is saying is if I could get 100 outputs for 10 kilowatts of power, in his modified number now, I could get 260 outputs for 100 kilowatts of power. In other words, I’m getting 2.6 years to double the compute capacity on the same amount of energy. Is that the right way to say it?
Dr. Masanet: So, I would say that if we take as our baseline 100 units of compute, let’s say per unit of energy, in 2.6 years, we’ll get 200 units of compute.
Raymond: Two hundred units. Okay.
Dr. Masanet: For the same amount of energy.
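The doubling arithmetic in this exchange can be sketched in a few lines of Python (an illustration only, not something from the conversation or the underlying studies). It assumes the doubling period stays fixed at 2.6 years, which the discussion itself suggests may not hold as the pace slows; the function name and the baseline of 100 units are just the illustrative figures used above.

```python
def compute_per_unit_energy(baseline, years, doubling_years=2.6):
    """Computations per unit of energy after `years`, assuming a constant doubling period."""
    return baseline * 2 ** (years / doubling_years)


# Baseline of 100 units of compute per unit of energy, as in the example above.
after_one_period = compute_per_unit_energy(100.0, 2.6)

# Equivalent year-over-year improvement implied by a 2.6-year doubling time.
annual_gain = 2 ** (1 / 2.6) - 1

print(f"after 2.6 years: {after_one_period:.0f} units per unit of energy")
print(f"implied annual gain: {annual_gain:.1%}")
```

Under that assumption, a 2.6-year doubling time works out to an efficiency gain of a little over 30% per year.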
Raymond: For the same energy. All right. Thank you. That helps me pick [inaudible 00:17:35]. Whatever compute measure I have, I’ll get twice that much in 2.6 years for the same amount of energy output. But because of improving, and that’s back to your point of it appears that our ability to be efficient or achieve those efficiencies is slowing down a bit. But I’m still every 2.6 years being able to get twice as much compute power for the same unit of energy. Okay. That helps me understand. All right, very, very good. Okay. Back to this concept of an alarm bell, and can you give me just a little bit of a sense and I think it was mentioned in the article when I read it. If you think about the last decade, 2010 to 2020, the amount of compute, you know…so let’s use the same thing, the amount of compute power globally compared to the amount of energy globally, what’s been the rate of rise of those two numbers over the last decade? And I think if we understand that looking backwards, then that’ll tell us why you guys are considering, as you’ve looked at the data trending, the other, you know, trending a little bit more quickly than it has over the last decade why we ought to be thinking about this. What did it look like over the last decade, those two numbers?
Dr. Masanet: So, those two numbers. So, just to clarify, the two numbers are the amount of compute globally and the energy inputs into the data centers.
Raymond: Right. And at a super macro level, which I ultimately think is sort of the spirit of the study, how do those two numbers look compared to each other?
Dr. Masanet: Yeah. That’s a really good question and I should be careful, I need to frame this quite carefully. So, in our study, we looked at the period 2010 to 2018, so it’s not quite a decade.
Raymond: Okay. Got you.
Dr. Masanet: And the reason we did that is it takes a few years for data to appear and the most robust assessments are always retrospective because, you know, the technology in the IT sector and frankly, data center operations change quite quickly. So, it’s typically most credible to look back when you have reasonable data on the past to weigh in on these trends. So, let me take the first number, which is the amount of compute globally. So, we don’t know that number precisely. What we had to use in our study as a reasonable proxy was the number of compute instances. And this was defined by Cisco. They have a report which they’ve been publishing each year for roughly the last decade, the Global Cloud Index (GCI) report, where they estimate the number of workloads and compute instances running in the world’s data centers. But we don’t necessarily know the computational intensity of each of those. We have to take an average. But if we use that value as a proxy, the number of compute instances has gone up by more than sixfold over the period 2010 to 2018. So, a sixfold…
Raymond: Sixfold, 600%?
Dr. Masanet: Sixfold. Yes. Yeah.
Dr. Masanet: Factor of six. Over that same time period, our best estimates suggest that the energy used by data centers has only risen by about 6%. So, we calculated in the paper that this equates to an energy intensity reduction: if we take the amount of energy in the numerator and the number of compute instances in the denominator, so, energy per compute instance, that has gone down by around 20% every year. And that pace of efficiency improvement is far greater than we can see in any other sector of the energy system, global industry, global aviation, global transport sector. And we wanted to point out that even though the data center industry sometimes gets beaten up a bit about its energy use or about its contributions to climate change, that is really a remarkable efficiency improvement and better than nearly any other sector for which we have data.
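A back-of-the-envelope check of that figure can be written as a short Python sketch (an illustration using only the round numbers quoted in the conversation, not the paper's actual model or datasets). It treats the growth in compute instances as exactly sixfold, although the stated figure is "more than sixfold", and assumes a constant yearly rate of change over the eight years from 2010 to 2018.

```python
instances_growth = 6.0  # compute instances in 2018 relative to 2010 (rounded down to sixfold)
energy_growth = 1.06    # data center energy use in 2018 relative to 2010 (about 6% higher)
years = 8               # 2010 to 2018

# Energy per compute instance in 2018 relative to 2010.
intensity_ratio = energy_growth / instances_growth

# Constant yearly rate that compounds to that ratio over the period.
annual_change = intensity_ratio ** (1 / years) - 1

print(f"energy per instance in 2018 vs 2010: {intensity_ratio:.3f}")
print(f"implied change per year: {annual_change:.1%}")
```

With these rounded inputs the implied change is a reduction of a bit under 20% per year, which is consistent with the "around 20% every year" cited above.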
Raymond: Okay. So, we’re on a data center podcast, Eric, so I’m going to ask you to drive this one home. I think what I just heard you say is if you run the math, it’s essentially the data center industry, or the compute footprint on the planet has improved its efficiency by about 20% a year and stacked up against any other industry. That’s fantastic. Is that a simple way of saying what I think I just heard you explain?
Dr. Masanet: I think that’s a good description of it. We couldn’t find any other sector that has improved its efficiency per unit of service provided anywhere near what the data center industry has been able to accomplish over the last decade.
Raymond: And that’s 20% per year, it’s not like over the last decade, the industry has figured out how to get 20% better. It’s delivered that kind of performance on average for several years, for almost the last decade. Is that fair? Because it’s 600% growth in compute footprint with only a 6% growth in electricity consumed. I mean, that’s consistent performance year after year after year. That’s a fair way to say it, right?
Dr. Masanet: That’s a fair way to say it. Yes, 20% reduction in energy intensity per year. Granted the caveat there is that we’re looking at the number of compute instances hosted in global data centers, that’s our proxy, but we felt it was a reasonable proxy to show that the energy required for a unit of service that’s delivered by a data center has been dropping rapidly.
Raymond: Yeah. Well, I think that historically great job on the technology industry, great job by the data center industry, a great job on, hey, we’ve got to manage this proliferation of compute cycles in a wise way and make sure that it doesn’t get out of balance from an energy consumption standpoint, but also just what an incredible tool to solve so many other problems, right? I mean, there’s all these incredible ancillary benefits and as a consumer, right, I think of the easy ones, like, you know, Uber has changed my ability to get around, especially when I’m out of town or my kids love getting food delivered to their…you know, just the convenience of Uber Eats or Postmates or Grubhub, those are simple consumer understandings of how technology helps us. But I mean, there’s an incredible set of benefits that come along with this increased energy use that I think helps tilt the scale towards technology being an overall benefit. And I’m not asking you to weigh in necessarily from a global energy footprint but from what you guys studied, hey, it seems like the industry is doing a heck of a job. Is that a fair statement?
Dr. Masanet: I think that’s a fair statement. Now, you know, trying to understand the net benefits of digitalization is notoriously very difficult, partly because we don’t have great data on the entire system, but partly because you have to set up, you know, the calculus for understanding whether streaming is better than, you know, going to the video store, which frankly nobody does anymore, or teleworking is better than commuting. You need to look at a really broad system. You need to look at, on the one hand, I’ve now eliminated a commute, but what did I eliminate? Was it public transit or was it me driving 30 miles each way as a single rider in an SUV? You know, the benefits of eliminating that commute really depend on what’s being eliminated. And then on, you know, the teleworking side, if I’m at home now with my air conditioning on and all the lights and music blaring versus if, you know, I’m just using a little bit of additional energy to connect my computer to the internet, the benefits of any of those shifts really depend on the specific situation and teasing that out has always been difficult for the research community. But in general, what we’re seeing from the literature is that digital services, because we’re moving bits and not physical, you know, stuff, digital services are generally much more environmentally efficient than the physical services that they replace.
Raymond: So, you mentioned in the study and you mentioned in our conversation here that you guys are doing the best with the data that’s made available and that you’d love to see the industry make more data about their facilities and about their compute devices available specifically around energy usage, that it can be hard to know exactly, but you think it’s a good proxy. I love that term, using the Cisco GCI as a great proxy. As we think about a large data center and getting all the reports around 50 or 100 or 200-megawatt data center, right, we can get a lot of information out of that one building. As the trend of edge computing begins to pick up pace and where our compute cycle sits starts to distribute, do you think that’s a net positive for energy utilization, a net negative, or is it just a whole nother set of problems around reporting? And maybe I’m teeing up the question too much, but how do you guys as your group think about edge computing changing this equation?
Dr. Masanet: Yeah. Well, edge computing is certainly something that needs a lot more study. So, we think there are three trends that really need to be better understood. One is artificial intelligence, which could require a lot of computational energy intensity or computational intensity. The second is 5G, which could spur lots of new demand for data center services. And the third, as you mentioned, is edge computing moving potentially some workloads and some compute instances much closer to the end-user in smaller data centers. And we really don’t know yet how that’s going to play out from an energy perspective. On the one hand, we can imagine it bringing us, you know, back in time to having many smaller data centers that have you know, perhaps less efficient cooling systems, less ideal you know, infrastructure systems, and so forth. And that could drive energy use up. But if they’re well-managed edge data centers, meaning we’re using the most efficient equipment, it’s run at high capacity utilization, PUEs are kept at their practical minimum, then perhaps, you know, the net effects on energy use will be more minor. But the fact of the matter is we don’t really know and it’s a topic of interest that really needs a lot more exploration from an energy analysis perspective. And it’s really on the agenda, I think for us, my colleagues, but also other data center energy researchers getting a handle on AI, 5G, and the edge, I think is the most important job we have as analysts for weighing in on where data center energy use may go in the near future.
Raymond: Well, Eric, I think those are all big ones that you hear talked about a lot in our industry and certainly are going to have a major impact on not only how we do business and what we engage with from technology perspective, but how they impact energy use, which is certainly right in your wheelhouse. Thank you so much, Eric.
Dr. Masanet: Thank you, Raymond. Really great questions, by the way. It was a lot of fun talking to you.