Do the cooling challenges faced by data centers give you cold sweats? Don't fret! Kar-Wing Lau, the CTO of LiquidStack, is here to enlighten us on the power of liquid cooling technology. This episode is a deep dive into the benefits of liquid cooling for data centers and homes. We discuss how two-phase immersion cooling not only reduces energy consumption but also saves up to 90% of space compared to traditional methods. Particularly, high-powered semiconductors like CPUs and AI chips are benefiting immensely from this technology.
Our conversation also shifts towards the increasing need for sustainable data centers. We explore AI-enabled data centers and high-performance computing, discussing how liquid cooling technology can help combat the rising costs and energy consumption. Kar-Wing explains why the world urgently needs more environmentally friendly solutions and how liquid cooling can contribute to this change. Plus, we also touch on the unique issues faced by data centers in Hong Kong and how this technology can leave the world in better shape for future generations. Join us as we bring the future of cooling in the data center industry into sharper focus.
More about Kar-Wing:
25+ years of Operations, Business Process Improvement and IT Consulting experience, leading large and diverse teams. Successfully improved productivity across various industries, with special expertise in IT and Logistics. Pioneered 2-phase Immersion Cooling technology with Hong Kong's most energy-efficient Data Center, revolutionizing High Performance Computing cooling by saving more than 99% on cooling electricity and 87% on space.
0:00:01 - Mehmet
Hello and welcome back to a new episode of the CTO Show with Mehmet. Today I'm very pleased to have with me Kar-Wing, the CTO of LiquidStack. Kar-Wing, the way I like to do it is I leave it to my guests to introduce themselves and let us know more about them and about what they do.
0:00:20 - Kar-Wing
Hi Mehmet, thanks a lot for having me on your show. I'm the CTO of LiquidStack. I originally grew up in Germany, actually, and I founded an IT consulting company back then to optimize business processes, serving major German companies like Lufthansa, Motorola Germany and so on. After that I took up regional and global senior management roles at one of the top logistics companies. And then, in 2012, I co-founded a small company to do Bitcoin mining, actually, and we were trying to figure out how to cool down the Bitcoin mining chips in subtropical Hong Kong with its hot and humid summers. We were such a small company originally.
I touched everything from mechanical engineering, 3D design, cooling, obviously, piping design and pressure-drop simulation, right down to cutting and crimping pipes myself, and the integration of electrical systems and electronics. So I got an insight into almost all technical aspects of our business.
It was very challenging back then, but we pulled it off, and we were the very first company worldwide to deploy large-scale two-phase immersion cooling technology for data center applications. Since 2012, we have four times consecutively deployed the world's largest immersion cooling data centers, with up to 120 megawatts of IT load. Just as a comparison, the Hong Kong Stock Exchange's data center, for example, has an IT load of only eight megawatts, so it's quite large actually. Later we rebranded to LiquidStack, and we continued to research and develop liquid cooling solutions into sustainable and energy-efficient high-performance cooling products. Our company has grown very substantially, with employees in multiple countries. My role as CTO is to lead an organization with research and engineering teams in Asia and the USA. I provide strategic direction to our teams, and consult and advise on technical matters where necessary, and obviously I need to stay up to date with technology to understand where the market is moving and what is in demand.
0:02:33 - Mehmet
That's great, Kar-Wing. I really like the story and what was going on behind it. So can you tell me: when exactly was the moment you realized that traditional data center cooling is not working anymore, and something needed to change?
0:02:58 - Kar-Wing
Right. So maybe a little background on air cooling versus liquid cooling, which provides a bit more understanding of the matter. Air cooling obviously uses air as the medium to transport heat away, but air is very ineffective at transporting heat compared to liquid. And since it's so ineffective, it actually requires much more surface area than a normal chip could provide.
So typically you find very big and bulky heat sinks mounted on top of CPUs and AI accelerators, and some of those heat sinks can now be almost four inches, or 100 millimeters, tall. It's quite astonishing to see. Some of those AI servers have up to eight of these heat sink towers mounted on top; they weigh so much and require so much space, whereas the actual PCB with the chips is only a fraction of the volume. Back then, when we were setting up our Bitcoin mine, we were benchmarking various cooling technologies against each other. For example, we looked at the latest and greatest of air cooling, at DTC, direct-to-chip water cooling with cold plates, as well as single-phase immersion cooling and two-phase immersion cooling. After benchmarking those different technologies against each other, we saw that two-phase immersion cooling was really the most suitable one for our application back then.
And the great advantage of liquid cooling versus air cooling is really the much higher thermal conductivity.
With two-phase immersion cooling, we use dielectric fluids, which means they don't conduct any electricity. And unlike, for example, air cooling, where you have one or a few heat sinks mounted on top of the chips, our electronics are completely immersed in that liquid, so it provides cooling all around, not just in those areas where a heat sink has been mounted on top.

With two-phase we use specific fluids with a very low boiling point, around 50 degrees Celsius, which is about half the boiling temperature of water. The heat transfer from the boiling phase change, from liquid to gas, is so effective at transporting heat away that it can even replace those big bulky heat sinks with very thin copper boiling plates just a few millimeters thick. We have built systems which were able to save as much as 90% in space versus air cooling using heat sinks. Just think about how much space you could save if you extrapolate that to a data center. That really made us realize this is very interesting, potentially not only for our own application but also for data centers.
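For the technically curious, here is a minimal back-of-the-envelope sketch in Python of why a boiling phase change moves heat so much more effectively than blowing air over a heat sink. All property values are assumed textbook round numbers, not LiquidStack's data: the latent heat absorbed by the liquid-to-vapor transition dwarfs what air can carry through sensible heating.

```python
# Sketch: coolant flow needed to remove 700 W from one chip, comparing
# air (sensible heating) vs. a boiling dielectric fluid (latent heat).
# All property values are assumed textbook round numbers.

CHIP_POWER_W = 700.0            # assumed high-power CPU/AI chip

# Air: Q = m_dot * c_p * dT  =>  m_dot = Q / (c_p * dT)
AIR_CP_J_PER_KG_K = 1005.0      # specific heat of air
AIR_DENSITY_KG_M3 = 1.2         # air at roughly sea level
AIR_DELTA_T_K = 15.0            # assumed allowable air temperature rise
air_kg_s = CHIP_POWER_W / (AIR_CP_J_PER_KG_K * AIR_DELTA_T_K)
air_m3_min = air_kg_s / AIR_DENSITY_KG_M3 * 60

# Two-phase: Q = m_dot * h_fg  =>  m_dot = Q / h_fg
# Dielectric immersion fluids boil near 50 C with a latent heat on the
# order of 100 kJ/kg (assumed round number).
FLUID_H_FG_J_PER_KG = 100_000.0
fluid_kg_s = CHIP_POWER_W / FLUID_H_FG_J_PER_KG

print(f"air needed:   {air_kg_s*1000:.0f} g/s (~{air_m3_min:.1f} m^3/min)")
print(f"fluid boiled: {fluid_kg_s*1000:.0f} g/s, with no fans at the chip")
```

Under these assumptions, one chip needs roughly 46 g/s of air (about 2.3 cubic meters per minute) but only about 7 g/s of boiling fluid, which vapor then recondenses and returns passively.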
0:05:51 - Mehmet
And Kar-Wing, when I was researching for this episode, I was wondering: how does this technology specifically address the problems associated with high-power semiconductors?
0:06:11 - Kar-Wing
Right, yeah, that's a great question. Nowadays, some of these high-power semiconductors, some CPUs or AI chips, can consume something like 700 watts of power. I'm not sure, maybe you still remember, way back then there were CPUs drawing maybe 10 watts, 15 watts, 20 watts, right? Compare that to what's demanded as of today. This poses immense challenges for cooling: big bulky heat sinks are needed to cool down these high-powered electronics, and if you put a lot of them together in a server rack, for example, you basically multiply the cooling challenges.
APC, for example, performed a study showing that a server rack with a 20 kilowatt power load would need something like 2,000 cubic feet of air per minute. If you pushed that airflow through an opening one foot wide, that's about 30 centimeters, it would generate wind speeds of about 56 kilometers per hour, for one single server rack. And for those latest AI chips there are, for example, design recommendations for AI clusters that go even up to 35 kilowatts per rack. That's a factor of 1.75x on top of what I've just mentioned. It's really just mind-boggling how inefficient air cooling becomes at such power densities. So that's really what we saw: with liquid cooling you can save a lot of energy, and that's really addressing these kinds of problems coming towards us, especially with AI in general, and generative AI in particular.
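As a rough cross-check of those rack numbers, here is a small Python sketch using a widely used sensible-heat sizing rule for air cooling (CFM ≈ 3.16 × watts / ΔT°F). This is not the APC study itself, and the exact figures depend on the assumed air temperature rise; with an assumed 20 °F rise, a 20 kW rack lands close to the velocity Kar-Wing quotes.

```python
# Rough sanity check on rack airflow (a sketch, not the APC study).
# Common sensible-heat sizing rule for air cooling at sea level:
#   CFM ~= 3.16 * watts / delta_T_F

def rack_airflow_cfm(load_w: float, delta_t_f: float = 20.0) -> float:
    """Approximate airflow (cubic feet per minute) needed to cool a rack."""
    return 3.16 * load_w / delta_t_f

def face_velocity_kmh(cfm: float, opening_ft2: float = 1.0) -> float:
    """Resulting air speed if that flow passes through a given opening."""
    ft_per_min = cfm / opening_ft2
    return ft_per_min * 0.3048 * 60 / 1000   # ft/min -> km/h

for load_kw in (20, 35):                     # today's racks vs. AI racks
    cfm = rack_airflow_cfm(load_kw * 1000)
    print(f"{load_kw} kW rack: ~{cfm:,.0f} CFM, "
          f"~{face_velocity_kmh(cfm):.0f} km/h through a 1 sq ft opening")
```

With these assumptions a 20 kW rack works out to roughly 58 km/h through a one-square-foot opening, and a 35 kW AI rack to around 100 km/h, which is the point of the example: air velocities stop being practical.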
0:07:55 - Mehmet
Yeah, actually, you mentioned the use of energy, but I think you also touched on the space required, right? So it's not just the energy that's consumed, it's also the space required by the traditional approach. Now, a question off the top of my head, maybe a little unusual: we see these cooling technologies in data centers, but have you thought about, because I know, for example, a lot of my friends like gaming, and I see they have these very elaborate cooling systems. Is this something that can also be deployed for home users as well?
0:08:41 - Kar-Wing
I mean, we have, for example, some small systems which we put on display at some of the exhibitions, and indeed we have put some normal gaming mainboards in there as demo samples. In there we have up to four GPUs, I think even five or six GPUs, and they were running some simulations. So it is possible to use it at home as well. But our technology is really targeted more towards data center applications, where you have situations that are really very difficult to handle in terms of cooling down the electronics.
0:09:21 - Mehmet
Now, you mentioned that you rely on this technology but you don't use water. Why is this important?
0:09:36 - Kar-Wing
Well, nowadays data centers, for example, have been continuously trying to save energy, right?
In the beginning they were using only, for example, air cooling, and then something like air conditioning, very simplified. So they use, for example, computer room air conditioners or computer room air handling units, and to operate those energy-efficiently, most of them use evaporative cooling, meaning they have a cooling tower or a water spray to make use of the evaporative cooling effect. What that means is basically like when you sweat: as the sweat, the water, evaporates from the skin, it provides some cooling, right?
And the problem with that is that, with the increased demand for cooling for so many more data centers, the water consumption has increased dramatically. For example, some time ago, I think it was the US NSA, the National Security Agency, they wanted to set up a data center, I think in Utah, and the citizens of Utah were actually protesting against the setup of such a data center because they were concerned about the massive amount of water it would consume. This really highlights that nowadays there are many aspects to sustainable operation of a data center, which is not only about reducing the electricity consumption but, for cooling, the water consumption as well. So we see a very strong push, from regulations for example, towards greener data centers.
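To give a feel for the scale of that water use, here is an illustrative Python sketch based on generic physics, not figures for any specific facility: evaporating one kilogram of water absorbs roughly 2.26 MJ, so a cooling tower that rejects heat mostly by evaporation consumes water in proportion to the IT load.

```python
# Illustrative sketch of evaporative-cooling water use (generic physics,
# not any specific facility). Real towers also lose water to drift and
# blowdown, so treat these as rough lower bounds.

LATENT_HEAT_J_PER_KG = 2.26e6   # water at ~100 C; a bit higher when cooler
HOURS_PER_YEAR = 8760

def evaporation_liters_per_hour(heat_mw: float) -> float:
    """Water evaporated per hour if all heat is rejected by evaporation."""
    kg_per_s = heat_mw * 1e6 / LATENT_HEAT_J_PER_KG
    return kg_per_s * 3600           # 1 kg of water ~= 1 liter

for mw in (1, 20, 100):              # small room up to hyperscale campus
    lph = evaporation_liters_per_hour(mw)
    print(f"{mw:>3} MW: ~{lph:,.0f} L/h, "
          f"~{lph * HOURS_PER_YEAR / 1e6:,.1f} million L/year")
```

Even at one megawatt, this works out to roughly 1,600 liters per hour, which is why citizens near large data center sites worry about water.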
0:11:26 - Mehmet
Yeah. Now, you mentioned AI a couple of times, and of course we know about AI and these high-compute environments. For the audience that maybe is not that technical, can you elaborate a little on the challenges AI poses for traditional data center infrastructure, and, of course, why we would need or require your technology for cooling?
0:11:52 - Kar-Wing
Sure, we've touched upon this a little bit. These high-power chips, especially the AI chips, can draw 700 watts, some even more than that, for one single chip. But that's just the power consumption per chip if you look at the outer packaging size. If you look deeper into the chip itself, you will see that the actual silicon die is much smaller. So you have a very, very high heat flux within a fairly large chip package, meaning that, for example, a 700-watt chip on, say, eight square centimeters of silicon die gives you very highly concentrated power and heat. These chips can already reach about 100 watts per square centimeter, and that's actually about the same heat flux as inside a nuclear reactor.
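The heat-flux arithmetic here is easy to check; below is a two-line Python sketch. The die sizes are assumed round numbers for illustration, since actual die areas vary by product.

```python
# Heat flux = power / die area. Die sizes below are assumed round numbers.
def heat_flux_w_per_cm2(power_w: float, die_area_cm2: float) -> float:
    return power_w / die_area_cm2

print(heat_flux_w_per_cm2(700, 8.0))   # 87.5 W/cm^2 -> approaching 100
print(heat_flux_w_per_cm2(700, 7.0))   # 100.0 W/cm^2 on a smaller die
```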
And, just as a fun fact, most nuclear reactors actually use two-phase immersion cooling as the most effective cooling methodology, and obviously it has to be reliable too. So these are figures for today's AI chips, what's available as of today. But if you look, for example, at the chip roadmaps, what's coming up in the future, they have even higher heat fluxes. So the problem of heat generated by AI applications and similar high-performance computing applications, those issues are not going to go away; they're actually becoming worse. It's really something where we see that there's no other choice to sustain the operation of AI and HPC applications than looking into new forms of cooling.
0:13:36 - Mehmet
Got it. Now, how are you seeing, I mean, where are we going with this? In other words, how must data centers in general evolve to keep supporting this growing demand for low-latency compute and edge AI? You touched on the cooling perspective, but also, because you have to deal with a lot of customers who have these large data centers, what should they do differently to keep supporting this high demand that is coming because of AI and high-performance computing?
0:14:20 - Kar-Wing
In general, it's important for customers to keep in mind whether they are able to implement new technologies to operate their data centers more energy-efficiently. For example, some time ago we created a data center study together with a US-based data center architecture consulting firm: a study of a hypothetical 36 megawatt data center comparing two-phase immersion cooling with air cooling. Some of the key points were, for example, that using two-phase immersion cooling we could save around 32% of the data center space and about 61% of the white space, that's actually where the servers are located, and these and other savings led to an overall saving of about US$123 million, so about a third of the entire cost. So for data center operators, if they are looking into high-performance and AI applications, I think it would be worthwhile to keep an open mind about what kinds of cooling technologies and other technologies could be implemented to make their data centers more efficient.
0:15:42 - Mehmet
Yeah. Now a question, Kar-Wing. I live in Dubai, and with COP28 coming there's a lot of talk about sustainability; we've started to see it everywhere. They have actually built one of the largest data centers here, which is powered by solar power. So for such initiatives, how can LiquidStack add value for countries and companies who want to have sustainability as part of their long-term strategy and, of course, decrease their carbon footprint and so on?
0:16:28 - Kar-Wing
Yeah, that's a great point, thanks for raising it. Obviously, in locations like the UAE, Dubai, right, and as well Hong Kong and Singapore, all with very challenging climates, it's not going to be easy to operate data centers in the future. You see, for example, very low PUE figures, power usage effectiveness figures, in countries with a rather moderate climate, not so hot; but if you move to countries with a very challenging climate, then cooling of course becomes a very big challenge. And nowadays, as you mentioned with the low-latency applications, you are not necessarily able to always have just one monolithic data center in a country. Several countries, for example, have data residency regulations which prescribe that data about their citizens should not cross borders. Or, for example, there is medical data which you want to keep local, close to the hospitals that need it. So the trend we see is that there's certainly more focus on localization: data centers and compute applications that are more localized so they can serve the local users much, much better. But then, as mentioned, you may run into cooling challenges and so on, and this, combined with AI and HPC applications in general, really poses a significant challenge. And, for example, most enterprises and companies don't operate data centers as their core business.
Not too long ago, supercomputers were really something only government organizations and universities had, right? But if you have read, for example, the recent news of Tesla launching a 300 million US dollar AI supercomputer with 10,000 NVIDIA accelerator chips, it's quite obvious that more and more enterprises and companies are seeing the need to use AI, and compute in general, to stay competitive. This combination really drives the demand for looking into a much more sustainable way to set up data centers. You mentioned, for example, data centers completely powered by solar electricity, and that's definitely a great and correct way to approach this problem. But with solar cells it's also a question of energy storage: in the daytime it's all good, of course you have sufficient power for your data center, but at night you need to find a very efficient way to store that energy. And even then, whatever energy generation you spec out for the data center, if it's possible to save on the cooling side, you basically free up more electricity to run more compute instead. You can put, for example, more AI chips in there, more CPUs, instead of spending that electricity on cooling.
Then there are some data center locations where they have power issues. Singapore, for example, has up to 7% of its entire electricity being used by data centers. That's pretty astonishing, right? And they already had something like a moratorium, meaning no new data center approvals were supposed to be given anymore. They've changed that in the meantime, but they have a power cap: data centers can only use a maximum number of megawatts. And there are already voices asking what that means for the future growth of the data center industry in Singapore. And Singapore is not the only example; there are others, like Amsterdam, and other countries and cities where they have similar restrictions. So efficiency is really becoming more and more important: how can you maximize what's available to still allow sufficient growth capacity to stay competitive?
0:20:52 - Mehmet
You know, this is a really interesting point, and, as you mentioned, data residency is one of the biggest challenges. Well, not a challenge exactly; it's set by regulations: data cannot leave some countries, especially if it's personal identification data or medical records. You're completely right. And we saw the hyperscalers starting to build data centers accordingly. A couple of years back, I think, we saw Microsoft planning to build a data center under the sea, and we saw them going to cold countries to build data centers. But, as you mentioned, now I can tell you, here in the Middle East, in the GCC we have six countries, and in almost each country we now have one or two data centers for each of the big hyperscalers. So do you have any collaboration with these hyperscalers, Kar-Wing, to provide your technology to them?
0:22:11 - Kar-Wing
Yeah, of course. In many commercial aspects we usually have confidentiality agreements with our customers, and that also applies to the hyperscalers. What I can say is that we definitely do collaborate with hyperscalers, but I can't name exactly which ones.
0:22:28 - Mehmet
Yeah, sure, that's fair enough, fair enough. Where are you spread today, Kar-Wing, I don't know if you can tell me? I mean, do you have any presence, for example, here in the Middle East, so people can go and find out about your solution? Or is it something that usually stays in the hidden part, that even tech professionals will not see, let's say, face to face, if I can say it this way?
0:22:58 - Kar-Wing
That's an excellent question.
We actually have, for our company size, quite a geographical coverage. As I mentioned, we have a good presence in Asia and in the US, and just very recently we made a hire in the Middle East specifically to support the Middle East region. And, I think in October, there is the GITEX exhibition coming up, and we will have an exhibition booth there. Of course, you're more than welcome to visit our booth, and we will be presenting our technology with a live demo of two-phase immersion cooling. You can even see the system running, operating with all the bubbles while it's boiling. I think it's very interesting to see, and it's a very interesting market for us which we want to focus on.
0:23:57 - Mehmet
Yeah, definitely. So, just for the folks, if you are in the States or in Europe: GITEX is the biggest tech event happening here in the Middle East, and this year it takes place between the 16th and 20th of October. My audience here in the region knows about it, and it would be a chance for you to visit. I think you're going to have a booth, Kar-Wing, you said, so you can visit them and learn more about the technology.
Now, just a couple of final points before we close. From a CTO perspective, because, as I was mentioning, you work on something that people don't sense every day. So, from a CTO perspective, working on something high-impact that few people know about, how does it feel? And I'm asking you this question because some people say, yeah, we want to change the world, we want to do something, but we also want to be touching the lives of people on a day-to-day basis. For you, actually, you are helping people, but you are not too much on the front end, I would say. So, from a CTO perspective, how do you describe this experience?
0:25:23 - Kar-Wing
Well, I'm not too keen on having a very big profile as an inventor. I really believe that doing, and actually taking action, is much more important; if something good comes out of that, of course, it's very much appreciated. So, as I mentioned, right, my second company was founded to do Bitcoin mining.
Back then it was really just about how we could run it as profitably as possible, but it very quickly emerged that, especially in hot and humid Hong Kong, that would only be possible if, at the same time, we also reduced our electricity consumption, especially on the cooling. So very soon after, we wanted to explore whether our technology could also be used for the data center industry. We did some research on the data center industry in Hong Kong, and we saw that the power usage effectiveness, PUE, figures for data centers in Hong Kong in 2012 were at 2.2. Just to explain for the non-data-center-industry audience: a PUE of 2.2 means that out of the entire data center electricity budget, you use only 45% to run your IT. So just imagine you built a huge data center, right, and you want to run servers, but actually only 45% of your electricity runs those servers. More than 50%, more than half, is actually going into your facility: the power distribution, the cooling, the lights and things like that. It was really very shocking when we saw how much energy is actually being wasted.
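For readers who want the PUE arithmetic spelled out, here is a minimal Python sketch. PUE is defined as total facility power divided by IT power, so the IT share of the electricity budget is simply its reciprocal; the 2.2 figure is from the interview, while 1.5 and 1.1 are illustrative comparison points.

```python
# PUE = total facility power / IT equipment power.
def it_share(pue: float) -> float:
    """Fraction of the facility's electricity that actually powers IT."""
    return 1.0 / pue

for pue in (2.2, 1.5, 1.1):   # 2.2 from the interview; others illustrative
    share = it_share(pue)
    print(f"PUE {pue}: {share:.0%} runs IT, "
          f"{1 - share:.0%} goes to cooling, power losses, lights")
```

With PUE 2.2, 1/2.2 ≈ 45% of the electricity reaches the servers, which is exactly the figure Kar-Wing cites.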
Of course, PUE has improved in the meantime, with newer data center technology and cooling in general, but we still see it as a very challenging problem. And we've seen as well that Hong Kong has problems with air quality, which is unfortunately no big surprise considering that about two-thirds of the electricity is still generated by burning fossil fuels. This unfortunately applies to many countries worldwide, not only Hong Kong. And on top of that, Hong Kong's landfills are already almost at capacity. So if we have a possibility, for example, to save on construction materials by making the data center smaller, and to save on cooling electricity by making the cooling technology more efficient, those certainly become very strong motivators, not only for me but for our team. We have a much stronger focus on sustainability in general, and especially now, having a young family myself, a child changes your priorities a lot, right? At the end of the day, we want to leave the world in a better shape for the next generation, not worse.
0:28:12 - Mehmet
Yeah. Also, if you allow me, Kar-Wing, to add something for the folks: if you are not technical at all, and I doubt that you're not, every single application used today is powered by some server running somewhere, maybe at a telco provider or in a cloud somewhere. And it's not only about the cost of the server and the cost of the application; as Kar-Wing was mentioning, there are huge things hidden in the background, like electricity. I was a data center technician at some stage and I know these things very well.
So you have the cooling, and the cooling there, guys, is not like the AC you have at home. The cooling system there needs proper fans to keep the air flowing; you have air coming from under the raised floor and returns coming from above, because, as Kar-Wing was explaining, the chips are very sensitive to heat, and especially with AI, they generate a lot of heat. This is why this technology is very important. Don't just say, yeah, it's just cooling, like the AC; it's not like that, it's something more complicated.
And we are thankful, Kar-Wing, for people like yourself who put in the effort, and for what you do with LiquidStack to make the world a better place, with fewer emissions and a smaller carbon footprint, which is really something. As you mentioned, especially when you have kids and children, you start to think about what we will leave behind. I really appreciate all the valuable information. Is there anything, Kar-Wing, that you wished I had asked you? Did I miss anything you want to add or say before we close?
0:30:02 - Kar-Wing
I think you pretty much touched on every single point that was important, so just to add to what you mentioned about every single action. There are some studies, for example, on sending emails, uploading a video, all these kinds of actions; some of them actually have data centers running and operating in the background, for example to transcode that video into different formats and resolutions so that you can download it. And all of this is, I would say, almost inevitable in the meantime, right? It's not about how we can go back to the Stone Age; we need to live with the technology and try to make the best out of it. And although people, for example, look at generative AI as something like, okay, I can do some chit-chatting, maybe it can do my homework, right, it's not as trivial as that; there are operations out there.
For example, some use gen AI to find markers for cancer and other illnesses, where they can run simulations trying to find, like a key to a lock, a very special protein which exactly matches certain properties to cure an illness. These kinds of applications are very, very exciting, and they require a lot of processing power, so this will not go away. We should leverage the new technologies. But how can we use this technology sustainably? That's where we hope to contribute a little bit, for example on the cooling side.
0:31:41 - Mehmet
Yeah, exactly, charge-up-ity guys. Maybe you see it as a game and you play to generate some text, but it's something more complex than this and each actually time you press the enter button, there's a lot of compute power, which is need schooling as well. That happens, Karwing, thank you very much for being on the show. I'll make sure that I will put the company website in the description notes and for the audience, as usual. You know this is how we end each episode. Thank you for the feedbacks. I'm pretty happy to hear that you are enjoying all the episodes. Keep them coming. If you are interested to be also on the show, don't hesitate to reach out to me. We can arrange for that and, as usual, thank you very much for tuning in today and we'll meet again very soon. Thank you, bye-bye. Excellent.
Transcribed by https://podium.page