- AI data centers are overwhelming air cooling as power and heat densities climb
- Liquid cooling becomes essential as server density rises with the growth of AI
- A new hybrid cooling design cuts power and water use, but adoption is hesitant
As AI transforms everything from search engines to logistics, its hidden costs are becoming harder to ignore, particularly in the data center. The power required to run generative AI is pushing infrastructure beyond what traditional air cooling can manage.
To explore the scale of the challenge, I spoke with Daren Shumate, founder of Shumate Engineering, and Stephen Spinazzola, the firm's director of mission critical services.
With decades of experience building major data centers, they are now focused on solving power and cooling for AI. From failing air systems to the promise of new hybrid cooling, they explained why AI is forcing data centers into a new era.
What are the biggest challenges in cooling a data center?
Stephen Spinazzola: The biggest challenges in cooling data centers are power, water and space. High-density IT, such as the data centers that run artificial intelligence, generates immense heat that a conventional air cooling system cannot remove.
Typical cabinet loads have doubled and tripled with the deployment of AI. An air cooling system simply cannot capture the heat generated by the high kW-per-cabinet loads of AI cabinet clusters.
We have run computational fluid dynamics (CFD) modeling in many data center rooms, and with air cooling it shows temperatures above acceptable levels. The airflows we map with CFD show temperatures above 115 degrees F, which can cause servers to shut down.
Water cooling can be done in a smaller space with less power, but it requires a huge amount of water. A recent study determined that a single hyperscale facility would need 1.5 million liters of water per day for cooling and humidification.
These limitations pose major challenges for engineers as they plan the next generation of data centers to support the unprecedented demand we are seeing for AI.
How is AI changing the standard for data center heat rejection?
Stephen Spinazzola: With CFD modeling showing potential server shutdowns under conventional air cooling in AI cabinets, direct liquid cooling (DLC) becomes necessary. AI is typically deployed in clusters of 20-30 cabinets at 40 kW or more per cabinet. That is roughly a fourfold increase in kW per cabinet with the deployment of AI. The difference is staggering.
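To put those figures in perspective, here is a minimal back-of-the-envelope sketch using the cluster size and rack density quoted above; the 10 kW legacy air-cooled baseline is an assumption for comparison, not a number from the interview.

```python
# Rough heat-load arithmetic for one AI cabinet cluster, using the figures
# quoted above; the 10 kW legacy rack is an assumed baseline for comparison.
LEGACY_KW_PER_CABINET = 10   # assumed pre-AI air-cooled rack density
AI_KW_PER_CABINET = 40       # "40 kW or more" per cabinet, per the interview
CABINETS_PER_CLUSTER = 25    # midpoint of the 20-30 cabinet range

cluster_heat_kw = AI_KW_PER_CABINET * CABINETS_PER_CLUSTER
density_increase = AI_KW_PER_CABINET / LEGACY_KW_PER_CABINET

print(f"Heat to reject per cluster: {cluster_heat_kw} kW (~{cluster_heat_kw / 1000:.1f} MW)")
print(f"Increase over the assumed legacy density: {density_increase:.0f}x")
# -> Heat to reject per cluster: 1000 kW (~1.0 MW)
# -> Increase over the assumed legacy density: 4x
```

Roughly a megawatt of heat concentrated in a few dozen cabinets is what pushes the design past what room-level air cooling can capture.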
A typical ChatGPT query uses approximately 10 times more energy than a Google search, and that is just for a basic generative function. More advanced queries require far more power, which has to move through an AI cluster farm to handle large-scale computation across multiple machines.
This changes the way we think about power. These energy demands are pushing the industry toward liquid cooling techniques and away from traditional air cooling.
We're talking a lot about cooling, but what about actually delivering the power?
Daren Shumate: There are two new overarching challenges in powering AI compute: how to move power from UPS output boards to high-density racks, and how to creatively derive high-density UPS power from the utility.
Power delivery to the racks is still accomplished either with branch circuits from distribution PDUs to rack PDUs (plug strips), or with plug-in busway run above the racks, with the rack PDU at each rack connected to the busway. The nuance now is what busway ampacity makes sense with striping and what is commercially available.
Even with plug-in busway available at ampacities up to 1,200 A, power density requires deploying a larger number of separate busway circuits to meet the density and striping requirements. Specific and variable power distribution requirements, such as branch circuit monitoring or distribution preferences, tend to be particular to individual data center end users.
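As a rough illustration of why multiple busway runs are needed, here is a sketch that combines the 1,200 A ampacity mentioned above with the 40 kW racks quoted earlier; the 415 V three-phase supply, unity power factor and 80 percent continuous-loading limit are assumptions.

```python
import math

# Rough busway capacity check. The 1,200 A ampacity and 40 kW racks come from
# the interview; the 415 V three-phase supply, unity power factor and 80%
# continuous-loading limit are assumptions.
BUSWAY_AMPACITY_A = 1200
LINE_VOLTAGE_V = 415
POWER_FACTOR = 1.0
CONTINUOUS_LOAD_FACTOR = 0.8   # typical derating for continuously loaded feeders
RACK_KW = 40

busway_kw = math.sqrt(3) * LINE_VOLTAGE_V * BUSWAY_AMPACITY_A * POWER_FACTOR / 1000
usable_kw = busway_kw * CONTINUOUS_LOAD_FACTOR
racks_per_run = int(usable_kw // RACK_KW)

print(f"Busway rating: {busway_kw:.0f} kW, usable at 80%: {usable_kw:.0f} kW")
print(f"40 kW racks per busway run: {racks_per_run}")
# -> Busway rating: 863 kW, usable at 80%: 690 kW
# -> 40 kW racks per busway run: 17
```

Under those assumptions a single run serves roughly 17 racks, so a 20-30 cabinet AI cluster already spills onto a second run before any A/B redundancy is considered.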
Depending on site constraints, data center power designs may also include medium-voltage (MV) UPS. Driven by voltage-drop concerns, MV UPS resolves the need for very large banks of power feeders, but it also introduces new medium-voltage utilization substations into the program. And once medium voltage is in play, another consideration is the applicability of rotary MV UPS systems versus static MV solutions.
What are the advantages and disadvantages of the different cooling techniques?
Stephen Spinazzola: There are two types of DLC on the market today: immersion cooling and cold plate cooling. Immersion cooling uses large tanks of a non-conductive fluid, with the servers positioned vertically and completely immersed in the liquid.
The heat generated by the servers is transferred to the fluid and then to the building's chilled water system through a closed-loop heat exchanger. Immersion tanks take up less space but require servers configured for this type of cooling.
Cold plate cooling uses a cold plate mounted against the chip stack that transfers energy from the chips to a fluid piped throughout the cabinet. The liquid is then piped to an end-of-row cooling distribution unit (CDU), which transfers the energy to the building's chilled water system.
The CDU contains a heat exchanger to transfer the energy, plus 2N pumps on the secondary side of the heat exchanger to ensure continuous flow to the servers. Cold plate cooling is effective at cooling the servers, but it requires an enormous number of fluid piping connections, each of which must have leak-stop disconnect technology.
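For a sense of the secondary-loop flow a CDU has to sustain, here is a sketch for a single cold-plate cabinet at the 40 kW density quoted earlier; the water-like coolant properties and the 10 °C loop temperature rise are assumptions, since the interview does not specify them.

```python
# Rough secondary-loop flow estimate for one 40 kW cold-plate cabinet.
# The 40 kW figure comes from the interview; the water-like coolant
# properties and the 10 degC loop temperature rise are assumptions.
CABINET_HEAT_W = 40_000           # heat to remove, in watts
SPECIFIC_HEAT_J_PER_KG_K = 4186   # water, approximate
DENSITY_KG_PER_L = 1.0            # water, approximate
DELTA_T_K = 10                    # assumed supply-to-return temperature rise

mass_flow_kg_s = CABINET_HEAT_W / (SPECIFIC_HEAT_J_PER_KG_K * DELTA_T_K)
volume_flow_l_min = mass_flow_kg_s / DENSITY_KG_PER_L * 60

print(f"Required coolant flow: {mass_flow_kg_s:.2f} kg/s (~{volume_flow_l_min:.0f} L/min)")
# -> Required coolant flow: 0.96 kg/s (~57 L/min)
```

Multiplied across a 20-30 cabinet cluster, flows of this order are why the piping and leak-stop disconnect count grows so quickly.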
Air cooling is the proven technique for cooling data centers and has been around for decades; however, it is ineffective for the high-density racks needed in AI data centers. As loads increase, our CFD modeling shows it becomes increasingly difficult to make air cooling work.
You are proposing a different cooling approach. How does it work, and what are the current barriers to adoption?
Stephen Spinazzola: Our patent-pending Hybrid Dry Adiabatic Cooling (HDAC) design solution uniquely provides two cooling-fluid temperatures from a single closed loop, allowing a higher-temperature fluid to cool DLC servers and a lower-temperature fluid for conventional air cooling.
Because HDAC simultaneously uses 90 percent less water than a conventional cooling system and 50 percent less energy than air-cooled equipment, we have been able to hit the power usage effectiveness (PUE) figure of roughly 1.1 that is needed for the kind of hyperscale data center required to process AI. Typical AI data centers run at a PUE of 1.2 to 1.4.
With the lower PUE, HDAC delivers approximately 12 percent more usable IT power from a utility feed of the same size. The economic and environmental benefits are significant. And for a system that offers both, HDAC needs only a "sip of water."
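A minimal sketch of where a gain of that order comes from, taking PUE as total facility power divided by IT power; the 10 MW utility feed and the 1.25 baseline midpoint of the quoted 1.2-1.4 range are assumptions.

```python
# Rough PUE arithmetic: PUE = total facility power / IT power.
# The 1.1 and 1.2-1.4 figures come from the interview; the fixed 10 MW
# utility feed and the 1.25 baseline midpoint are assumptions.
UTILITY_FEED_MW = 10.0
PUE_TYPICAL = 1.25   # midpoint of the quoted 1.2-1.4 range
PUE_HDAC = 1.1

it_power_typical = UTILITY_FEED_MW / PUE_TYPICAL
it_power_hdac = UTILITY_FEED_MW / PUE_HDAC
gain_pct = (it_power_hdac / it_power_typical - 1) * 100

print(f"Usable IT power at PUE {PUE_TYPICAL}: {it_power_typical:.2f} MW")
print(f"Usable IT power at PUE {PUE_HDAC}: {it_power_hdac:.2f} MW")
print(f"Extra IT power from the same feed: ~{gain_pct:.0f}%")
# -> Usable IT power at PUE 1.25: 8.00 MW
# -> Usable IT power at PUE 1.1: 9.09 MW
# -> Extra IT power from the same feed: ~14%
```

With the 1.25 midpoint the gain works out to roughly 14 percent; a baseline toward the lower end of the quoted range lands nearer the 12 percent figure cited above.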
The challenge to adoption is simple: nobody wants to go first.