How better cooling systems can affordably reduce downtime

Data center cooling is critical to preventing downtime.

The ever-increasing cost of downtime combined with the need to achieve energy efficiency is pressuring data center managers to improve their cooling systems. Historically, the solution to rising temperatures was to crank up the cooling capacity. And while this may have helped to stave off downtime resulting from overheating equipment, it certainly wasn't doing the energy budget any favors. 

The question now is how to improve cooling systems to reduce the risk of downtime without breaking the bank.

It's not a matter of cooling capacity

The first and arguably most important thing to understand about data center cooling failures is that they very rarely originate with the computer-room air conditioning (CRAC) units. A CRAC failure certainly has the potential to devastate a data center's environmental conditions, but according to TechTarget contributor Vali Sorell, that's not the main culprit.

"In most cases, the problem isn't one of insufficient capacity, but of poor air flow management," Sorell wrote. 

Specifically, Sorell noted that if cool air bypasses equipment, or never actually reaches certain cabinets or racks, increasing the cooling capacity won't solve the problem. Over time, hot spots develop, server fans have to work harder, and temperatures around the equipment continue to climb. In some cases, these imbalances are exacerbated by other events, such as summer heat waves or high power consumption during peak hours. At one Toronto facility several years ago, a weather-induced power outage coupled with a CRAC failure caused overheating that very nearly resulted in downtime.

The moral of the story is that if your CRAC fails for whatever reason and your facility already has multiple hot spots, the likelihood of heat-induced downtime spikes.
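To make the point about distribution versus capacity concrete, here is a minimal sketch in Python. It assumes you can collect one inlet temperature reading per rack. The function name, the rack IDs, the sample readings, and the 27 °C limit (the upper end of the commonly cited ASHRAE recommended inlet range, used here purely as an assumption) are all illustrative and do not come from the article or from Sorell.

```python
from statistics import mean


def find_hot_spots(inlet_temps_c, limit_c=27.0):
    """Return rack IDs whose inlet temperature exceeds limit_c, hottest first.

    inlet_temps_c: dict mapping a (hypothetical) rack ID to its inlet
    temperature in degrees Celsius.
    limit_c: assumed alert threshold; tune it for your own facility.
    """
    hot = {rack: temp for rack, temp in inlet_temps_c.items() if temp > limit_c}
    return sorted(hot, key=hot.get, reverse=True)


# Made-up sample readings, for illustration only.
readings = {"A1": 22.5, "A2": 24.0, "B1": 29.5, "B2": 31.0}

room_average = mean(readings.values())
hot_racks = find_hot_spots(readings)

print(f"Room average: {room_average:.2f} C")  # Room average: 26.75 C
print(f"Hot spots: {hot_racks}")              # Hot spots: ['B2', 'B1']
```

In this made-up sample the room average sits below the limit while two racks are well above it, which is the situation that adding cooling capacity alone would not reveal or fix.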

Data center uptime is critical to business continuity.

It all comes down to proper airflow management

The best way to improve the distribution of cool air throughout racks and cabinets isn't to increase cooling capacity. It's to make sure that cool air is actually reaching equipment, and that hot air is being contained and directed into return plenums where it can be treated. 

In high-density data centers, this isn't always easy. Racks or even entire cabinets that are farther away from the cool-air source may struggle to maintain zero pressure. Rather than rising into a return plenum, warm air mixes back into the room, where it stagnates around equipment, causing dangerous hot spots. 

The solution is to use intelligent airflow management, and specifically a system that employs pressure sensors capable of "talking" to fans embedded in the containment chamber. These fans can then respond to temperature increases by rotating faster, ensuring that hot air is contained and expelled, even in racks and cabinets that aren't as directly exposed to treated air. As a result, hot spots are eliminated without having to blast the AC.
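The sensor-to-fan feedback described above can be sketched as a simple proportional controller. This is an illustration under stated assumptions, not how any particular product works: the function name, the gain, the speed limits, and the sign convention (positive pressure in the containment chamber relative to the room means hot air is building up) are all hypothetical.

```python
def fan_speed_percent(pressure_pa, target_pa=0.0, gain=10.0,
                      base_pct=40.0, min_pct=20.0, max_pct=100.0):
    """Map a containment pressure reading to a fan speed (percent of max).

    Assumptions (hypothetical, for illustration only):
    - pressure_pa is the containment pressure relative to the room, in pascals.
    - target_pa is the desired differential (zero, i.e. neutral pressure).
    - The fan runs at base_pct when pressure is on target and speeds up by
      gain percent for every pascal above target.
    """
    error = pressure_pa - target_pa
    speed = base_pct + gain * error
    # Clamp to the range the (hypothetical) fan can actually run at.
    return max(min_pct, min(max_pct, speed))


# Made-up pressure readings, in pascals.
for reading in (-5.0, 0.0, 2.5, 10.0):
    print(f"{reading:+.1f} Pa -> {fan_speed_percent(reading):.0f}%")
```

A proportional rule is used here only because it is the simplest controller that shows the idea; a real system might add integral or derivative terms, or combine pressure with temperature readings.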

More importantly, intelligent airflow management reduces the chances of overheating by keeping the entire facility at an ideal temperature at all times. So if something does go wrong, pre-existing hot spots will be one less thing that you have to worry about.