Since the first computer rooms of the 1960s, airflow has been an important but often misunderstood component of data center design. With the low-density computing of the past, it did not need to perform very efficiently to do its job. Today, however, with high-performance servers doing many times the work of their predecessors in a much smaller space, data center airflow needs to keep pace.
The first tenet of effective and efficient data center cooling and airflow is to reduce or eliminate mixing. In absolute terms this means: (1) cool air from the cooling unit (CRAC) supply goes only to server intakes, and (2) hot air from the server exhausts goes only to the CRAC returns. The second tenet is to raise CRAC return temperatures as high as possible while maintaining proper server inlet temperatures, which allows the CRAC and heat rejection equipment to run at their highest efficiency. Done properly, these changes will result in reduced operating expenses with safe server temperatures. How, then, does the typical data center manager judge how well their site is performing and make improvements? For the purposes of this article we will discuss the typical raised-floor data center, but the basic concepts are the same for any data center.
Data center layout is the first thing to be examined. The most efficient layout will have racks in a Hot Aisle / Cold Aisle configuration and CRACs able to draw hot exhaust easily down the hot aisles to their returns. This usually means that CRACs should be perpendicular to the rows. However, if a ducted ceiling return plenum is used with egg-crate grilles placed in the hot aisles, the placement of CRACs is not as critical.
Assuming that a Hot Aisle / Cold Aisle configuration exists or can be created, what are the best practices to use? Rows ideally should be eight or more racks long with a full two-tile-wide cold aisle. Rack fronts should be even with the tile edges so that any tile in the cold aisle can be removed or replaced to allow adjustment of cold air volume. Racks in a Hot Aisle / Cold Aisle layout should have full blanking panel coverage; if 100% coverage cannot be achieved, prioritize the tops of cabinets and anywhere hot air intrusion into the cold aisle can be felt. One of the biggest issues with blanking panels is making sure they are replaced when work is done in racks. Racks should be three full tiles away from the CRACs to minimize low flow from tiles that are too close to the CRACs. Where possible, place lighter server loads in racks at the ends of the aisles and higher loads in racks toward the middle of the row. Additionally, load racks from bottom to top, keeping the highest server loads lower in the racks.
Cable cutouts under the racks, usually at the back, should be examined. If they are large, consider closing them off, again to reduce mixing. If they are small, say less than 3” x 3” of actual open space after cable blockage is taken into account, they can generally be ignored unless there are a great number of them. Also inspect areas in the white space for cool air leakage from under the floor; the spaces behind CRACs and under PDUs and UPS equipment are all common leakage points. The underfloor is an air plenum, so only approved fire-retardant materials should be used there. Reducing leakage into white spaces and hot aisles helps reduce mixing, raise CRAC return temperatures, and raise underfloor plenum pressure.
In a typical, well-performing “non-contained” cold aisle, there will be a ~10°F difference between the bottom of a rack and the top of a rack. If this temperature difference is much lower, the rack or row is overcooled; if it is higher, there may not be enough cooling air for the rack or row. You are trying to reduce mixing, so you want to provide only as much cooling as needed. Assuming enough total cooling capacity is available, the type and number of perforated tiles in the cold aisle can be adjusted to get the desired ~10°F difference from bottom to top of rack.
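The bottom-to-top check above can be sketched as a simple diagnostic. The ~10°F target is the article's; the ±3°F tolerance band and the function name are illustrative assumptions, not standards:

```python
def diagnose_cold_aisle(bottom_f: float, top_f: float,
                        target_delta_f: float = 10.0,
                        tolerance_f: float = 3.0) -> str:
    """Classify a non-contained cold aisle by its bottom-to-top rise.

    The ~10 F target comes from the text; the +/-3 F band is an
    assumed tolerance for illustration only.
    """
    delta = top_f - bottom_f
    if delta < target_delta_f - tolerance_f:
        return "overcooled: reduce perforated tile flow"
    if delta > target_delta_f + tolerance_f:
        return "undercooled: add tile flow or check for hot-air recirculation"
    return "balanced"

print(diagnose_cold_aisle(62, 64))   # overcooled
print(diagnose_cold_aisle(63, 74))   # balanced
print(diagnose_cold_aisle(62, 80))   # undercooled
```

In practice this kind of check would be run against sensor readings taken at the bottom and top of each rack face.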
To help understand how mixing affects efficiency, let’s study this desired ~10°F difference from bottom to top of rack. This is actually a real-world example of “controlled mixing”. The 60-65°F air introduced into the cold aisle at floor level from the perforated tiles mixes with the 80-90°F air found near the ceiling. With the proper amount of cooling coming from the perforated tiles, the air mixes and rack-top temperatures are maintained in the safe 75°F region. If we observe lower rack-top temperatures (say 68°F), then we have more cooling than needed and less mixing in front of the racks, but more mixing is taking place above the racks, ultimately lowering the air temperature that goes to the CRAC returns; this reduces cooling efficiency. If we observe higher rack-top temperatures, then we don’t have enough cooling at that rack, and mixing is taking place directly in front of the rack in the form of wrap-around, spill-over, or pull-through of hot exhaust air. This usually does not affect CRAC efficiency but can lead to rack or server overheating.
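The controlled-mixing arithmetic can be checked with a flow-weighted average, assuming (as a simplification) equal air density in both streams. The temperatures are the article's; the flow figures are made up for illustration:

```python
def mixed_temp_f(t_cold_f: float, cfm_cold: float,
                 t_hot_f: float, cfm_hot: float) -> float:
    """Flow-weighted mix temperature of two air streams.

    Treats both streams as having the same density, so CFM can
    stand in for mass flow -- a simplification for illustration.
    """
    return (t_cold_f * cfm_cold + t_hot_f * cfm_hot) / (cfm_cold + cfm_hot)

# Equal parts 65 F tile supply and 85 F ceiling air land right at the
# article's ~75 F safe rack-top region:
print(mixed_temp_f(65, 500, 85, 500))  # 75.0
```

Equivalently, hitting a 75°F rack top with 65°F supply and 85°F ceiling air requires the cold stream to make up half the mix; less tile flow than that and the rack top drifts hotter.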
In a typical operating data center, CRAC cooling supply should be 20-30% higher than the IT load requires. This allows the room to be properly airflow-balanced with good rack temperatures and relatively high CRAC return temperatures. If CRAC cooling supply is too great, there will always be too much overall mixing and lower CRAC cooling efficiency. If there is too little CRAC cooling capacity, there will be persistent hot spots. The goal is to be able to run the data center with a slight oversupply of cooling, but not too much.
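The 20-30% oversupply guideline reduces to a simple ratio check. A minimal sketch, where the band boundaries come from the article and the message strings are invented for illustration:

```python
def supply_assessment(crac_supply_kw: float, it_load_kw: float) -> str:
    """Compare active CRAC cooling supply against IT load.

    The 1.2-1.3 band reflects the article's 20-30% oversupply
    guideline; the wording of each verdict is illustrative.
    """
    ratio = crac_supply_kw / it_load_kw
    if ratio < 1.2:
        return "undersupplied: expect persistent hot spots"
    if ratio > 1.3:
        return "oversupplied: expect excess mixing and lower CRAC efficiency"
    return "within the 20-30% oversupply target"

# 250 kW of running CRAC capacity against a 200 kW IT load (ratio 1.25):
print(supply_assessment(250, 200))
```

Note this counts only CRACs actually running; standby redundancy capacity (discussed later) sits outside the ratio.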
Since we know we need to balance cooling supply to the racks and support the highest-kW racks, it should come as no surprise that a sufficient and predictable underfloor plenum pressure is needed. This means that, depending on data center size and density, an ideal raised floor will have a plenum depth of 16-24”, or even more for very high-density sites. Many data centers, especially older ones, have a shallower underfloor plenum than this. In that case it is critical to keep underfloor obstructions such as chilled water pipes, power cables, data cables, and cable trays to a minimum. Overhead cable trays are helpful for keeping data cabling out of the underfloor. The more underfloor obstructions there are, the less even the pressure distribution, and thus the cooling supply, will be. High-percentage-open grate-style tiles can be used to provide enough cooling air to some racks. Be cautious, however: using too many open grate-style tiles makes maintaining good, even plenum pressure difficult. Where there is a local mismatch of plenum pressure and IT load, underfloor air-mover tiles, which use fans, can be used to ensure cooling air gets where it is needed without running excess CRAC units, which reduces efficiency.
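Why too many open grates erode plenum pressure can be seen with a rough orifice-flow model: each tile passes roughly flow proportional to its open area times the square root of plenum pressure, so for a fixed CRAC supply, adding open area forces the pressure down. Everything here is a simplified sketch; the flow coefficient `k`, the tile areas, and the CFM figure are assumed numbers, not measurements:

```python
import math

def plenum_pressure(total_cfm: float, tile_open_areas_sqft: list[float],
                    k: float = 2000.0) -> float:
    """Solve for the plenum pressure at which orifice-style tile flows
    (Q_i = k * A_i * sqrt(P)) absorb the whole CRAC supply.

    k is an assumed lumped flow coefficient for illustration only.
    """
    total_area = sum(tile_open_areas_sqft)
    return (total_cfm / (k * total_area)) ** 2

def tile_flow(open_area_sqft: float, pressure: float, k: float = 2000.0) -> float:
    """Flow through one tile at the given plenum pressure."""
    return k * open_area_sqft * math.sqrt(pressure)

# Ten 25%-open tiles (about 1 sq ft open area each on a 2x2 ft tile)
# fed by an assumed 10,000 CFM supply...
p_before = plenum_pressure(10_000, [1.0] * 10)
# ...versus the same supply after swapping two tiles for 56%-open grates:
p_after = plenum_pressure(10_000, [1.0] * 8 + [2.24] * 2)

print(p_before > p_after)                                # True
print(tile_flow(1.0, p_after) < tile_flow(1.0, p_before))  # True
```

The second comparison is the practical point: once the grates pull the plenum pressure down, every remaining 25% tile in the room delivers less air than before.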
Containment is becoming a popular strategy for airflow management in data centers. Its attractiveness is that it uses a physical barrier to reduce, or in some cases completely eliminate, mixing, which allows higher CRAC supply and return temperatures. This not only improves normal CRAC efficiency but also potentially extends free-cooling hours, since CRAC supply temperatures can typically be raised 10°F above those of a non-contained data center. Both Hot Aisle Containment and Cold Aisle Containment are possible; efficiency-wise, there is very little difference between them. There is speculation that Hot Aisle Containment will have a longer ride-through in the case of a cooling failure, due to the much larger volume of cold air in the main data center space. This can be true for lower-density installs, but as densities increase this difference diminishes. Cold Aisle Containment can be more susceptible to underfloor pressure differences and may require some type of aisle pressure management to ensure enough cooling under all conditions, such as CRAC failures.
In a new build, or with a ducted ceiling return plenum, Hot Aisle Containment is practical to implement. In an existing site, Cold Aisle Containment is most often easier. In both cases proper attention must be paid to local fire codes prior to installation.
CRAC setup is the last area to look at. CRAC units should ideally be running at high return temperatures and producing cooling at 50% of capacity or more. Idling CRACs dilute the underfloor plenum with warm air and cause mixing under the raised floor, another source of inefficiency. To allow fewer CRACs to run at higher efficiency, a redundancy strategy can be implemented that puts CRAC units in hot standby. This allows them to come online when needed, such as during a nearby CRAC failure, but remain off when not needed.
Every data center is different and has varied airflow challenges and opportunities but the subset of problems in each has known and proven solutions. Airflow management, properly applied, will reap benefits in energy efficiency, IT capacity and server reliability with an attractive ROI.
— Director of Engineering, AdaptivCool