Cooling Tips

Ten Tips For Cooling Your Data Center

Even as data centers grow in size and complexity, there are still relatively simple and straightforward ways to reduce data center energy costs. If you are looking at an overall energy cost reduction plan, it makes sense to start with cooling, which likely accounts for at least 50% of your data center energy spend. Start with the assumption that your data center is over-cooled and consider the following:

Turn Off Redundant Cooling Units. You know you have them; figure out which are truly unnecessary and turn them off. Of course, this can be tricky. See my previous post, Data Center Energy Savings.

Raise Your Temperature Setting. You can stay within ASHRAE limits and likely raise the temperature a degree or two.
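As a back-of-the-envelope illustration of what a setpoint change can be worth: a commonly cited rule of thumb (an assumption here, not a figure from this post) is roughly 4% cooling-energy savings per degree Fahrenheit of setpoint increase. A quick sketch:

```python
# Rough sketch: estimated cooling-energy savings from raising the setpoint.
# The ~4%-per-degree-Fahrenheit figure is a commonly cited rule of thumb,
# not a measurement from this post; treat the result as illustrative only.

def estimated_savings(annual_cooling_kwh, degrees_raised, pct_per_degree=0.04):
    """Return estimated kWh saved; savings compound per degree raised."""
    remaining = annual_cooling_kwh * (1 - pct_per_degree) ** degrees_raised
    return annual_cooling_kwh - remaining

# Example: 1,000,000 kWh/yr of cooling energy, setpoint raised 2 degrees F.
saved = estimated_savings(1_000_000, 2)   # roughly 78,000 kWh/yr
```

Actual savings depend heavily on chiller plant design and climate, so measure before and after any change.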

Turn Off Your Humidity Controls. Most data centers do not really need them; unless yours does, turn them off.

Use Variable Speed Drives. These are one of the biggest energy efficiency opportunities in a data center, but don't run them all at 100%, which defeats their purpose.
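The reason variable speed drives pay off so handsomely is the fan affinity law: fan power scales roughly with the cube of shaft speed. This is standard fan physics rather than a claim from this post, and real-world savings also depend on motor and drive losses, but the cube relationship explains the opportunity:

```python
# Fan affinity law: fan power scales roughly with the cube of shaft speed.
# Illustrative physics, not a measurement from this post; real VFD savings
# also depend on motor/drive losses and system static pressure.

def relative_fan_power(speed_fraction):
    """Power draw relative to full speed, per the cube law."""
    return speed_fraction ** 3

# Two fans at 50% speed move the same total air as one fan at 100%,
# but together draw 2 * 0.5**3 = 25% of the single full-speed fan's power.
two_at_half = 2 * relative_fan_power(0.5)   # 0.25
one_at_full = relative_fan_power(1.0)       # 1.0
```

This is also why running every drive at 100% "defeats their purpose": the cube-law savings only appear when fans actually slow down.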

Use Plug Fans for CRAH Units. They offer roughly twice the efficiency and distribute air more effectively.

Use Economizers.  Take advantage of outside air when you can.

Use An Automated Cooling Management System. Remove the guesswork.

Use Hot and Cold Aisle Arrangements. Don’t blow hot exhaust air from some servers into the inlets of other servers.

Use Containment. Reduce air mixing within a single space.

Remove Obstructions. This sounds simple, but  a poorly placed cart can create a hot spot. Check every day.

Here’s an example of the effect an automated cooling management system can provide.

The first section shows a benchmark of the data center energy consumption prior to automated cooling. The second section shows energy consumption after the automated cooling system was turned on. The third section shows consumption when the system was turned off and manual control was resumed, and the fourth section shows consumption with fully automated control. Notice that the energy savings were nearly completely eroded within a month of returning to manual control, but returned immediately once automatic control was restored.

Occam’s Razor

Data Center Energy Savings

The simplest approach to data center energy savings might suggest that a facility manager’s best option is to turn off a few air conditioners.  And there’s truth to this.  See the graph below, showing before and after energy usage, and the impact of turning off some of the cooling units.

Before & After Energy Management Software Started

But the simplicity suggested here is deceptive.

Which air conditioners?

How many?

How will this truly affect the temperature?

What’s the risk to uptime or ride-through?

While turning things off or down is likely our greatest opportunity for significant, immediate savings, the science behind deciding which device to turn off, and when, is complex and dynamic.

Fortunately, a convergence of new technologies makes these decisions tractable: wireless sensors deliver continuous, real-time, location-specific data, while predictive, adaptive software algorithms account for all immediate and known variables at any given moment to forecast the impact of energy management decisions, taking on/off decision-making to a new level. Now, for the first time, thanks to the latest AI technology, it’s possible to automatically, constantly and dynamically manage cooling resources to reduce average temperatures across a facility and avoid localized hot and cold spots. Simultaneously, overall cooling energy consumption is reduced by intelligently turning down, or off, the right CRACs at the right time. The result is continually optimized cooling with greater assurance that the overall integrity of the data center is preserved.
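To make the idea concrete, here is a deliberately minimal sketch of such a control loop. All names (`control_step`, the `cracs` dictionaries, the target limits) are hypothetical, and a real system would use predictive models rather than this simple threshold logic; the sketch only shows the sense-decide-actuate shape of the problem:

```python
# Minimal sketch of a sensor-driven cooling control loop (hypothetical
# names throughout). A production system would predict future temperatures
# rather than react to thresholds, but the sense/decide/actuate loop is
# the same shape.

TARGET_MAX_F = 80.0   # upper rack-inlet limit (within ASHRAE ranges)
TARGET_MIN_F = 68.0   # below this, the room is being over-cooled

def control_step(inlet_temps_f, cracs):
    """One pass: stage a standby CRAC on if too hot, one off if too cold.

    inlet_temps_f: rack-inlet sensor readings in Fahrenheit.
    cracs: list of {"on": bool} dicts, one per CRAC; mutated in place.
    """
    hottest = max(inlet_temps_f)
    if hottest > TARGET_MAX_F:
        for crac in cracs:              # stage up: start first standby unit
            if not crac["on"]:
                crac["on"] = True
                break
    elif hottest < TARGET_MIN_F:
        running = [c for c in cracs if c["on"]]
        if len(running) > 1:            # never shut off the last unit
            running[-1]["on"] = False
    return cracs
```

Even this toy version captures the key safety property the post emphasizes: units are turned off only when the data says it is safe, and the last running unit is never shut down.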

Bay Area Talks

Data Center Energy Management Presentations

Heads up on a couple of my upcoming presentations in the Bay Area on June 21!

TiE Silicon Valley

The Silicon Valley TiE office is hosting a panel discussion on energy alternatives for data center management.   I’ll join four executives from the industry to discuss and debate:

  • Load management
  • Replacement of existing infrastructure
  • Cooling Management

We’ll discuss trends, tradeoffs and the effects of options in these areas.

For details, click here: http://sv.tie.org/event/sig-energy-energy-management-data-centers

Stop by and say hello!

Environmental Defense Fund Climate Corps Program

The EDF sponsors a terrific program that places MBA fellows at large companies to collect energy data, analyze it, and provide recommendations. I will provide a private webinar for these interns on energy efficiency strategies for data centers.

Avoiding Risk

Avoiding Risk in Data Centers Sometimes Means Counter-Intuitive Thinking

Sound data center risk mitigation practices can also lead to energy cost savings. But sometimes the route there is counter-intuitive.

Always-on, always-cold is still a commonly used strategy for data center cooling operations, and for good reason. This type of operation is fairly easy to implement and monitor, and running all the CRACs all the time logically reduces the risk of downtime should a unit fail. In this operating strategy, the CRACs are run at a low set point: they operate at lower-than-required temperatures to mitigate the risk of hot spots and to add ride-through time in the event of a cooling system failure.

While this seems a logical and prudent practice, if you dig a little deeper, you’ll see that it’s not quite as risk averse as it initially appears. More importantly, it misses a larger opportunity for significant energy cost savings. Let’s examine each practice individually.

Continuous operation of all CRACs, including redundant (backup) units, wears them all out prematurely: increased runtime for any piece of equipment that wears with use naturally shortens its life.

Leveling CRAC runtimes, in which each CRAC is set to run approximately the same number of hours, has the same issue. This practice might extend the time to first failure; however, it also increases the risk of catastrophic failure (i.e., simultaneous failure of all units), since units that wear at the same rate tend to fail at around the same time.

And then there’s the issue of low set point thresholds. The common thinking behind cold operations is that an overall cooler temperature will use the thermal mass of the infrastructure to provide extra reaction time in the event of a cooling system failure. However, when all CRACs operate equally, each runs at a lower (less efficient) utilization, so the discharge air temperature from each CRAC will be higher. Some CRACs, in effect, may not be cooling at all, which means that in a raised-floor data center they are simply blowing return air into the underfloor plenum. Since the largest source of thermal mass in a data center is the slab floor, this “always-on”, low set point approach to CRAC operation may not yield the best utilization of thermal mass.

A “just-needed” operation policy is preferable in terms of both catastrophic risk mitigation and energy efficiency.  In this case, the most efficient CRACs are operated most of the time, and the less efficient CRACs are kept off most of the time – but held in ready standby.   Even when CRACs are nominally the same, there can be significant differences in their cooling efficiency due to manufacturing variability.  These differences, if measured or characterized, can be utilized to further optimize efficiency and mitigate the risk of catastrophic failure.
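A "just-needed" dispatch can be sketched as a simple greedy selection: rank units by measured efficiency and run the best ones until the heat load is covered. The capacities and efficiency figures below are illustrative assumptions, not data from the post:

```python
# Sketch of a "just-needed" dispatch policy: run the most efficient CRACs
# first until the measured heat load is covered; keep the rest off but on
# ready standby. All unit names and numbers below are illustrative.

def select_cracs(cracs, load_kw):
    """cracs: list of (name, capacity_kw, kw_cooling_per_kw_input) tuples.

    Returns the names of units to run, most efficient first, with enough
    combined capacity to cover load_kw. Unlisted units stay on standby.
    """
    ranked = sorted(cracs, key=lambda c: c[2], reverse=True)
    running, covered = [], 0.0
    for name, capacity_kw, _efficiency in ranked:
        if covered >= load_kw:
            break
        running.append(name)
        covered += capacity_kw
    return running

# Nominally identical units often differ in measured efficiency:
units = [("CRAC-1", 100, 3.2), ("CRAC-2", 100, 2.6), ("CRAC-3", 100, 3.0)]
active = select_cracs(units, 180)   # the two most efficient units suffice
```

A real policy would also rotate standby units periodically and enforce redundancy margins, but the core idea is the same: efficiency differences, once measured, determine which units should carry the load.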

Sometimes the obvious, or even most commonly used, cooling strategy isn’t the best strategy, particularly as rising energy costs become more of a concern. An operating strategy that recognizes and anticipates the possibility of “little failures,” while focusing on avoiding catastrophic failure and reducing energy costs, is not only forward-looking but also best practice.