Analytics in Action for Data Center Cooling

When a data center is first designed, everything is tightly controlled. Rack densities are all the same. The layout is precisely planned and very consistent. Power and space constraints are well-understood. The cooling system is modeled – sometimes even with computational fluid dynamics (CFD) – and all of the cooling units operate at the same level.

But the original design is often a short-lived utopia. The reality of most data centers becomes much more complex as business needs and IT requirements change and equipment moves in and out.

As soon as physical infrastructure changes, cooling capacity and redundancy are affected. Given the gap between design and operational reality, many organizations have not had the tools to understand what has changed or degraded, and so cannot make informed decisions about their cooling infrastructure. Traditional DCIM products often focus on space, network, and power. They don't provide detailed, measured data on the cooling system, so decisions about cooling are made without visibility into actual conditions.

Analytics can help. Contrary to prevailing views, analytics don't necessarily take a lot of know-how or data-analysis skill to be extremely helpful in day-to-day operations management. Analytics can be simple and actionable. Consider the following examples of how a daily morning glance at thermal analytics helped two data center managers quickly identify and resolve some otherwise tricky thermal issues.

In our first example, the manager of a legacy, urban colo data center with direct-expansion (DX) CRAC units was asked to determine the right place for some new IT equipment. There were several areas with space and power available, but determining which of these areas had sufficient cooling was more challenging. The manager used a cooling influence map to identify racks cooled by multiple CRACs. He then referenced a cooling capacity report to confirm that more than one of these CRACs had capacity to spare. By using these visual analytics, the manager was able to place the IT equipment in an area with sufficient, and redundant, cooling.
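For readers who like to see the mechanics, the redundancy check in this example boils down to a simple query over two data sets: which CRACs measurably influence each rack, and how much spare capacity each CRAC has. Here is a minimal Python sketch of that check; the rack and CRAC names, figures, and spare-capacity threshold are all hypothetical, and in practice the influence data comes from measured sensor responses rather than hand-entered tables.

    # Minimal sketch: find racks cooled redundantly by CRACs with spare capacity.
    # Influence data and spare-capacity figures are hypothetical; in practice
    # they come from the influence map and the capacity report.
    influence = {
        "rack_A1": ["CRAC_1", "CRAC_2"],
        "rack_A2": ["CRAC_2"],
        "rack_B1": ["CRAC_1", "CRAC_3"],
    }
    spare_kw = {"CRAC_1": 12.0, "CRAC_2": 3.5, "CRAC_3": 9.0}
    MIN_SPARE_KW = 5.0  # assumed threshold for "capacity to spare"

    def redundant_racks(influence, spare_kw, min_spare):
        """Racks cooled by two or more CRACs that each have spare capacity."""
        hits = []
        for rack, cracs in influence.items():
            backed = [c for c in cracs if spare_kw[c] >= min_spare]
            if len(backed) >= 2:
                hits.append((rack, backed))
        return hits

    print(redundant_racks(influence, spare_kw, MIN_SPARE_KW))
    # [('rack_B1', ['CRAC_1', 'CRAC_3'])] -- a safe spot for the new equipment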

In a second facility, a mobile switching center for a major telco, the manager noticed a hot spot on the thermal map and sent a technician to investigate the location. The technician saw that some of the cooling coils had low delta T even though the valves were open, which implied a problem with the hydronics. Upon physical investigation of the area, he discovered that this was caused by trapped air in the coil, so he bled it off. The delta T quickly went from 3 to 8.5 – a capacity increase of more than 65 percent – as displayed on the following graph:

 

[Figure: Delta T]

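The 65 percent figure is easy to sanity-check: at a fixed water flow rate, a coil's sensible capacity is proportional to its delta T (Q = m_dot * cp * dT), so a jump from 3 to 8.5 means the coil had been delivering only about a third of its restored capacity. A quick back-of-envelope, assuming the flow rate stayed constant:

    # Back-of-envelope: at fixed flow, coil capacity scales with delta T
    # (Q = m_dot * cp * dT), so comparing delta Ts compares capacities.
    dt_before, dt_after = 3.0, 8.5
    print(f"Before the fix: {dt_before / dt_after:.0%} of capacity")  # ~35%
    print(f"Recovered: {1 - dt_before / dt_after:.0%}")               # ~65%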
These examples are deceptively simple. But without analytics, the managers could not have identified the exact location of each problem and the cooling units involved as easily, nor had enough information to direct troubleshooting, within the short time allowed for resolving problems in a mission-critical facility.

Analytics typically use the information already available in a properly monitored data center. They complement the experienced intuition of data center personnel with at-a-glance data that helps identify potential issues more quickly and bypasses much of the tedious, blood pressure-raising and time-consuming diagnostic activities of hotspot resolution.

Analytics are not the future. Analytics have arrived. Data centers that aren't taking advantage of them are riskier and more expensive to operate, and place themselves at a competitive disadvantage.

A Look at 2014

In 2014 we leveraged the significant company, market and customer expansion we achieved in 2013 to focus on strategic partnerships.  Our goal was to significantly increase our global footprint with the considerable resources and vision of these industry leaders.  We have achieved that goal and more.

Together with our long-standing partner NTT Facilities, we continue to add power and agility to complementary data center product lines managed by NTT in pan-Asia deployments.  In partnership with Schneider Electric, we are proud to announce the integration of Vigilent dynamic cooling management technology into the Cooling Optimize module of Schneider Electric’s industry-leading DCIM suite, StruxureWare for Data Centers.

Beyond the technical StruxureWare integration, Vigilent has also worked closely with Schneider Electric to train hundreds of Schneider Electric sales and field operations professionals in preparation for the worldwide roll-out of Cooling Optimize. Schneider Electric's faith in us has already proven well-founded, with deployments underway across multiple continents. With the reach of Schneider Electric's global sales and marketing operations, their self-described "Big Green Machine," and NTT Facilities' expanding traction in and outside of Japan, we anticipate a banner year.

As an early adopter of machine learning, Vigilent has been recognized as a pioneer of the Internet of Things (IoT) for energy.  Data collected over seven years from hundreds of deployments continually informs and improves Vigilent system performance.  The analytics we have developed provide unprecedented visibility into data center operations and are driving the introduction of new Vigilent capabilities.

Business success aside, our positive impact on the world continues to grow.  In late 2014, we announced that Vigilent systems have reduced energy consumption by more than half a billion kilowatt hours and eliminated more than 351,000 tons of CO2 emissions.  These figures are persistent and grow with each new deployment.

We are proud to see our customers turn pilot projects into multiple deployments as the energy savings and data center operational benefits of the system prove themselves over and over again.  This organic growth is testimony to the consistency of the Vigilent product’s operation in widely varying mission critical environments.

Stay tuned to watch this process repeat itself as we add new Fortune 50 logos to our customer base in 2015.  We applaud the growing sophistication of the data center industry as it struggles with the dual challenges of explosive growth and environmental stewardship and remain thankful for our part in that process.

 

Data Center Capacity Planning – Why Keep Guessing?

Capacity management involves decisions about space, power, and cooling.

Space is the easiest. You can assess it by inspection.

Power is also fairly easy. The capacity of a circuit is knowable. It never changes. The load on a circuit is easy to measure.

Cooling is the hardest. The capacity of cooling equipment changes with time. Capacity depends on how the equipment is operated, and it degrades over time. Even harder is the fact that cooling is distributed. Heat and air follow the paths of least resistance and don’t always go where you would expect. For these reasons and more, mission-critical facilities are designed for and built with far more cooling capacity than they need. And yet many operators add even more cooling each time there is a move, add, or change to IT equipment, because that’s been a safer bet than guessing wrong.

Here is a situation we frequently observe:

Operations will receive frequent requests to add or change IT loads in the normal course of business. In large or multi-site facilities, these requests may occur daily. Let's say that operations receives a request to add 50 kW to a particular room. Operations will typically add 70 kW of new cooling.

This provisioning is calculated assuming a full load for each server, with full load determined from server nameplate data. In reality, it's highly unlikely that all cabinets in a room will be fully loaded, and equally unlikely that any server will ever draw its nameplate power. And remember, the room was originally designed with excess cooling capacity. Adding even more cooling to these rooms escalates the over-provisioning, and capital and energy are wasted.
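The gap between nameplate-based and measurement-based provisioning is easy to illustrate. A toy calculation, with entirely hypothetical per-server numbers:

    # Toy comparison (hypothetical numbers): cooling provisioned from nameplate
    # ratings versus from measured peak draw, using the 70-for-50 margin above.
    servers = 100
    nameplate_kw = 0.50       # per-server nameplate rating (assumed)
    measured_peak_kw = 0.30   # per-server measured peak draw (assumed)
    margin = 70 / 50          # 70 kW of cooling added per 50 kW requested

    print(f"From nameplate:   {servers * nameplate_kw * margin:.0f} kW")      # 70 kW
    print(f"From measurement: {servers * measured_peak_kw * margin:.0f} kW")  # 42 kW
    # The 28 kW difference is cooling (and capital) that may never be used.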

We find that cooling utilization is typically 35 to 40%, which leaves plenty of excess capacity for IT equipment expansions. We also find that in 5-10% of situations, equipment performance and capacity have degraded to the point where cooling redundancy is compromised. In these cases, maintenance becomes difficult and there is a greater risk of IT failure due to a thermal event. So it's important to know how a room is running before adding cooling. But it isn't always easy to tell whether cooling units are performing as designed and specified.

How can operations managers make more cost-effective – and safer – planning decisions? Analytics.

Analytics using real-time data provide managers with the insight to determine whether cooling infrastructure can handle a change or expansion to IT equipment, and to manage these changes while minimizing risk. Specifically, analytics can quantify actual cooling capacity, expose equipment degradation, and reveal where there is more or less cooling reserve in a room for optimal placement of physical and virtual IT assets.

Consider the following analytics-driven capacity report. Continually updated by a sensor network, the report displays exactly where capacity is available and where it is not. With this data alone, you can determine where capacity exists and where you can safely and immediately add IT load with no CapEx investment. And in those situations where you do need additional cooling, it will predict with high confidence what you need.

[Figure: Cooling Capacity report]
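As a rough sketch of what sits behind a report like this: each cooling unit's delivered duty can be estimated from its airflow and return/supply temperatures, then compared against its rated capacity. The field names, readings, and ratings below are hypothetical.

    # Sketch: roll sensor readings up into a per-CRAC utilization report.
    # Sensible duty = airflow * cp_air * (return - supply); values are made up.
    CP_AIR = 1.006  # kJ/(kg*K), specific heat of air
    rated_kw = {"CRAC_1": 60.0, "CRAC_2": 60.0}
    readings = [
        {"crac": "CRAC_1", "return_c": 29.0, "supply_c": 17.0, "flow_kg_s": 3.0},
        {"crac": "CRAC_2", "return_c": 24.0, "supply_c": 18.0, "flow_kg_s": 3.0},
    ]

    for r in readings:
        duty = r["flow_kg_s"] * CP_AIR * (r["return_c"] - r["supply_c"])
        print(f"{r['crac']}: {duty:.1f} of {rated_kw[r['crac']]:.0f} kW "
              f"({duty / rated_kw[r['crac']]:.0%} utilized)")
    # CRAC_1: 36.2 of 60 kW (60% utilized)
    # CRAC_2: 18.1 of 60 kW (30% utilized)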

Yet you can go deeper still. By pairing the capacity report with a cooling reserve map (below), you can determine where you can safely place additional load in the desired room. You can also see where you should locate your most critical assets and, when you do need that new air conditioner, where you should place it.

[Figure: cooling reserve map]

Using these reports, operations can:

  • avoid the CapEx cost of more cooling every time IT equipment is added;
  • avoid the risk of cooling construction in production data rooms when it is often not needed;
  • avoid the delay to revenue caused by adding cooling to a facility that doesn't need it.

In addition, analytics used in this way avoid unnecessary energy and maintenance OpEx costs.

Stop guessing and start practicing the art of avoidance with analytics.

 

 

Cleantech Evolves

Smart Loading for the Smart Grid – New Directions in Cleantech

I recently participated in a TiE Energy Panel (The Hottest Energy Startups: Companies Changing the Energy Landscape), with colleagues from Primus Power, Power Assure, Mooreland Partners and Gen110.

The panel concurred that the notion of Cleantech – and the investment money that follows it – has shifted from a focus on energy generation to a focus on energy management. To date, this is primarily because the cheaper energy sources hyped in the early Cleantech press haven't materialized. It's hard to compete with heavily subsidized incumbent energy sources, much less build a business around what's perceived as a commodity. There are exceptions, like solar energy development, but other alternative sources have languished financially despite their promise.

The investment shift toward energy management is also a result of emerging efficiency-focused technology. Data Center Infrastructure Management (DCIM) is all about smart management – with an emphasis on energy. Gartner believes that there are some 60+ companies in this space, which is rapidly gaining acceptance as a data center requirement.

This shift is also supported by the convergence of other technology growth areas, such as big data and cloud computing, both of which play well with energy management.   As our increasingly sensor-driven environment creates more and more data – big data – its volume has surpassed the ability of humans to manage it.

And yet the availability of this data (accurate, collected in real time, and carrying the dimensions of time and location) represents real promise. Analysis of this information within individual corporations, and perhaps shared more broadly via the cloud, will reveal continuous options for improving efficiency and will likely point to entirely new means of larger-scale energy optimization through an integrated smart grid.

The days of facility operators running around with temperature guns and clipboards – although still surprisingly common today – are giving way to central computer screens with consolidated, scannable, actionable data.

This is an exciting time.  I’m all for new ideas and the creation of less expensive, less environmentally harmful ways to generate energy.  But as these alternative options evolve, I am equally excited by the strides industry has made for the smarter use of the resources we have.

The wave of next generation energy management is still rising.


More Cooling With Less $$

My last post looked at the maintenance savings possible through more efficient data center/facility cooling management. You can gain further savings by increasing the capacity of your existing air handling/air conditioning units. It is even possible to add IT load without requiring new air conditioners, or at least to defer those purchases. Here's how.

Data centers and buildings have naturally occurring air stratification. Many facilities deliver cool air from an under-floor plenum: the cool air is delivered low and moves about at low velocity, heating and rising as it goes. Because server racks sit on the floor, they sit in the colder region on average. The air conditioners, however, draw from higher in the room, capturing the hot air above and delivering it, once cooled, to the under-floor plenum. This vertical stratification creates an opportunity to deliver cooler air to servers and, at the same time, to increase cooling capacity by drawing return air from higher in the room.
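The capacity effect is straightforward to estimate with the sensible-heat relation Q = m_dot * cp * dT: at the same fan airflow and supply temperature, a warmer return raises the delta T and therefore the capacity. The numbers below are illustrative only.

    # Illustrative: drawing return air from higher in a stratified room raises
    # the return temperature, and capacity scales with (return - supply).
    CP_AIR = 1.006            # kJ/(kg*K), specific heat of air
    flow_kg_s = 3.0           # assumed constant fan airflow
    supply_c = 17.0
    return_low_c, return_high_c = 25.0, 29.0  # return drawn low vs. high

    q_low = flow_kg_s * CP_AIR * (return_low_c - supply_c)
    q_high = flow_kg_s * CP_AIR * (return_high_c - supply_c)
    print(f"{q_low:.1f} kW -> {q_high:.1f} kW "
          f"({q_high / q_low - 1:.0%} more capacity from the same unit)")
    # 24.1 kW -> 36.2 kW (50% more capacity from the same unit)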

However, this isn’t easy to achieve.  The problem is that uncoordinated or decentralized control of air conditioners often causes some of the units to deliver uncooled air into the under floor plenum. There, the mixing of cooled and uncooled air results in higher inlet air temperatures of servers, and ultimately lower return-air temperatures, which reduces the capacity of the cooling equipment.

A cooling management system can establish a colder profile at the bottom of the rack and make sure that each air conditioner is actually having a cooling effect, rather than working ineffectively and actually adding heat through its operation. An intelligent cooling energy management system dynamically right-sizes air conditioning capacity, coordinating the units' combined operation so that all of them deliver cool air and hot return air from some units is never mixed with cold air from others. This unit-by-unit yet coordinated control squeezes the maximum efficiency out of all available units so that, even at full load, capacity lost to mixing is avoided.
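To see why one misbehaving unit hurts the whole room, consider a toy mixing calculation: with equal airflows, the plenum temperature is simply the average of what each unit delivers. The numbers are illustrative.

    # Toy mixing penalty: one of three units delivering uncooled air into the
    # plenum drags the blended supply temperature up. Equal airflows assumed.
    unit_supply_c = [15.0, 15.0, 27.0]   # third unit is not actually cooling
    blended = sum(unit_supply_c) / len(unit_supply_c)
    print(f"Blended plenum supply: {blended:.1f} C (vs. 15.0 C if coordinated)")
    # 19.0 C at the server inlets; the return air then comes back cooler,
    # shrinking the delta T (and capacity) of the units doing real work.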

Consider this example. One company's 40,000 sq. ft. data center appeared to be out of cooling capacity. After deploying an intelligent energy management system, not only did energy usage drop, but the company was able to increase its data center IT load by 40% without adding air conditioners; in fact, it decommissioned two existing units. The energy management system also maintained the proper, desired inlet air temperatures under this higher load.

Consider going smarter before committing to an additional equipment purchase. The savings grow even larger when you factor in the avoided maintenance costs for new equipment and the year-over-year energy reductions from more efficiently balanced capacity loads.