A Look at 2014

In 2014 we leveraged the significant company, market and customer expansion we achieved in 2013 to focus on strategic partnerships.  Our goal was to significantly increase our global footprint with the considerable resources and vision of these industry leaders.  We have achieved that goal and more.

Together with our long-standing partner NTT Facilities, we continue to add power and agility to complementary data center product lines managed by NTT in pan-Asia deployments.  In partnership with Schneider Electric, we are proud to announce the integration of Vigilent dynamic cooling management technology into the Cooling Optimize module of Schneider Electric’s industry-leading DCIM suite, StruxureWare for Data Centers.

Beyond the technical StruxureWare integration, Vigilent has also worked closely with Schneider Electric to train hundreds of Schneider Electric sales and field operations professionals in preparation for the worldwide roll-out of Cooling Optimize.  Schneider Electric’s faith in us has proven well-founded, with deployments already underway across multiple continents.  With the reach of Schneider Electric’s global sales and marketing operations, their self-described “Big Green Machine,” and NTT Facilities’ expanding traction in and outside of Japan, we anticipate a banner year.

As an early adopter of machine learning, Vigilent has been recognized as a pioneer of the Internet of Things (IoT) for energy.  Data collected over seven years from hundreds of deployments continually informs and improves Vigilent system performance.  The analytics we have developed provide unprecedented visibility into data center operations and are driving the introduction of new Vigilent capabilities.

Business success aside, our positive impact on the world continues to grow.  In late 2014, we announced that Vigilent systems have reduced energy consumption by more than half a billion kilowatt hours and eliminated more than 351,000 tons of CO2 emissions.  These savings persist year after year and grow with each new deployment.

We are proud to see our customers turn pilot projects into multiple deployments as the energy savings and data center operational benefits of the system prove themselves over and over again.  This organic growth is testimony to the consistency of the Vigilent product’s operation in widely varying mission critical environments.

Stay tuned to watch this process repeat itself as we add new Fortune 50 logos to our customer base in 2015.  We applaud the growing sophistication of the data center industry as it struggles with the dual challenges of explosive growth and environmental stewardship and remain thankful for our part in that process.

 

Data Center Capacity Planning – Why Keep Guessing?

Capacity management involves decisions about space, power, and cooling.

Space is the easiest. You can assess it by inspection.

Power is also fairly easy. The capacity of a circuit is knowable. It never changes. The load on a circuit is easy to measure.

Cooling is the hardest. The capacity of cooling equipment changes with time. Capacity depends on how the equipment is operated, and it degrades over time. Even harder is the fact that cooling is distributed. Heat and air follow the paths of least resistance and don’t always go where you would expect. For these reasons and more, mission-critical facilities are designed for and built with far more cooling capacity than they need. And yet many operators add even more cooling each time there is a move, add, or change to IT equipment, because that’s been a safer bet than guessing wrong.

Here is a situation we frequently observe:

Operations will receive frequent requests to add or change IT loads as a normal course of business.  In large or multi-site facilities, these requests may occur daily.  Let’s say that operations receives a request to add 50 kW to a particular room.  Operations will typically add 70 kW of new cooling.

This provisioning is calculated assuming a full load for each server, with the full load being determined from server nameplate data.  In reality, it’s highly unlikely that all cabinets in a room will be fully loaded, and it is equally unlikely that the server will ever require its nameplate power.  And remember, the room was originally designed with excess cooling capacity.  When you add even more cooling to these rooms, you have escalated over-provisioning.  Capital and energy are wasted.

We find that cooling utilization is typically 35 to 40%, which leaves plenty of excess capacity for IT equipment expansions.  We also find that in 5-10% of situations, equipment performance and capacity have degraded to the point where cooling redundancy is compromised.  In these cases, maintenance becomes difficult and there is a greater risk of IT failure due to a thermal event. So, it’s important to know how a room is running before adding cooling.  But it isn’t always easy to tell whether cooling units are performing as designed and specified.
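To make the arithmetic concrete, here is a minimal Python sketch of the headroom check implied above. The function names, the 500 kW of installed capacity, and the 100 kW redundancy reserve are illustrative assumptions, not figures from a real site; only the 50 kW request and the roughly 40% utilization come from the discussion above.

```python
def cooling_headroom_kw(installed_kw, utilization, reserve_kw=0.0):
    """Return the cooling capacity (kW) still available for new IT load.

    installed_kw: measured (not nameplate) cooling capacity of the room
    utilization:  fraction of that capacity currently in use (0.0 to 1.0)
    reserve_kw:   capacity held back for redundancy (hypothetical policy)
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    used_kw = installed_kw * utilization
    return installed_kw - used_kw - reserve_kw


def can_absorb(request_kw, installed_kw, utilization, reserve_kw=0.0):
    """True if the existing cooling can take the new load without new units."""
    return cooling_headroom_kw(installed_kw, utilization, reserve_kw) >= request_kw


# Hypothetical room: 500 kW of measured cooling capacity at 40% utilization,
# with 100 kW held in reserve for redundancy.
headroom = cooling_headroom_kw(500.0, 0.40, reserve_kw=100.0)
print(f"Headroom: {headroom:.0f} kW")
print("50 kW request fits:", can_absorb(50.0, 500.0, 0.40, reserve_kw=100.0))
```

The point of the sketch is that the inputs should be measured capacity and measured utilization. If degraded equipment has eroded the installed figure, the same check will show that the reserve is already compromised.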

How can operations managers make planning decisions that are both more cost-effective and safe?  Analytics.

Analytics using real-time data provides managers with the insight to determine whether or not cooling infrastructure can handle a change or expansion to IT equipment, and to manage these changes while minimizing risk.  Specifically, analytics can quantify actual cooling capacity, expose equipment degradation, and reveal where there is more or less cooling reserve in a room for optimal placement of physical and virtual IT assets.

Consider the following analytics-driven capacity report.  Continually updated by a sensor network, the report displays exactly where capacity is available and where it is not.  With this data alone, you can determine where capacity exists and where you can safely and immediately add IT load with no CapEx investment.  And, in those situations where you do need to add cooling, it will predict with high confidence what you need.

[Figure: Cooling Capacity report]

Yet you can go deeper still.  By pairing the capacity report with a cooling reserve map (below), you can determine where you can safely place additional load in the desired room.  You can also see where you should locate your most critical assets and, when you need that new air conditioner, where you should place it.

[Figure: cooling reserve map]

Using these reports, operations can:

  • avoid the CapEx cost of more cooling every time IT equipment is added;
  • avoid the risk of cooling construction in production data rooms when it is often not needed;
  • avoid the delayed time to revenue from adding cooling to a facility that doesn’t need it.

In addition, analytics used in this way avoids unnecessary energy and maintenance OpEx costs.

Stop guessing and start practicing the art of avoidance with analytics.

 

 

Maintenance is Risky

No real surprise here. Mission-critical facilities that pride themselves on, or are contractually obligated to provide, the “five 9’s” of reliability know that sooner or later they must turn critical cooling equipment off to perform maintenance. And they know that they face risk each time they do so.

This is true even for the newest facilities. The minute a facility is turned up, or IT load is added, things start to change. The minute a brand new cooling unit is deployed, it starts to degrade – however incrementally. And that degree of degradation is different from unit to unit, even when those units are nominally identical.

In a risk and financial performance panel at a recent data center event sponsored by Digital Realty, Dean Nelson, eBay’s Vice President of Global Foundation Services, stated that “touching equipment for maintenance increases Probability of Failure (PoF).” Nelson actively focuses on reducing eBay’s PoF metric throughout the facilities he manages.

Performing maintenance puts most facility managers between the proverbial rock and a hard place. If equipment isn’t maintained, by definition you have a “run to failure” maintenance policy. If you do maintain equipment, you incur risk each time you turn something off. The telecom industry calls this “hands in the network,” which it manages as a significant risk factor.

What if maintenance risks could be mitigated? What if you could predict what would happen to the thermal conditions of a room and, even more specifically, what racks or servers could be affected if you took a particular HVAC unit offline?

This ability is available today. It doesn’t require computational fluid dynamics (CFD) or other complicated tools that rely on physical models. It can be accomplished through data and analytics: specifically, analytics continually updated by real-time data from sensors deployed throughout the data center floor. Gartner Research says that hindsight based on historical data, followed by insight based on current trends, drives foresight.

Using predictive analytics, facility managers can also determine exactly which units to maintain and when, in addition to understanding the potential thermal effect that each maintenance action will have on every location on the data center floor.
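As a rough illustration of the idea, the Python sketch below performs a "what if this unit is offline?" check from data alone. It is not a description of any vendor's algorithm. The influence table, unit and rack names, and the 27 °C inlet limit are all hypothetical; in practice the influence values would be learned from sensor history rather than typed in.

```python
# Hypothetical influence table: degrees C that each rack's inlet temperature
# rose, on average, in past periods when the given cooling unit was off.
INFLUENCE_C = {
    "CRAC-1": {"rack-A": 4.0, "rack-B": 1.5, "rack-C": 0.2},
    "CRAC-2": {"rack-A": 0.5, "rack-B": 3.0, "rack-C": 2.5},
}


def predict_inlet_temps(current_c, unit_off, influence=INFLUENCE_C):
    """Estimate each rack's inlet temperature if `unit_off` is taken offline."""
    rise = influence[unit_off]
    return {rack: temp + rise.get(rack, 0.0) for rack, temp in current_c.items()}


def racks_at_risk(current_c, unit_off, limit_c=27.0, influence=INFLUENCE_C):
    """List racks whose predicted inlet temperature exceeds `limit_c`."""
    predicted = predict_inlet_temps(current_c, unit_off, influence)
    return sorted(rack for rack, temp in predicted.items() if temp > limit_c)


# Current inlet temperatures (hypothetical sensor readings, degrees C).
current = {"rack-A": 24.0, "rack-B": 23.0, "rack-C": 25.0}
for unit in INFLUENCE_C:
    print(unit, "->", racks_at_risk(current, unit))
```

Running the check for every unit before a maintenance window shows which units can be switched off with no rack crossing the limit, and which racks need attention first when that is not possible.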

If this knowledge were easily available, what facility manager wouldn’t choose to take advantage of it before taking a maintenance action? My next blog post will provide a visual example of the analysis facility managers can perform to determine when and where to perform maintenance while simultaneously reducing risk to more critical assets and the floor as a whole.

The Value of Efficiency-Aware Decision Making

My Chevy Volt displays my gas mileage.  In fact, I knew what the mileage performance would be before I bought the car. It was a factor in my purchase choice.

In addition to cars, most large appliances display power use along with ENERGY STAR certification. Residential air conditioners display seasonal energy efficiency ratio (SEER) ratings.   Even large commercial building air conditioners have to meet standard rating conditions for efficiency.

Yet, it is only recently that efficiency ratings have been specified for data center cooling.  The primary reason is that for years, manufacturers of cooling units for mission critical facilities avoided efficiency ratings requirements claiming that, because their products were used for process cooling versus comfort cooling, efficiency standards shouldn’t apply.  Fortunately, ASHRAE took up the charge and updated Standard 90.1 so that equipment covered by ASHRAE Standard 127 is required to meet minimum efficiency standards.  Standard 90.1 has been adopted by the Department of Energy as a federal energy standard and is now referenced by many code authorities.

While useful and certainly progress, the choice enabled by these two standards is just a start.  Certainly new equipment can and should be compared based on energy efficiency ratings.  However, we all know that equipment efficiency will vary considerably through use. It would also be useful to view and compare the operational efficiency of existing equipment in order to evaluate which machines are working well, which should be replaced (using the new equipment efficiency ratings as a baseline of comparison), and how much efficiency, and what return on investment, could be gained through replacement.

Some HVAC manufacturers have taken up this challenge. NTT, for example, provides the coefficient of performance for its computer room air conditioners in real time, viewable on the front panel of each unit and through a communications interface.  We commend them.

The ability to compare initial purchase energy efficiency ratings against actual performance over time for a particular machine gives data center managers the ability not only to track and evaluate that machine’s individual performance durability, but also to compare its performance with that of similar machines.  Mechanisms and procedures can be put in place for maintenance as degradation is spotted.   Inefficient machines can be used less, fixed or phased out.
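The comparison described above can be sketched in a few lines of Python. Coefficient of performance (COP) is the heat removed divided by the electrical power drawn. The function names, unit names, readings, and 15% tolerance below are hypothetical and chosen only to show the shape of the calculation.

```python
def coefficient_of_performance(cooling_kw, electrical_kw):
    """COP = heat removed (kW) divided by electrical power drawn (kW)."""
    if electrical_kw <= 0:
        raise ValueError("electrical_kw must be positive")
    return cooling_kw / electrical_kw


def flag_degraded_units(measurements, rated_cop, tolerance=0.15):
    """Return {unit: (measured_cop, degraded)} for each unit.

    measurements: {unit_name: (cooling_kw, electrical_kw)} from live data
    rated_cop:    {unit_name: COP at purchase}
    tolerance:    allowed fractional shortfall before flagging (hypothetical)
    """
    result = {}
    for unit, (cooling_kw, electrical_kw) in measurements.items():
        cop = coefficient_of_performance(cooling_kw, electrical_kw)
        shortfall = 1.0 - cop / rated_cop[unit]
        result[unit] = (cop, shortfall > tolerance)
    return result


# Hypothetical readings for two nominally identical units rated at COP 3.0.
readings = {"CRAC-1": (60.0, 20.0), "CRAC-2": (50.0, 25.0)}
rated = {"CRAC-1": 3.0, "CRAC-2": 3.0}
for unit, (cop, degraded) in flag_degraded_units(readings, rated).items():
    print(f"{unit}: COP {cop:.2f}, degraded={degraded}")
```

Tracked over time, the same calculation gives each machine a degradation trend that can feed maintenance scheduling or a replacement ROI estimate.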

We challenge mission critical cooling system manufacturers to pull back the veil of secrecy on energy efficiency.  The time for transparency is at hand because this information is knowable.  The combination of smart sensors and analytics technology can already report dynamic machine-to-machine efficiency as this information is required to drive cooling optimization.  The smart decision is for HVAC manufacturers to get out ahead of this data, and use efficiency reporting as a differentiator and means of driving continual improvement.

Just as mandatory EPA mileage ratings and rising gas prices changed consumer buying decisions, and drove car manufacturers to offer cars with better gas mileage, more granular energy performance ratings will improve the efficiency of cooling equipment.  And this benefits all of us.

More Cooling With Less $$

My last post took a look at the maintenance savings possible through more efficient data center/facility cooling management.  You can gain further savings by increasing the capacity of your existing air handling and air conditioning units.  It is even possible to add IT load without buying new air conditioners or, at the least, while deferring those purchases.  Here’s how.

Data centers and buildings have naturally occurring air stratification.  Many facilities deliver cool air from an under floor plenum.  Because air rises as it heats, cooling air is delivered low and moves at low velocity.  Because server racks sit on the floor, they sit in a colder area on average.  The air conditioners, however, draw from higher in the room, capturing the hot air from above and delivering it, once cooled, to the under floor plenum. This vertical stratification creates an opportunity to deliver cooler air to servers and at the same time increase cooling capacity by drawing return air from higher in the room.

However, this isn’t easy to achieve.  The problem is that uncoordinated or decentralized control of air conditioners often causes some of the units to deliver uncooled air into the under floor plenum. There, the mixing of cooled and uncooled air results in higher server inlet air temperatures and, ultimately, lower return-air temperatures, which reduce the capacity of the cooling equipment.
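A simplified Python sketch shows why mixing hurts. It assumes the standard sensible-heat relation (capacity is proportional to airflow times the return-to-supply temperature difference), approximate air properties, and constant airflow. It ignores latent load, and every number in it is hypothetical.

```python
AIR_DENSITY_KG_M3 = 1.2      # approximate, near sea level at room temperature
AIR_CP_KJ_PER_KG_K = 1.005   # approximate specific heat of air


def mixed_temperature_c(streams):
    """Flow-weighted average temperature of (airflow_m3_s, temp_c) streams."""
    total_flow = sum(flow for flow, _ in streams)
    return sum(flow * temp for flow, temp in streams) / total_flow


def sensible_cooling_kw(airflow_m3_s, return_c, supply_c):
    """Sensible cooling delivered by one unit, in kW."""
    delta_t = return_c - supply_c
    return AIR_DENSITY_KG_M3 * AIR_CP_KJ_PER_KG_K * airflow_m3_s * delta_t


# Two units supply 18 C air; a third passes 30 C return air through uncooled.
plenum_c = mixed_temperature_c([(5.0, 18.0), (5.0, 18.0), (5.0, 30.0)])
print(f"Plenum temperature with mixing: {plenum_c:.1f} C")

# The same unit removes less heat when its return air is cooler.
print(f"Capacity at 30 C return: {sensible_cooling_kw(5.0, 30.0, 18.0):.1f} kW")
print(f"Capacity at 26 C return: {sensible_cooling_kw(5.0, 26.0, 18.0):.1f} kW")
```

In this simplified model, one uncooled stream raises the air reaching the servers by several degrees, and every degree lost from the return-air temperature comes directly off the capacity of each working unit.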

A cooling management system can establish a colder profile at the bottom of the rack and make sure that each air conditioner is actually having a cooling effect, versus working ineffectively and actually increasing heat through its operation. An intelligent cooling energy management system dynamically right-sizes air conditioning unit capacity loads, coordinating their combined operation so that all the units deliver cool air and don’t mix hot return air from some units with cold air from other units. This unit-by-unit but combined coordination squeezes the maximum efficiency out of all available units so that, even at full load, inefficiency due to mixing is avoided and significant capacity-improving benefits are gained.

Consider this example.  At one company, a 40,000-square-foot data center appeared to be out of cooling capacity.  After deploying an intelligent energy management system, not only did energy usage drop, but the company was able to increase its data center IT load by 40% without adding air conditioners and, in fact, after decommissioning two existing units. The energy management system also maintained the desired inlet air temperatures under this higher load.

Consider going smarter before moving to an additional equipment purchase decision.  Savings become even larger if you consider avoided maintenance costs for new equipment, and energy reduction through more efficiently balanced capacity loads, year-over-year.