A Look at 2014

In 2014 we leveraged the significant company, market and customer expansion we achieved in 2013 to focus on strategic partnerships. Our goal was to substantially increase our global footprint by drawing on the considerable resources and vision of these industry leaders. We have achieved that goal and more.

Together with our long-standing partner NTT Facilities, we continue to add power and agility to complementary data center product lines managed by NTT in pan-Asia deployments.  In partnership with Schneider Electric, we are proud to announce the integration of Vigilent dynamic cooling management technology into the Cooling Optimize module of Schneider Electric’s industry-leading DCIM suite, StruxureWare for Data Centers.

Beyond the technical StruxureWare integration, Vigilent has also worked closely with Schneider Electric to train hundreds of Schneider Electric sales and field operations professionals in preparation for the worldwide roll-out of Cooling Optimize. Schneider Electric’s faith in us has already proven well-founded, with deployments underway across multiple continents. With the reach of Schneider Electric’s global sales and marketing operations, their self-described “Big Green Machine,” and NTT Facilities’ expanding traction in and outside of Japan, we anticipate a banner year.

As an early adopter of machine learning, Vigilent has been recognized as a pioneer of the Internet of Things (IoT) for energy.  Data collected over seven years from hundreds of deployments continually informs and improves Vigilent system performance.  The analytics we have developed provide unprecedented visibility into data center operations and are driving the introduction of new Vigilent capabilities.

Business success aside, our positive impact on the world continues to grow. In late 2014, we announced that Vigilent systems have reduced energy consumption by more than half a billion kilowatt-hours and eliminated more than 351,000 tons of CO2 emissions. These savings are cumulative and grow with each new deployment.

We are proud to see our customers turn pilot projects into multiple deployments as the energy savings and data center operational benefits of the system prove themselves over and over again.  This organic growth is testimony to the consistency of the Vigilent product’s operation in widely varying mission critical environments.

Stay tuned to watch this process repeat itself as we add new Fortune 50 logos to our customer base in 2015.  We applaud the growing sophistication of the data center industry as it struggles with the dual challenges of explosive growth and environmental stewardship and remain thankful for our part in that process.

 

Predictive Analytics & Data Centers: A Technology Whose Time Has Come

Back in 1993, ASHRAE organized the “Great Energy Predictor Shootout,” a competition designed to evaluate various analytical methods used to predict energy usage in buildings. Five of the top six entries used artificial neural networks. ASHRAE organized a second energy predictor shootout in 1994, and this time the winners included a balance of neural-network and non-linear regression approaches to prediction and machine learning. And yet, as successful as the case studies were, there was little to no adoption of this compelling technology.

Fast forward to 2014, when Google announced its use of machine learning leveraging neural networks to “optimize data center operations and drive…energy use to new lows.” Google uses neural networks to predict power usage effectiveness (PUE) as a function of exogenous variables, such as outdoor temperature, and operating variables, such as pump speed. Microsoft, too, has stepped up to endorse the significance of machine learning for more effective predictive analysis. Joseph Sirosh, corporate vice president at Microsoft, says: “traditional analysis lets you predict the future. Machine learning lets you change the future.” And a recent article advocates the use of predictive analytics for the power industry.
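To make the idea concrete, here is a minimal sketch of that kind of PUE prediction using a small neural network in Python. The readings, variable names and model size are hypothetical, chosen only to show the shape of the approach; this is not Google’s actual model.

```python
# Minimal sketch (not Google's model): predict PUE from a few operating
# and weather variables with a small neural-network regressor.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical historical readings: [outdoor_temp_C, pump_speed_pct, it_load_kw]
X = np.array([
    [18.0, 60.0,  900.0],
    [24.0, 70.0,  950.0],
    [30.0, 85.0, 1000.0],
    [12.0, 55.0,  880.0],
    [27.0, 80.0,  970.0],
    [21.0, 65.0,  920.0],
])
y = np.array([1.12, 1.15, 1.21, 1.10, 1.18, 1.13])  # measured PUE for each row

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0),
)
model.fit(X, y)

# Predict PUE for tomorrow's forecast conditions before committing to them.
print(model.predict([[26.0, 75.0, 960.0]]))
```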

The Vigilent system also embraces this thinking, and uses machine learning as an integral part of its control software. Specifically, Vigilent uses continuous machine learning to ensure that the predictions driving cooling control decisions remain accurate over time, even as conditions change (see my May 2013 blog for more details). Vigilent predictive analysis continually informs the software of the likely result of any particular control decision, which in turn allows the software to extinguish hot spots – and optimize cooling operations within desired parameters as effectively as data center design, layout and physical configuration will allow.
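A rough way to picture predict-then-act control: a learned model estimates the hottest rack inlet temperature for each candidate set of cooling-unit fan speeds, and the controller picks the lowest-energy candidate whose prediction stays below the setpoint. The sketch below is a simplified stand-in with a toy surrogate model, not Vigilent’s actual algorithm.

```python
# Sketch of predict-then-act cooling control (illustrative only).
import itertools

def predict_max_inlet_temp(fan_speeds_pct):
    # Toy surrogate for a learned model: more total airflow -> cooler hot spot.
    return 35.0 - 0.08 * sum(fan_speeds_pct)

def fan_power_kw(fan_speeds_pct):
    # Fan power rises roughly with the cube of fan speed.
    return sum(0.00001 * (s ** 3) for s in fan_speeds_pct)

SETPOINT_C = 27.0                                             # hot-spot limit
candidates = itertools.product([40, 60, 80, 100], repeat=3)   # 3 cooling units

best = None
for speeds in candidates:
    if predict_max_inlet_temp(speeds) <= SETPOINT_C:
        cost = fan_power_kw(speeds)
        if best is None or cost < best[1]:
            best = (speeds, cost)

print("chosen fan speeds:", best[0], "estimated fan power (kW):", round(best[1], 2))
```

Because the prediction step is cheap, a controller like this can re-evaluate its choices continuously as the learned model updates.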

This is where additional analysis tools, such as the Vigilent Influence Map™, become useful. The Influence Map provides a real-time, highly visual display of which cooling units are cooling which parts of the data floor.
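The underlying idea can be sketched simply: if you regress each rack’s temperature against the output of every cooling unit, the resulting coefficient matrix approximates an influence map. The example below is a hypothetical illustration of that idea with synthetic data, not the Vigilent implementation.

```python
# Sketch: estimate a rack-by-cooling-unit "influence" matrix from history.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_samples, n_units, n_racks = 200, 4, 6

# Hypothetical history: cooling-unit airflow (columns) and rack inlet temps.
unit_airflow = rng.uniform(40, 100, size=(n_samples, n_units))
true_influence = rng.uniform(0, 0.05, size=(n_units, n_racks))
rack_temps = 32.0 - unit_airflow @ true_influence \
             + rng.normal(0, 0.2, (n_samples, n_racks))

model = LinearRegression().fit(unit_airflow, rack_temps)

# coef_[rack, unit] is degrees of cooling per unit of airflow; values near
# zero mean that cooling unit has little influence on that rack.
influence = -model.coef_
for rack in range(n_racks):
    strongest = int(np.argmax(influence[rack]))
    print(f"rack {rack}: most influential cooling unit = {strongest}")
```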

As an example, one of our customers saw that he had a hot spot in a particular area that hadn’t been automatically corrected by Vigilent. He reviewed his Vigilent Influence Map and saw that the three cooling units closest to the hot spot had little or no influence on it, while units located much farther away were providing some cooling to the problem area. Armed with this information, he investigated the cooling infrastructure near the hot spot and found that dampers in the supply ductwork from the three closest units were closed. Opening them resolved the hot spot. The Influence Map provided insight that helped an experienced data center professional identify and resolve his problem more quickly and ensure the high reliability of the data center.

Operating a data center without predictive analytics is like driving a car facing backwards.  All you can see is where you’ve been and where you are right now.  Driving a car facing backwards is dangerous.   Why would anyone “drive” their data center in this way?

Predictive analytics are available, proven and endorsed by technology’s most respected organizations.  This is a technology whose time has not only come, but is critical to the reliability of increasingly complex data center operations.

Intelligent Efficiency

Intelligent Efficiency, The Next New Thing.

Greentech Media’s senior editor Stephen Lacey reported that the convergence of the internet and distributed energy is contributing to a new economic paradigm for the 21st century.

Intelligent efficiency is the next new thing enabled by that paradigm, he says, in a special report of the same name. He also notes that this isn’t the “stale, conservation-based energy efficiency Americans often think about.” The new thinking around energy efficiency, he says, is information-driven. It is granular. And it empowers consumers and businesses to turn energy from a cost into an asset.

I couldn’t agree more.

Consider how this contrast in thinking alone generates possibilities for resources that have been hidden or economically unavailable until now.

Conservation-based thinking or, as I think about it in data centers, “efficiency by design or replacement,” is capital intensive.  To date, this thinking has been focused on new construction, physical infrastructure change, or equipment swap-outs.  These efforts are slow and can’t take advantage of operational variations such as the time-varying costs of energy.

Intelligent energy efficiency thinking, on the other hand, leverages newly available information enabled by networked devices and wireless sensors  to make changes primarily through software.  Intelligent energy management is non-disruptive and easier to implement.  It reduces risk by offering greater transparency.   And, most importantly, it is fast.  Obstacles to the speed of implementation – and the welcome results of improved efficiency – have been removed by technology.

Intelligence is the key factor here. You can have an efficient system and an efficient design, but if it isn’t operated effectively, it will still run inefficiently. For example, you may deploy one perfectly efficient machine right next to another perfectly efficient machine, believing that you have installed a state-of-the-art solution. In reality, it’s more likely that these two machines are interacting and fighting with each other – at significant energy cost. You also need to factor in, and be able to track, equipment degradation as well as the risks incurred by equipment swap-outs.

You need the third element – intelligence – working in tandem with efficient equipment to make sure that the whole system works at peak level, and keeps working at peak level, regardless of operating conditions. This information flow must be constant. Even the newest, most perfectly optimized data centers will inevitably change.

Kudos to Greentech Media for this outstanding white paper and for highlighting how this new thinking and the “blending of real-time communications with physical systems” is changing the game for energy efficiency.

Cooling Doesn’t Manage Itself

Of the primary components driving data center operations – IT assets, power, space and cooling – the first three command the lion’s share of attention. Schneider Electric (StruxureWare), Panduit (PIM), ABB (Decathlon), Nlyte, Emerson (Trellis) and others have created superb asset and power tracking systems. Using systems like these, companies can get a good idea of where their assets are located, how to get power to them and even how to manage them optimally under changing conditions.

Far less well understood – and, I would argue, scarcely understood at all – is how to get all of that IT-generated heat out of the data center as efficiently as possible.

Some believe that efficient cooling can be “designed in,” as opposed to operationally managed, and that this is good enough.

On the day a new data center goes live, the cooling will, no doubt, operate superbly. That is, right up until something changes – which could happen the next day, or weeks or months later. Even the most efficiently designed data centers eventually operate inefficiently. At that point, your assets are at risk and you probably won’t even know it. Changes, and the follow-on inefficiencies, are inevitable.

Moreover, efficiency by design only applies to new data centers. The vast majority of data centers operating today are aging, and all of them have accumulated incremental cooling issues over time. IT changes, infrastructure updates, failures – essentially any physical data center change or incident – affect cooling in ways that may not be detected through traditional operations or “walk around” management.

Data center managers must manage their cooling infrastructure as dynamically and closely as they do their IT assets.  The health of the cooling system directly impacts the health of those very same IT assets.

Further, cooling must be managed operationally. Beyond the cost savings of continually optimized efficiency, cooling management systems provide clearer insight into where to add capacity or redundancy, where thermal problems are likely to develop, and where risk is concentrated.

Data centers have grown beyond the point where they can be managed manually. It’s time to stop treating cooling as the red-headed stepchild of the data center. Cooling requires the same attention and the same sophisticated management systems that are in common use for IT assets. There’s no time to lose.

Machine Learning

Why Machine Learning-based DCIM Systems Are Becoming Best Practice.

Here’s a conundrum. While data center IT equipment has a lifespan of about three years, data center cooling equipment endures for about 15 years. In other words, your data center will likely undergo at least five complete IT refreshes within the lifetime of your cooling equipment. In reality, changes happen much more frequently. Racks and servers come and go, floor tiles are moved, maintenance is performed, density is changed based on containment operations – any one of which will affect the ability of the cooling system to work efficiently and effectively.

If nothing is done to re-configure cooling operations as IT changes are made, and this is typically the case, the data center develops hot and cold spots, stranded cooling capacity and wasted energy consumption.  There is also risk with every equipment refresh – particularly if the work is done manually.

There’s a better way. The ubiquitous availability of low-cost sensors, in tandem with the emerging availability of machine learning technology, is leading to the development of new best practices for data center cooling management. Sensor-driven machine learning software enables the impact of IT changes on cooling performance to be anticipated and more safely managed.

Data centers instrumented with sensors gather real-time data that can inform software of minute-by-minute cooling capacity changes. Machine learning software uses this information to understand the influence of each and every cooling unit on each and every rack, in real time, as IT loads change. And when loads or IT infrastructure change, the software re-learns accordingly and updates itself, ensuring that its influence predictions remain accurate. This ability to understand cooling influence at a granular level also enables the software to learn which cooling units are working effectively – and at expected performance levels – and which aren’t.
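One way to picture this continuous re-learning is an incremental regressor that updates its estimates each time a new batch of sensor readings arrives, so its predictions track the floor as it changes. The sketch below uses scikit-learn’s partial_fit interface and a simulated sensor stream as a hedged stand-in for whatever a production system actually does.

```python
# Sketch of continuous re-learning from streaming sensor data (illustrative).
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
model = SGDRegressor(learning_rate="constant", eta0=0.01, random_state=0)

def new_sensor_batch(step, batch_size=32):
    """Hypothetical stream: cooling-unit speeds -> one rack's inlet temp.
    After step 50 the floor layout 'changes' and the relationship shifts."""
    rng = np.random.default_rng(step)
    X = rng.uniform(40, 100, size=(batch_size, 3))
    weights = (np.array([-0.05, -0.02, -0.01]) if step < 50
               else np.array([-0.01, -0.06, -0.02]))
    y = 35.0 + X @ weights + rng.normal(0, 0.2, batch_size)
    return X, y

for step in range(100):
    X, y = new_sensor_batch(step)
    scaler.partial_fit(X)                        # keep feature scaling current
    model.partial_fit(scaler.transform(X), y)    # update the temperature model

# By the end of the loop the model reflects the post-change relationship
# between cooling units and the rack, without any manual retraining.
```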

This understanding also illuminates, in a data-supported way, the need for targeted corrective maintenance. With a clearer understanding and visualization of cooling unit health, operators can justify the right budget to maintain equipment effectively, thereby improving the overall health of the data center and reducing risk.

In one recent experience at a large US data center, machine learning software revealed that 40% of the cooling units were consuming power but not cooling. The data center operator was aware of the problem, but couldn’t convince senior management to expend budget because he could neither quantify the problem nor prove the value of a specific expenditure to resolve it. With new and clear data in hand, the operator was able to identify the failed CRACs and present the appropriate budget required to fix or replace them.
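A simple data check captures the spirit of that diagnosis, assuming each CRAC reports its power draw plus return and supply air temperatures (the column names and thresholds below are hypothetical): a unit drawing meaningful power with almost no temperature drop across it is doing little useful cooling.

```python
# Sketch: flag CRAC units that draw power but deliver little cooling.
import pandas as pd

crac = pd.DataFrame({
    "unit":          ["CRAC-01", "CRAC-02", "CRAC-03", "CRAC-04"],
    "power_kw":      [6.2,        5.9,       0.4,       6.1],
    "return_temp_c": [29.5,       28.8,      26.0,      29.1],
    "supply_temp_c": [18.0,       27.9,      25.8,      17.5],
})

# Temperature drop across each unit: return air minus supply air.
crac["delta_t_c"] = crac["return_temp_c"] - crac["supply_temp_c"]

# Drawing power (fans/compressor on) but with a negligible temperature drop:
suspect = crac[(crac["power_kw"] > 2.0) & (crac["delta_t_c"] < 2.0)]
print(suspect[["unit", "power_kw", "delta_t_c"]])
```

Reports like this give an operator the quantified, unit-by-unit evidence needed to justify a repair or replacement budget.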

This ability to more clearly see the impact of IT changes on cooling equipment enables personnel to keep up with cooling capacity adjustments and, in most cases, eliminates the need for manual control. Reducing these on-the-fly, floor-time corrections also frees up operators to focus on problems that require more creativity and to more effectively manage physical changes such as floor tile adjustments.

There’s no replacement for experience-based human expertise. But why not leverage your staff to do what they do best, and eliminate those tasks which are better served by software control? Data centers using machine learning software are undeniably more efficient and more robust. Operators can more confidently future-proof themselves against inefficiency or adverse capacity impact as conditions change. For these reasons alone, the use of machine learning-based software should be considered an emerging best practice.

Data Center Risk

Surprising Areas of Data Center Risk and How to Proactively Manage Them

Mission critical facilities need a different level of scrutiny and control over cooling management.

It’s no surprise that cooling is critical to the security of these facilities. With requirements for 99.999% uptime – roughly five minutes of downtime per year – and multimillion-dollar facilities at risk, cooling is often the thin blue line between data safety and disaster.

And yet, many mission critical facilities use cooling control systems that were designed for comfort cooling, not for the reliable operation of hugely valuable and sensitive equipment.

When people get warm, they become uncomfortable. When IT equipment overheats, it fails – often with catastrophically expensive results.

In one recent scenario, a 6-minute chiller plant failure resulted in lost revenue and penalties totaling $14 million. In another, the failure of a single CRAC unit caused temperatures in a particular zone to shoot up to over 100 degrees Fahrenheit, resulting in the failure of a storage array.

These failures result from a myriad of complex, and usually unrecognized, risk areas. My recent talk at the i4Energy Seminar series hosted by the California Institute for Energy and Environment (CIEE) exposes some of these hidden risk areas and what you can do about them.

You can watch that talk here: