Does Efficiency Matter?

Currently, it seems that lots of things matter more than energy efficiency. Investments in reliability, capacity expansion and revenue protection all receive higher priority in data centers than any investment focusing on cutting operating expenses through greater efficiency.

So does this mean that efficiency really doesn’t matter? Of course efficiency matters. Lawrence Berkeley National Laboratory just issued a data center energy report showing how much efficiency improvements have slowed the growth of the data center industry’s energy consumption, saving a projected 620 billion kWh between 2010 and 2020.

The investment priority disconnect occurs when people view efficiency from the too narrow perspective of cutting back.

Efficiency, in fact, has transformational power – when viewed through a different lens.

Productivity is an area ripe for improvements specifically enabled by IoT and automation. Automation’s impact on productivity often gets downplayed by employees who believe automation is the first step toward job reductions. And sure, this happens. Automation will replace some jobs. But if you have experienced and talented people working on tasks that could be automated, your operational productivity is suffering. Those employees can and should be repurposed for work that’s more valuable. And because most data centers run with very lean staffing, your employees are already working under enormous pressure to keep operations running perfectly and without downtime. Productivity matters here as well. Making sure your employees are working on the right, highest-impact activities generates direct returns in cost, facility reliability and job satisfaction.

Outsourcing is another target. Contracting out maintenance operations has become common practice. Yet how often are third-party services monitored for efficiency? Viewing the before-and-after performance of a room or a piece of equipment following maintenance is telling. These details, in context with operational data, can identify where you are over-spending on maintenance contracts or where dollars can be reallocated for higher benefit.
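To make this concrete, here is a minimal sketch of such a before-and-after comparison, assuming you log each cooling unit’s input power and the heat it removes at regular intervals. The file name, column names and one-week windows are illustrative assumptions, not a reference to any particular product:

```python
# A minimal sketch (assumptions noted above) of comparing one cooling unit's
# performance in the week before and the week after a maintenance visit.
import pandas as pd

log = pd.read_csv("crac_07_log.csv", parse_dates=["timestamp"])  # hourly readings (assumed)
maintenance_date = pd.Timestamp("2016-03-15")

before = log[log.timestamp < maintenance_date].tail(7 * 24)   # week before maintenance
after = log[log.timestamp >= maintenance_date].head(7 * 24)   # week after maintenance

# Efficiency proxy: heat removed per kW of electrical input (a COP-like ratio).
eff_before = (before.heat_removed_kw / before.input_power_kw).mean()
eff_after = (after.heat_removed_kw / after.input_power_kw).mean()

print(f"Efficiency before: {eff_before:.2f}, after: {eff_after:.2f}")
print(f"Change: {100 * (eff_after - eff_before) / eff_before:+.1f}%")
```

If maintenance visits rarely move that ratio, that is useful evidence when the contract comes up for renewal.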

And then there is time. In a 2014 Harvard Business Review article, Bain & Company called time “your scarcest resource,” which makes it a logical target for efficiency improvement.  Here’s an example. Quite often data center staff will automatically add cooling equipment to facilities to support new or additional IT load. A deeper look into the right data often reveals that the facilities can handle the additional load immediately and without new equipment. That data dive can save months of procurement and deployment time, while simultaneously accelerating your time to the revenue generated by the additional IT load.

Every time employees can stop or reduce time spent on a low value activity, they can achieve results in a different area, faster. Conversely, every time you free up employee time for more creative or innovative endeavors, you have an opportunity to capture competitive advantage. According to a report by KPMG as cited by the Silicon Valley Beat, the tech sector is already focused on this concept, leveraging automation and machine learning for new revenue advantages as well as efficiency improvements.

“Tech CEOs see the benefits of digital labor augmenting workforce capabilities,” said Gary Matuszak, global and U.S. chair of KPMG’s Technology, Media and Telecommunications practice.

“The increased automation and machine learning could enable new ways for tech companies to conduct business so they can add customer value, become more efficient and slash costs.”

Investments in efficiency, when viewed through the lens of “cutting back,” will continue to receive low priority. However, efficiency projects focused on productivity or time to revenue will pay off with immediate top-line effect. They will uncover ways to simultaneously increase return on capital, improve workforce productivity, and accelerate new sources of revenue. And that’s where you need to put your money.

Data Center Capacity Planning – Why Keep Guessing?

Capacity management involves decisions about space, power, and cooling.

Space is the easiest. You can assess it by inspection.

Power is also fairly easy. The capacity of a circuit is knowable. It never changes. The load on a circuit is easy to measure.

Cooling is the hardest. The capacity of cooling equipment changes with time. Capacity depends on how the equipment is operated, and it degrades over time. Even harder is the fact that cooling is distributed. Heat and air follow the paths of least resistance and don’t always go where you would expect. For these reasons and more, mission-critical facilities are designed for and built with far more cooling capacity than they need. And yet many operators add even more cooling each time there is a move, add, or change to IT equipment, because that’s been a safer bet than guessing wrong.

Here is a situation we frequently observe:

Operations receives frequent requests to add or change IT loads in the normal course of business.  In large or multi-site facilities, these requests may occur daily.  Let’s say that operations receives a request to add 50 kW to a particular room.  Operations will typically add 70 kW of new cooling.

This provisioning is calculated assuming a full load for each server, with the full load determined from server nameplate data.  In reality, it’s highly unlikely that all cabinets in a room will be fully loaded, and it is equally unlikely that any server will ever draw its nameplate power.  And remember, the room was originally designed with excess cooling capacity.  When you add even more cooling to these rooms, you escalate the over-provisioning.  Capital and energy are wasted.
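To put rough numbers on this, here is a sketch of the comparison for the 50 kW request above. The measured-draw fraction, room capacity and existing heat load are assumed figures for illustration only:

```python
# Illustrative comparison of nameplate-based vs. measurement-based provisioning
# for the 50 kW request described above. All figures below are assumptions.
requested_it_load_kw = 50.0           # nameplate sum of the new IT equipment
typical_cooling_added_kw = 70.0       # what operations often provisions anyway

measured_fraction_of_nameplate = 0.6  # servers rarely draw nameplate power (assumed)
expected_heat_kw = requested_it_load_kw * measured_fraction_of_nameplate

existing_cooling_capacity_kw = 400.0  # room design cooling capacity (assumed)
existing_heat_load_kw = 150.0         # measured heat load today (assumed)
spare_capacity_kw = existing_cooling_capacity_kw - existing_heat_load_kw

print(f"Expected added heat: {expected_heat_kw:.0f} kW")
print(f"Spare cooling capacity already in the room: {spare_capacity_kw:.0f} kW")
if spare_capacity_kw >= expected_heat_kw:
    print(f"No new cooling needed; adding {typical_cooling_added_kw:.0f} kW would be over-provisioning.")
```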

We find that cooling utilization is typically 35 to 40%, which leaves plenty of excess capacity for IT equipment expansions.  We also find that in 5-10% of situations, equipment performance and capacity have degraded to the point where cooling redundancy is compromised.  In these cases, maintenance becomes difficult and there is a greater risk of IT failure due to a thermal event.  So it’s important to know how a room is running before adding cooling.  But it isn’t always easy to tell whether cooling units are performing as designed and specified.

How can operations managers make safer, more cost-effective planning decisions?  Analytics.

Analytics using real-time data provides managers with the insight to determine whether the cooling infrastructure can handle a change or expansion to IT equipment, and to manage these changes while minimizing risk.  Specifically, analytics can quantify actual cooling capacity, expose equipment degradation, and reveal where there is more or less cooling reserve in a room for optimal placement of physical and virtual IT assets.
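As a rough illustration of the kind of per-unit analysis involved, the sketch below computes room-level utilization, firm reserve (with one unit held back for redundancy) and a list of degraded units from hypothetical sensor-derived data. The field names, figures and 20% degradation threshold are assumptions, not a description of any particular product:

```python
# A sketch of per-unit cooling capacity analytics, using invented numbers.
from dataclasses import dataclass

@dataclass
class CoolingUnit:
    name: str
    rated_kw: float      # design (nameplate) cooling capacity
    measured_kw: float   # heat currently being removed, from sensor data
    degradation: float   # fraction of rated capacity lost (0.0 = healthy)

def room_capacity_report(units, redundancy_units=1):
    usable = [u.rated_kw * (1 - u.degradation) for u in units]
    # "Firm" capacity assumes the largest unit(s) could be offline (N+1 style).
    firm_capacity = sum(usable) - sum(sorted(usable)[-redundancy_units:])
    load = sum(u.measured_kw for u in units)
    return {
        "utilization_pct": round(100 * load / sum(usable), 1),
        "firm_reserve_kw": round(firm_capacity - load, 1),
        "degraded_units": [u.name for u in units if u.degradation > 0.2],
    }

units = [CoolingUnit("CRAC-1", 100, 40, 0.05),
         CoolingUnit("CRAC-2", 100, 35, 0.30),
         CoolingUnit("CRAC-3", 100, 30, 0.00)]
print(room_capacity_report(units))
# {'utilization_pct': 39.6, 'firm_reserve_kw': 60.0, 'degraded_units': ['CRAC-2']}
```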

Consider the following analytics-driven capacity report.  Continually updated by a sensor network, the report shows exactly where capacity is available and where it is not, so you can determine where you can safely and immediately add load with no CapEx investment.  And in those situations where you do need to add additional cooling, it will predict with high confidence what you need.

[Figure: Cooling capacity report]

Yet you can go deeper still.  By pairing the capacity report with a cooling reserve map (below), you can determine where you can safely place additional load in the desired room.  You can also see where you should locate your most critical assets and, when you do need that new air conditioner, where you should place it.

[Figure: Cooling reserve map]
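As a simple illustration of how a reserve map can drive placement decisions, the sketch below ranks hypothetical zones by locally available cooling reserve; the zone names, reserve figures and load size are invented for the example:

```python
# Hypothetical cooling reserve map: locally available reserve (kW) per zone,
# as estimated from sensor data. All values are invented for illustration.
reserve_map_kw = {"row A": 12.0, "row B": 3.5, "row C": 22.0, "row D": 8.0}
new_load_kw = 10.0

# Prefer the zone with the most headroom that can absorb the new load.
candidates = sorted(reserve_map_kw.items(), key=lambda kv: kv[1], reverse=True)
placement = next((zone for zone, kw in candidates if kw >= new_load_kw), None)

if placement:
    print(f"Place the {new_load_kw:.0f} kW load in {placement}.")
else:
    print("No zone has enough reserve; plan additional cooling.")
```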

Using these reports, operations can:

  • avoid the CapEx of new cooling every time IT equipment is added;
  • avoid the risk of construction in production data rooms when it is not needed;
  • avoid the delayed time to revenue that comes from adding cooling to a facility that doesn’t need it.

In addition, analytics used in this way avoids unnecessary energy and maintenance OpEx costs.

Stop guessing and start practicing the art of avoidance with analytics.

 

 

Predictive Analytics & Data Centers: A Technology Whose Time Has Come

Back in 1993, ASHRAE organized the “Great Energy Predictor Shootout,” a competition designed to evaluate various analytical methods for predicting energy usage in buildings.  Five of the top six entries used artificial neural networks.  ASHRAE organized a second shootout in 1994, and this time the winners included a balance of neural network and non-linear regression approaches to prediction and machine learning.  And yet, as successful as the case studies were, there was little to no adoption of this compelling technology.

Fast forward to 2014, when Google announced its use of machine learning leveraging neural networks to “optimize data center operations and drive…energy use to new lows.”  Google uses neural networks to predict power usage effectiveness (PUE) as a function of exogenous variables, such as outdoor temperature, and operating variables, such as pump speed.  Microsoft, too, has stepped up to endorse the significance of machine learning for more effective prediction.  Joseph Sirosh, corporate vice president at Microsoft, says:  “Traditional analysis lets you predict the future. Machine learning lets you change the future.”  And this recent article advocates the use of predictive analytics for the power industry.
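To illustrate the general idea only (this is a toy model, not Google’s), the sketch below fits a small neural network to predict PUE from outdoor temperature and pump speed using synthetic data:

```python
# Toy PUE predictor trained on synthetic data; the "true" relationship and
# all constants are invented purely for demonstration.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500
outdoor_temp_c = rng.uniform(5, 35, n)      # exogenous variable
pump_speed_pct = rng.uniform(40, 100, n)    # operating variable
pue = 1.1 + 0.004 * outdoor_temp_c + 0.002 * pump_speed_pct + rng.normal(0, 0.01, n)

X = np.column_stack([outdoor_temp_c, pump_speed_pct])
model = make_pipeline(StandardScaler(),
                      MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000, random_state=0))
model.fit(X, pue)

print("Predicted PUE at 30 C outdoor, 80% pump speed:",
      round(float(model.predict([[30.0, 80.0]])[0]), 3))
```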

The Vigilent system also embraces this thinking and uses machine learning as an integral part of its control software.  Specifically, Vigilent uses continuous machine learning to ensure that the predictions driving cooling control decisions remain accurate over time, even as conditions change (see my May 2013 blog for more details).  Vigilent predictive analysis continually informs the software of the likely result of any particular control decision, which in turn allows the software to extinguish hot spots and optimize cooling operations within desired parameters, to the extent that data center design, layout and physical configuration will allow.
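Conceptually, such a continuous-learning loop looks something like the sketch below. This is an illustration of the idea, not Vigilent’s actual implementation, and the feature names are placeholders:

```python
# Conceptual sketch: keep the model that predicts the thermal effect of a
# control action up to date by learning incrementally from each outcome.
import numpy as np
from sklearn.linear_model import SGDRegressor

model = SGDRegressor()   # supports incremental (online) updates

def record_observation(features, observed_temp_change):
    """Called after a control action, once its measured effect is known."""
    model.partial_fit(np.array([features]), np.array([observed_temp_change]))

def predict_effect(candidate_action_features):
    """Predicted temperature change if this candidate action were taken."""
    return float(model.predict(np.array([candidate_action_features]))[0])

# Placeholder features: [fan speed fraction, return air temperature in C].
record_observation([0.8, 22.5], -1.2)
print(predict_effect([0.9, 22.5]))
```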

This is where additional analysis tools, such as the Vigilent Influence Map™, become useful.  The Influence Map provides a current, real-time and highly visual display of which cooling units are cooling which parts of the data floor.

As an example, one of our customers saw that he had a hot spot in a particular area that hadn’t been automatically corrected by Vigilent.  He reviewed his Vigilent Influence Map and saw that the three cooling units closest to the hot spot had little or no influence on the hot spot.  The Influence Map showed that cooling units located much farther away were providing some cooling to the problem area.  Armed with this information, he investigated the cooling infrastructure near the hot spot and found that dampers in the supply ductwork from the three closest units were closed.  Opening them resolved the hot spot.  The influence map provided insight that helped an experienced data center professional more quickly identify and resolve his problem and ensure high reliability of the data center.
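For readers curious how such an influence relationship might be estimated in principle, here is a conceptual sketch (an illustration of the idea, not the Vigilent algorithm): perturb each cooling unit in turn and record how much each rack sensor responds.

```python
# Conceptual influence estimate: temperature rise at each sensor when each
# unit is briefly turned down. All readings below are invented.
import numpy as np

def influence_matrix(baseline_temps, perturbed_temps):
    """
    baseline_temps: shape (n_sensors,), readings before any perturbation.
    perturbed_temps: shape (n_units, n_sensors), row i holding the readings
        after unit i alone was turned down for a test interval.
    Returns an (n_units, n_sensors) matrix of temperature rises; larger values
    mean that unit has more influence at that sensor's location.
    """
    return np.asarray(perturbed_temps) - np.asarray(baseline_temps)

baseline = np.array([24.0, 23.5, 25.0])        # three rack inlet sensors
perturbed = np.array([[24.2, 23.6, 25.1],      # unit 1 turned down
                      [26.5, 23.7, 25.2],      # unit 2 turned down
                      [24.1, 25.9, 27.0]])     # unit 3 turned down
print(influence_matrix(baseline, perturbed).round(1))
```

In the anecdote above, a map of this kind is what revealed that the three nearest units had little or no effect on the hot spot.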

Operating a data center without predictive analytics is like driving a car facing backwards: all you can see is where you’ve been and where you are right now.  That’s dangerous.  Why would anyone “drive” their data center this way?

Predictive analytics are available, proven and endorsed by technology’s most respected organizations.  This is a technology whose time has not only come; it has become critical to the reliability of increasingly complex data center operations.


A Look at 2013

We grew!

We moved!

We’ve had a heck of a year!

In 2013 alone, we reduced or avoided the generation of more than 85 thousand tons of carbon emissions.

This is a statistic of which I am very, very proud and one that clearly demonstrates the double bottom line impact of the Vigilent solution.

We have directly impacted the planet by reducing energy requirements and CO2 emissions, even as the demands of our digital lifestyles increase.  We have impacted individual quality of life by increasing uptime reliability and contributing to the safety of treasured documents and photos, as well as helping to ensure the uninterrupted transmission of information that makes our world operate.  We are honored and privileged to contribute so directly to the well-being of our world and our customers.

While analysts have cited a DCIM market contraction in 2013, Vigilent has thrived.  We attracted new customers and engendered even deeper loyalty among existing customers, evidenced by organic growth in which one deployment turns into three, then ten, then dozens across the United States as actual energy savings and thermal condition insights are realized.

I am pleased to share some of the milestones we achieved in 2013:

We moved to terrific new facilities in uptown Oakland.  Not only does our new facility (within a literally green building)  provide us with space for in-house product commissioning and expanded R&D,  it provides a vibrant collaborative atmosphere for employees.  The new location is adjacent to public transportation, honoring our commitment to a green corporate culture, and offers dozens of great restaurants, coffee shops and diverse entertainment options for employees.

We grew – in revenues, in customer base, into new markets and in staff.  With growth comes the responsibility to provide more directed leadership in business functions and market focus.  With this in mind, we expanded our executive management team, hiring Dave Hudson to oversee sales and operations worldwide and Alex Fielding to introduce Vigilent to federal markets, and we added many new field engineers, software engineers, QA and support staff.

We expanded our product offering with new functionality, including out-of-the-box reports that help with energy savings, SLA adherence, maintenance and capacity planning.  We continued to refine our trademark intelligence and control functionality, enhancing both usability and energy savings in ever more complex data center environments and achieving an additional 30% savings in some cases.

Ultimately, all of this helps our customers succeed not only in direct bottom line impact, but with large-scale sustainability efforts that are widely recognized.  Avnet used the Vigilent system in corporate sustainability initiatives that garnered the company the Uptime Institute GEIT award, as well as recognition by InfoWorld as a top Green IT award winner.    Our sales partner, NTT Facilities, continues to roll out  Vigilent deployments in Japan.

Our ability to contribute to the Federal Government’s initiative to consolidate data centers and reduce overall energy consumption is significant indeed.  Watch this space.

With a great year behind us, we recognize that there is much to do, as the data center industry – at last – is realizing how significantly data and analytics can improve day to day operations and efficiency endeavors.

The Ponemon Institute, in a study sponsored by Emerson, recently examined data center outages and found that accidental human error remains among the top three cited causes of downtime, and that 52% of survey respondents believe these accidents could have been prevented.

Intelligent software control and analytics will help operators make better,  more informed decisions and reduce such human errors.   These tools will increasingly help data centers proactively avoid trouble, while at the same time helping them diagnose and resolve actual issues more quickly.

This will be the year of analytics for data centers.  Vigilent is equipped and prepared to lead this charge, leveraging years of institutional knowledge gleaned from hundreds of deployments, in every conceivable configuration, in mission-critical facilities on four continents.  This mass of data informs the analytics that drive individual control decisions at every site and, more recently, puts the benefit of this accumulated knowledge into the hands of data center managers for more informed process management.

Happy New Year.

The Value of Efficiency-Aware Decision Making

My Chevy Volt displays my gas mileage.  In fact, I knew what the mileage performance would be before I bought the car. It was a factor in my purchase choice.

In addition to cars, most large appliances display power use along with Energy Star certification.  Residential air conditioners carry seasonal energy efficiency ratios (SEER).  Even large commercial building air conditioners have to meet standard rating conditions for efficiency.

Yet it is only recently that efficiency ratings have been specified for data center cooling.  The primary reason is that, for years, manufacturers of cooling units for mission-critical facilities avoided efficiency rating requirements, claiming that because their products were used for process cooling rather than comfort cooling, efficiency standards shouldn’t apply.  Fortunately, ASHRAE took up the charge and updated Standard 90.1 so that equipment covered by ASHRAE Standard 127 is required to meet minimum efficiency standards.  Standard 90.1 has been adopted by the Department of Energy as a federal energy standard and is now referenced by many code authorities.

While useful and certainly progress, the choice enabled by these two standards is just a start.  Certainly new equipment can and should be compared based on energy efficiency ratings.  However, we all know that equipment efficiency varies considerably through use.  It would also be useful to view and compare the operational efficiency of existing equipment in order to evaluate which machines are working well, which should be replaced (using the new-equipment efficiency ratings as a baseline of comparison), and how much efficiency could be gained through replacement, calculated from an ROI perspective.
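As a back-of-the-envelope illustration of that replace-or-keep comparison, the sketch below weighs a unit’s measured operating efficiency against a new unit’s rating. Every figure here (COP values, load, electricity price, replacement cost) is an assumption for the example:

```python
# Rough ROI comparison: existing unit's measured efficiency vs. a new unit's
# rated efficiency. All figures are illustrative assumptions.
hours_per_year = 8760
avg_cooling_load_kw = 60.0        # average heat the unit removes (assumed)
measured_cop_existing = 2.2       # observed coefficient of performance (assumed)
rated_cop_new = 3.6               # new-equipment rating baseline (assumed)
electricity_price = 0.10          # $ per kWh (assumed)
replacement_cost = 45_000         # installed cost of the new unit (assumed)

annual_kwh_existing = avg_cooling_load_kw / measured_cop_existing * hours_per_year
annual_kwh_new = avg_cooling_load_kw / rated_cop_new * hours_per_year
annual_savings = (annual_kwh_existing - annual_kwh_new) * electricity_price

print(f"Annual energy savings: ${annual_savings:,.0f}")
print(f"Simple payback: {replacement_cost / annual_savings:.1f} years")
```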

Some HVAC manufacturers have taken up this challenge. NTT, for example, provides the coefficient of performance for its computer room air conditioners in real time, viewable on the front panel of each unit and through a communications interface.  We commend them.

The ability to compare initial purchase energy efficiency ratings against actual performance over time gives data center managers the ability not only to track and evaluate an individual machine’s performance durability, but also to compare its performance with that of similar machines.  Mechanisms and procedures can be put in place for maintenance as degradation is spotted.  Inefficient machines can be used less, fixed or phased out.
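A minimal sketch of that kind of tracking might look like the following, where the rated COP, the sensor values and the 15% degradation threshold are illustrative assumptions:

```python
# Track a unit's coefficient of performance (COP) against its as-purchased
# rating and flag degradation. Rating and threshold are assumed values.
RATED_COP = 3.5           # from the manufacturer's efficiency rating (assumed)
DEGRADATION_ALERT = 0.15  # flag if COP falls 15% below the rating (assumed)

def check_unit(heat_removed_kw: float, input_power_kw: float) -> str:
    current_cop = heat_removed_kw / input_power_kw
    if current_cop < RATED_COP * (1 - DEGRADATION_ALERT):
        return f"COP {current_cop:.2f}: schedule maintenance or consider replacement"
    return f"COP {current_cop:.2f}: within the expected range"

print(check_unit(heat_removed_kw=70.0, input_power_kw=25.0))  # COP 2.80, flagged
```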

We challenge mission critical cooling system manufacturers to pull back the veil of secrecy on energy efficiency.  The time for transparency is at hand because this information is knowable.  The combination of smart sensors and analytics technology can already report dynamic machine-to-machine efficiency as this information is required to drive cooling optimization.  The smart decision is for HVAC manufacturers to get out ahead of this data, and use efficiency reporting as a differentiator and means of driving continual improvement.

Just as mandatory EPA mileage ratings and rising gas prices changed consumer buying decisions and drove car manufacturers to offer cars with better gas mileage, more granular energy performance ratings will improve the efficiency of cooling equipment.  And this benefits all of us.

Cooling Doesn’t Manage Itself


Of the primary components driving data center operations – IT assets, power, space and cooling – the first three command the lion’s share of attention.  Schneider Electric (StruxureWare), Panduit (PIM), ABB (Decathlon), Nlyte, Emerson (Trellis) and others have created superb asset and power tracking systems.  Using systems like these, companies can get a good idea of where their assets are located, how to get power to them and even how to optimally manage them under changing conditions.

Less well understood, and I would argue not understood at all, is how to get all of the IT-generated heat out of the data center as efficiently as possible.

Some believe that efficient cooling can be “designed in,” as opposed to operationally managed, and that this is good enough.

On the day a new data center goes live, the cooling will, no doubt, operate superbly.  That is, right up until something changes, which could happen the next day, or weeks or months later.  Even the most efficiently designed data centers eventually operate inefficiently.  At that point, your assets are at risk and you probably won’t even know it.  Changes, and the follow-on inefficiencies, are inevitable.

As well, efficiency by design applies only to new data centers.  The vast majority of data centers operating today are aging, and all of them have accumulated incremental cooling issues over time.  IT changes, infrastructure updates, failures: essentially any and all physical data center changes or incidents affect cooling in ways that may not be detected through traditional operations or “walk around” management.

Data center managers must manage their cooling infrastructure as dynamically and closely as they do their IT assets.  The health of the cooling system directly impacts the health of those very same IT assets.

Further, cooling must be managed operationally.  Beyond the cost savings of continually optimized efficiency, cooling management systems provide clearer insight into where to add capacity or redundancy, where thermal problems are developing, and where the areas of risk lie.

Data centers have grown beyond the point where they can be managed manually.  It’s time to stop treating cooling as the red-headed stepchild of the data center.  Cooling requires the same attention and the same sophisticated management systems that are in common use for IT assets.  There’s no time to lose.