Energy-Management Innovations Redefine Disaster Recovery Practices
The economic recovery is underway, albeit at a slow pace, and many regions are also recovering from the impacts of major natural disasters. As we all try to get back to “normal,” it makes sense to reflect on the new definition of normal in the data center. After almost a decade of cutbacks as well as hyper-focus on efficiencies, productivity and all of the cost components that fall under capex and opex, it would be wise to review what we have learned.
In particular, energy-management advancements have led to the evolution of three new categories of data center best practices: disaster recovery (DR) power, power capping and high ambient temperature. This series will cover each of these best practices, starting with the present article about energy-efficient DR practices. (Also see Part 2, “Power Capping Puts IT Back in Control.”)
Disaster Recovery: More Than a Luxury
Many data center disaster recovery solutions were once considered luxuries and were typically budgeted for only the most mission-critical business functions. Today’s global markets and Internet-centric communications and collaborations, however, have made it more imperative that businesses remain connected and operational even during times of major outages or natural disasters.
Even a relatively short outage—without any DR solution in place—can be financially crippling. For a large enterprise, lost transactions can add up to millions of dollars of revenue in less than a week. Smaller businesses, while less affected in terms of daily loss, are actually more vulnerable. The loss of any orders or customers can be catastrophic to smaller-scale companies that lack the substantial reserves of fiscally sound, large-scale enterprises.
A DR solution, therefore, has become a requirement for any company that cannot survive a business disruption in excess of a day or two. Colocation and cloud-computing services have simplified DR planning and offer more-affordable alternatives than dedicated remote data centers. Uninterruptible power supplies (UPSs) and generators can be fairly easily deployed to ensure near-instant switchover to standby power sources.
Energy-Efficient Disaster Recovery Practices
The magnitude 9.0 earthquake that hit Japan in 2011, Superstorm Sandy’s pounding of the U.S. east coast in 2012 and numerous other disasters before and since have taught us some very valuable lessons. These disasters highlight the need for proactive power management in the data center.
Only with an accurate understanding and control of the baseline power consumption under normal circumstances can power be effectively allocated during times of crisis and power loss. Data center technology providers have responded to this need for baseline power management with a variety of new tools. As a result, IT managers can, at a minimum, easily examine the return-air temperature at the air-conditioning (AC) units and the power consumption for each rack in the data center.
Even better, data center managers can adopt energy-management best practices that apply a holistic energy- and cooling-management solution. The latest innovations in this area provide fine-grained levels of monitoring focused on server inlet temperatures. Middleware is available to aggregate server inlet temperatures as well as real-time power-consumption characteristics for servers, blades, power-distribution units (PDUs), UPSs and other data center equipment.
Aggregated thermal and power data can be combined with return-air temperature at the AC units to generate thermal and energy maps of the data center. Compared with earlier power-management approaches based on modeling and estimations, the holistic practices yield extremely accurate views of the data center based on actual power usage data.
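To illustrate the kind of aggregation such middleware performs, here is a minimal sketch in Python. The rack names, readings and hot-spot threshold are all hypothetical; real solutions pull this telemetry from servers, PDUs and AC units rather than hard-coded samples.

```python
from statistics import mean

# Hypothetical telemetry: per-rack server readings of
# (inlet temperature in deg C, power draw in watts).
readings = {
    "rack-A1": [(24.5, 310), (25.1, 295), (26.0, 330)],
    "rack-A2": [(22.0, 180), (22.4, 175)],
    "rack-B1": [(27.8, 410), (28.3, 425), (27.5, 400)],
}

AC_RETURN_AIR_C = 29.0  # return-air temperature measured at the AC unit

def thermal_map(readings):
    """Aggregate per-server readings into a per-rack thermal/energy view."""
    rows = {}
    for rack, samples in readings.items():
        temps = [t for t, _ in samples]
        watts = [w for _, w in samples]
        rows[rack] = {
            "avg_inlet_c": round(mean(temps), 1),
            "total_power_w": sum(watts),
            # Flag racks whose inlet air approaches the AC return-air
            # temperature -- a common sign of recirculation or a hot spot.
            "hot_spot": max(temps) > AC_RETURN_AIR_C - 2.0,
        }
    return rows

for rack, stats in thermal_map(readings).items():
    print(rack, stats)
```

Because the map is built from actual measurements rather than modeled estimates, the same data structure can feed both a live dashboard and the historical logs discussed below.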
The practices of analyzing power behaviors by rack, row, or room and maintaining historical logs of power usage as it relates to service- and server-activity levels prepare data center managers to effectively identify key resources that need prioritization during outages. Furthermore, the process and resulting assets lead the data center manager in developing a comprehensive business-continuity plan with tools and skill sets to allocate available power in the event of a partial power outage, equipment failure or even full-scale disaster.
Before any disaster, holistic power-management solutions can identify the biggest energy consumers in the data center. These might include groups of employees or certain applications, or they might point to inefficient or outdated hardware that should be considered candidates for refreshes.
Use-Case Examples: Energy Controls Lower Consumption
A wealth of data is available on the subject of energy efficiency in the data center; much of this data is driving energy-efficient DR best practices. Organizations such as the Open Data Center Alliance (ODCA) provide open forums for idea sharing and discussions of research results driving progress in this area. And many companies publish use-case reports that underscore the huge potential for significant energy reductions through effective management practices.
A recently published paper about NTT Data Corporation (NTT Data), for example, shares the results of that company’s in-house investigations of power-management practices. After the 2011 Japan earthquake disaster, the company faced shortages of electrical power owing to nuclear power plants that had been shut down and to subsequent usage restrictions put in place by the Japanese government.
Research was aimed at reducing NTT Data’s peak power consumption by 10 percent and also at extending operating time of its servers during power outages. NTT Data’s data center team discovered a power-management solution that could yield an 18 percent reduction in power use by reducing high-load server performance by 10 percent. They also determined that it was possible to achieve almost the same results (16 percent reduction) by reducing the performance of low-load servers by 30 percent. During outages, these power reductions translate to a 1.8x improvement in operating time by limiting power to each server.
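The relationship between power capping and operating time is straightforward: on a fixed reserve of UPS energy, runtime scales inversely with the capped draw. The figures below are hypothetical and chosen only to show how a roughly 1.8x improvement like NTT Data's arises; they are not the company's actual measurements.

```python
# Hypothetical figures: runtime on stored UPS energy scales inversely
# with the capped aggregate power draw.
ups_energy_wh = 10_000    # stored UPS energy (assumed)
uncapped_draw_w = 5_000   # normal aggregate server draw (assumed)

def runtime_hours(energy_wh, draw_w):
    return energy_wh / draw_w

baseline = runtime_hours(ups_energy_wh, uncapped_draw_w)
# Capping aggregate draw to about 56 percent of normal during the outage:
capped = runtime_hours(ups_energy_wh, uncapped_draw_w * 0.56)
print(f"runtime improvement: {capped / baseline:.2f}x")
```

In other words, capping servers to roughly 56 percent of their normal draw stretches the same battery reserve to about 1.8 times the baseline runtime.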
At BMW, a similar proof of concept was documented. This study was aimed at determining how much power could be conserved without affecting the performance of critical servers. BMW Group discovered that it was possible to lower server power consumption by 18 percent and increase server efficiency by approximately 19 percent, findings the company used to achieve server power savings, increase rack utilization and lower overall data center power consumption.
Both of these studies effectively validate power-management practices that support energy-efficient DR solutions, as well as cost-effective, ongoing energy management.
The headlines routinely remind us that disasters can strike anytime, anywhere and with much longer-lasting impact than we want to believe. Fortunately, the evolution of energy-efficient disaster recovery practices and holistic energy-management solutions can be applied to a broad range of infrastructure models, giving data center managers the ability to tailor a solution to each site. The benefits not only support a strong business case for DR solutions, but they also extend to ongoing cost savings by driving up data center energy efficiencies.
Looking to the future, energy-management solutions offer a much better prognosis even as server sprawl and energy costs threaten to wreak havoc on data center budgets. Armed with detailed knowledge about energy consumption, data center managers and facilities teams can better plan data center build-outs, avoid power spikes that can damage equipment and shorten life spans, and adjust workloads across servers, racks, rows or sites according to energy prices and availability.
In follow-on articles, we will explore the energy-management best practices relating to server power capping as well as operating data centers at high ambient temperatures.
Leading article photo courtesy of IntelFreePress
About the Author
Jeff Klaus is the general manager of Data Center Manager (DCM) Solutions at Intel Corporation, where he has managed various groups for over 13 years. Klaus’s team is pioneering power- and thermal-management middleware, which is sold through an ecosystem of data center infrastructure management (DCIM) software companies and OEMs. A graduate of Boston College, Klaus also holds an MBA from Boston University. He can be reached at Jeffrey.S.Klaus@intel.com.
NTT Data case study: “Dynamically Controlling Server Power Consumption and Reducing Data Center Peak Usage by 16 to 18 Percent: Securing Data Center Business Continuity during Power Outages”
BMW Group case study: “Preserving Performance While Saving Power Using Intel Intelligent Power Node Manager and Intel Data Center Manager”