Data center professionals are optimists by nature but pessimists by training, which is exactly as it should be when you are in the business of keeping very complicated systems up and running despite a million things that can go wrong. Our goal is to keep these mission-critical facilities fully operational with no downtime, and we arrive at work every day optimistic and determined that we will accomplish that impossible objective. But at the same time, we are forever braced for the worst to happen and always planning contingencies to preempt or mitigate those worst-case scenarios.
To put it simply, we confidently expect sunshine, while feverishly planning for the Storm of the Century, both figuratively and literally. That’s our job in a nutshell, so it’s no wonder we are a quirky bunch with those two mindsets fighting for room inside each of our heads.
Luckily, the truly worst-case scenarios that we think about, create contingency plans for, build redundant systems for and conduct live drills for almost never happen. Until they do. That is when all of our worrying and contingency planning gets put to the test. Hurricane Sandy last year was one of those worst-case scenarios. When it made landfall in New Jersey in October 2012, the winds and record storm surge combined to devastating effect, and the damage was so severe that residents of coastal neighborhoods are still trying to recover.* For the data center industry, the storm had an enormous impact that continues to reverberate many months later. At a recent data center event in New York where I spoke on a panel, Sandy was by far the dominant topic. Officially, the storm was only the focus of one panel session, but unofficially it was everywhere.
A lot of the discussion naturally began with personal stories about the impact of the storm, but it inevitably turned to what worked and what didn’t work when the water started rising and contingency plans designed for hypothetical emergencies were pitted against the reality of a natural disaster. There were a lot of successes to discuss, but there was also a lot of candidness about what did not work. The experience of Sandy gives us the rare opportunity to see our contingency plans put to the test in a living laboratory, and I feel strongly that there should be a healthy industry dialogue about that to help prepare us all for the next worst-case scenario. To help contribute to that dialogue, this column outlines three key lessons that I have learned through conversations with fellow data center professionals since the flood waters receded.
Is Your Fuel Supply Chain Ready?
One of the biggest lessons from Sandy is something that anyone who has ever seen the movie Mad Max knows quite well: in a crisis, gasoline becomes very scarce very quickly. Data center disaster plans put a big focus on backup generators, and all the engineering and technology in those systems performed very well by all accounts…until the diesel started running out. Generators are not much use without fuel, and the storm revealed that many organizations did not put enough focus on their fuel supply chain.
Gas was in short supply for a lengthy period after the storm, which created problems for many facilities. Even organizations that did have access to fuel ran into transportation problems that prevented fuel from being delivered when they needed it. They could procure it; they just couldn’t get their hands on it because delivery routes were shut down. This is an important issue for our industry to address because these same problems are likely to occur in other scenarios, such as earthquakes, tornadoes and future hurricanes, that cause significant damage to civil infrastructure. A key lesson learned from the Sandy fallout is that we as an industry need to devote more attention to fuel on hand, redundancy in the supply chain, supply geography and alternate transportation routes.
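The "fuel on hand" question comes down to simple arithmetic that is worth running before, not during, a storm: how many hours of generator runtime does your stored diesel actually buy? A back-of-the-envelope sketch follows; all of the figures (tank size, burn rate) are hypothetical placeholders, and real planning should use your facility's measured full-load consumption.

```python
# Back-of-the-envelope generator runtime from on-hand fuel.
# All numbers below are hypothetical; substitute your facility's real figures.

def runtime_hours(tank_gallons: float, burn_rate_gph: float) -> float:
    """Hours of runtime given stored fuel and a steady burn rate (gal/hour)."""
    return tank_gallons / burn_rate_gph

# Example: 10,000 gallons on site, generators burning ~140 gal/hour at load.
hours = runtime_hours(10_000, 140)
print(f"{hours:.0f} hours ({hours / 24:.1f} days) before resupply is required")
```

If the answer is three days and your post-storm delivery routes could plausibly be closed for a week, the gap is the contingency plan.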
What’s in the Fine Print of Your Contracts?
In the aftermath of Sandy, a lot of companies found out the hard way about an often overlooked part of data center contracts: the force majeure clause. In common parlance, it is better known as the “act of God” clause, which excuses a data center operator from the terms of a client contract if something beyond their control makes it impossible for them to meet their obligations. Anyone would consider a superstorm that puts lower Manhattan and large portions of Staten Island and New Jersey under water a situation that meets the terms of a force majeure clause, but some data center companies learned that their client contracts had weak or nonexistent act of God sections that didn’t protect them properly.
Data center clients, on the other hand, bemoaned the force majeure clauses in their contracts when the act of God actually affected their business. No one ever thinks an act of God will occur, so force majeure sections are often overlooked as standard contract language. With the experience of Sandy behind us, it is a clause sure to receive more focus from both sides in future data center contract negotiations, so that both providers and customers of data center services fully understand this contractual language and feel protected in the case of a natural disaster.
Do You Have a Plan for Your People, Not Just Your Data Center Technology?
Data center contingency plans tend to focus first and foremost on systems and technology, but they often lack enough foresight about the people challenges that occur during a disaster like Sandy. Because of the scope of Sandy’s damage, organizations couldn’t just focus on technology. They also needed to be able to support their employees, who were coping with the impact on their homes and families at the same time they were trying to fulfill their duties at work.
Contingency plans often take a static view of the role personnel will play in responding to an emergency: they are typically built on the assumption that personnel will be there when needed and won’t have divided attention, just like it’s a regular day of work. But the reality is that a storm like Sandy leaves employees struggling with personal concerns alongside their work duties. That is a reality that contingency plans must do a better job of anticipating and responding to. The storm also surfaced many very practical people issues that organizations need to prepare for: it trapped people at work for extended periods of time, prevented fresh staff from coming to work, made key people unavailable, forced people to sleep at work, led to difficulties feeding workers on site and much more. Many organizations had to create special contingency plans on the fly just to address the needs of their people, including special arrangements for transportation, food, rest, support for the non-work responsibilities that they were juggling and so on.
Unlike redundant equipment dedicated to the single purpose of protecting the data center with failover capabilities, the human factor is much trickier. Clearly redundancy is required so that no single person is critical to operations, but when a disaster affects an entire region’s homes and people, N+1 on the personnel front may not be enough. The value of cross training for a broader set of roles becomes clear when a disaster like Sandy makes it difficult for large swaths of employees to get to the data center.
This is by no means a definitive discussion of lessons learned from Sandy, but hopefully it is a helpful addition to the ongoing dialogue about how our industry can prepare for the next situation that requires us to implement the contingency plans we work so hard to develop.
*For more information about how you can continue supporting people affected by Hurricane Sandy, visit the Sandy Relief Fund page (http://sandynjrelieffund.org) and the United Way’s Sandy Response page (https://donate.unitedwaynyc.org/page/contribute/uwsandyrecovery).
Leading article image courtesy of NASA
About the Author
Mike Klein is the co-CEO of Online Tech (www.onlinetech.com), which provides secure, mission-critical cloud computing to mid-market companies across the U.S., with a focus on compliance specifically required for health care, financial and retail markets.