I recall that the Dec 1999 or Jan 2000 issue of National Geographic magazine had a “Letters From the Editor” column that speculated, in jest, that given the rate at which humans were saving back issues of National Geographic, by the year 2100 the total accumulation of yellow magazines would outweigh planet Earth. (Note: the public archives of National Geographic appear to only go back to 2005, so I can’t verify the exact issue in which this comment appeared.)
Anyway, that statement resonated with me because, although I change residences every few years, it is only recently that I stopped packing and carrying my decades of accumulated National Geographic magazines with me. Now that I’m free of them, I have no idea why I schlepped them around for so long. Worse, it cost real money to do so; movers charge by weight. One mover commented that he was certain that two-thirds of the weight of all my possessions was books and National Geographic magazines, as he handed me an $8,000 bill for the move.
I now collect books on Kindle. And I dropped off my boxes of yellow National Geographic magazines at the Goodwill store, in the middle of a dark and shameful night, almost a decade ago. I don’t know if it was a particularly “green” decision, but I know that my recent moves have been the easiest since I was an undergraduate.
Likewise in data centers: if we keep doing business as we have for the past 30 years, the planet will soon tilt off its axis under the sheer weight of data storage hardware.
The advent of virtual machines has had a profound impact on provisioning environments. Instead of unpacking, racking, wiring, powering and cooling physical servers, data centers can now create virtual machines by the hundreds by pointing and clicking. All of these new virtual machines share the previously underutilized CPU and RAM resources of physical servers, making the return on investment (ROI) on these resources sky-high.
So, virtual-machine technology has allowed data centers to provision several million virtual servers without having to power and cool several million physical servers. Data centers now use their existing physical servers far more efficiently, fully utilizing previously underutilized resources.
That is “green.”
But is server virtualization alone really the total solution for optimizing resource utilization? Consider that each virtual machine still requires a full image of storage. Like servers, storage is a huge part of data center infrastructure. And just as virtual machines are often “clones” of one another, used for testing, development or training, so too is the full storage image behind each cloned server merely a copy. That’s a lot of redundancy, especially when those clones, over their lifetimes, differ from the original by only a small percentage. The base OS software is the same as the original copy, except for a few configuration files; the application software is the same, except for a few configuration files; even the databases are the same, except for the relatively small percentage changed by update activity. So, as several million virtual servers have been spun up, many of them largely identical copies made from server templates, each has required a full complement of disk storage, driving the already overheated computer storage industry into supernova.
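To put rough numbers on that redundancy, here is a minimal back-of-the-envelope sketch in Python. The template size, clone count and change rate are invented for illustration only, not measurements from any real environment:

# Toy model: storage consumed by N clones of a server template.
# Every number here is an illustrative assumption, not a vendor figure.

TEMPLATE_GB = 500      # full disk image of the "golden" template
NUM_CLONES = 100       # clones spun up for test, dev and training
CHANGE_RATE = 0.05     # fraction of the image a clone rewrites over its life

# Conventional provisioning: every clone gets a full copy of the image.
full_copies_gb = NUM_CLONES * TEMPLATE_GB

# Copy-on-write provisioning: one shared image, plus only the changed
# blocks for each clone (a few config files, some database updates).
copy_on_write_gb = TEMPLATE_GB + NUM_CLONES * TEMPLATE_GB * CHANGE_RATE

print(f"Full copies:   {full_copies_gb:,.0f} GB")    # 50,000 GB
print(f"Copy-on-write: {copy_on_write_gb:,.0f} GB")  # 3,000 GB
print(f"Reduction:     {full_copies_gb / copy_on_write_gb:.1f}x")  # 16.7x

Even with these charitable assumptions, the full copies consume more than sixteen times the storage that shared, copy-on-write images would.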
For many, the time has arrived when server virtualization has completely taken over, even in situations where sharing CPU and RAM resources is not desired. For example, in mission-critical production environments, it is very common to map virtual machines one-for-one with physical machines to ensure that CPU and RAM are available in abundance, overprovisioning to deal with growth and peaks in demand. Having production applications encapsulated in virtual machines makes it simpler to move them to other physical servers, whether to address resource shortfalls or to deal with physical server failure.
In these situations, data virtualization does not yield any benefit, green or otherwise. Storage for mission-critical applications is likewise worthy of overprovisioning to deal with growth and peaks.
But in the scenario where a couple, dozens, or even hundreds or thousands of virtual machines are provisioned onto a cluster of physical servers, we have an environmentally unsustainable model, in every sense of the phrase, because each of those new virtual machines needs a full copy of storage. Fewer servers, yes. Provisioned more quickly, yes. But more storage is needed than ever before, requiring more power, cabling, cooling and floor space.
In this scenario, an analogy for server virtualization without data virtualization would be an advance in technology that let us build automobiles entirely from cheap renewable resources, such as cellulose. Hey, terrific: instead of building cars from expensively mined resources such as metals and exhaustible resources such as plastic, imagine a leap in technology that let us employ cellulose waste from food production, mainly the biomass left over from farming: the stuff we throw away or plow back into the ground.
Perhaps we would also have found a way to get rid of all those old back issues of National Geographic?
We could then produce these cars more cheaply and with less environmental impact, using what is essentially mulch, for a fraction of the cost of currently manufactured automobiles. It would be the golden age of personal transportation. Everyone on the planet could afford one. But what if these new automobiles still used internal-combustion engines, consuming fossil fuels at the same level of efficiency as today—about 20–40 miles per gallon? Even if they were more efficient, at upwards of 100 miles per gallon, would they yield a net benefit to the environment?
Of course not. The proliferation of these inexpensive, environmentally friendly automobiles would be an utter disaster environmentally, as the consumption of fossil fuels skyrocketed. The oil companies would be quite happy, wouldn’t they?
That is server virtualization without data virtualization, with storage vendors in the place of the oil companies. And oil companies and storage vendors are even closer analogues of one another, in that both are reaching the limits of the resource they can provide using existing technology.
Data virtualization leapfrogs and multiplies those incremental improvements in storage: rather than giving every clone its own full copy, it shares the unchanged data among them and stores only the differences, introducing real agility, increasing the tempo of development operations and utilizing storage resources far more efficiently. Server virtualization was a huge first step, but data virtualization is the final step needed to fully deliver on the promise of infrastructure virtualization.
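For the technically curious, here is a minimal sketch, again in Python, of the block-sharing idea underneath such thin cloning; it is a simplified illustration of the general technique, not any particular vendor’s design. Images are split into fixed-size blocks, identical blocks are stored physically only once, and a clone that diverges in a single block costs only a single block of new storage:

# Minimal sketch of content-addressed block sharing, the idea behind
# thin cloning. Purely illustrative; not any vendor's actual design.
import hashlib

BLOCK_SIZE = 4096
block_store = {}   # digest -> block: one physical copy per unique block

def store_image(image):
    """Split an image into blocks, storing each unique block only once."""
    refs = []
    for i in range(0, len(image), BLOCK_SIZE):
        block = image[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        block_store.setdefault(digest, block)   # shared if already present
        refs.append(digest)                     # an image is just a list of refs
    return refs

# A roughly 4 MB "golden" template of 1,000 distinct blocks...
template = b"".join(i.to_bytes(4, "big") * (BLOCK_SIZE // 4) for i in range(1000))
# ...and a clone that diverges from it in exactly one block.
clone = bytearray(template)
clone[0:BLOCK_SIZE] = b"x" * BLOCK_SIZE

store_image(template)
store_image(bytes(clone))

logical = 2 * len(template)                           # what the two images "see"
physical = sum(len(b) for b in block_store.values())  # what is actually stored
print(f"logical {logical / 2**20:.1f} MB, physical {physical / 2**20:.2f} MB")
# logical 7.8 MB, physical 3.91 MB: the second image cost one block

Real products do this at the storage layer with snapshots and copy-on-write rather than hashing in application code, but the economics are the same: the second, third and hundredth copy are nearly free.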
Fast, good, cheap. And green.
About the Author
Tim Gorman is a senior technical consultant. He has worked in IT as a C programmer since 1984, as an Oracle application developer since 1990 and as an Oracle database administrator since 1993. He is a technical consultant for Delphix, which enables data virtualization to increase the agility of DevOps, IT development and testing.
Tim has co-authored six books and performed technical review on eight more. He has been an Oracle ACE since 2007 and an Oracle ACE Director since 2012, as well as a member of the Oak Table Network since 2002, and he has an author’s page on Amazon. Tim has presented at Oracle OpenWorld and at Oracle users groups in lots of wonderful places around the world.