Gone are the days when IT departments asked, “Are we cloud ready?” or “Can we use the cloud?” The vast array of cloud offerings now available ensures that almost any requirement can be met, and many of the data-protection and security features of these offerings are better than what most organizations can offer in their own data centers. So the question seems to no longer be if but how cloud services can be used.
And highly connected to the how is the why. Many start by thinking of the cloud as a cost-saving strategy, but they quickly realize that it may not always be so. Shifting from a capex-based on-premises model to an opex-based rental model in the cloud is certainly useful, but not if the monthly costs add up to more than the on-premises virtual environments they replace. It takes diligence, planning and proper governance to make sure that the financial side is kept under control.
This reality often runs counter to what is probably the biggest driver of cloud adoption: agility. The ability to rapidly respond to new business requirements and opportunities has been the holy grail of IT for a long time, and the cloud is one way to achieve this goal. Compared with the long planning and procurement cycles of on-premises infrastructure, instant capacity at your fingertips has an obvious attraction. In fact, a main goal of the traditional capacity-management function is to “pre-buy” infrastructure so it can be on the floor ahead of anticipated need, even if this anticipated need is sometimes wildly inaccurate. But what the capacity-management process lacks in accuracy it makes up for in governance and control, and everything that happens has a reason. Shifting to the cloud clearly disrupts this process, and the question becomes how to use this new agility while still maintaining some semblance of control.
The term often used for this type of control is governance, and although many view this process as a buzzkill and would rather focus on the newest technologies, it is extremely important. Some organizations charge headlong into cloud usage only to realize that their initial use is out of control, and the sticker shock from the initial bills causes them to step back and put governance in place. Others recognize this issue up front and actively implement policies and automation strategies that let them make the right decisions from the start. Either way, it’s going to happen.
And this governance shouldn’t be based on opinions or preconceived notions. What can and should go into the cloud, and which cloud hosting model is best, can be scientifically determined with the right processes and right software in place. Applications and business services have specific requirements, and if these requirements are captured properly, it’s possible to make high-level “hosting venue” decisions in an automated, policy-driven way. Several things require consideration when automating these decisions.
Fit for Purpose
The first consideration when making hosting decisions and governing the use of the cloud is whether a provider’s infrastructure is fit for the purpose of the workloads in question. Many start their journey to the cloud with dev/test workloads, where this question isn’t a big concern because typically no sensitive data is involved and no specific requirements exist with respect to performance, availability, long-term hosting cost and so on. But many current business services, or new ones that graduate into production, have all of these considerations, and it’s critical to analyze these requirements against the capabilities of the infrastructure to determine whether it is fit for purpose. The question isn’t always whether the cloud can be used, but rather which provider is most suitable given its offerings, locations and so forth. The cloud has evolved to the point where you can buy almost any type of infrastructure if you pay enough money, but if your application contains personally identifiable information or makes numerous round-trip calls to your mainframe, you can’t just put it anywhere you like.
Some specific aspects require consideration when determining fitness for the purpose at hand:
- Service catalogs: Most cloud providers sell capacity on the basis of a catalog of offerings, which includes the sizes of instances available as well as the software pre-installed on them. These catalogs vary widely between providers, and some heavily restrict the variations available for purchase. If a provider lacks a large enough “large,” your app can’t run there, and if it lacks a small enough “small,” or the sizes don’t match your workloads, you will overpay. The same is true of the software stacks, and if you have to upgrade your app to be compatible or bring your own database license to the party, things can get expensive very quickly.
- Physical locations: Many apps have jurisdictional limitations that depend on data residency, PII, ITAR and so on, and these limitations place hard constraints on where data can go. Performance and latency concerns also place critical constraints on hosting venues and require “service proximity” to be a major consideration when deploying applications into the cloud. Complex business services with multiple communication paths, or any app that connects to on-premises systems of record, must be carefully considered in this regard.
- Business policies: It’s common for business groups to think that they’re special when it comes to the infrastructure they need, and although this perspective is more often a perception than reality, in some cases it can be true. If a group is doing real-time trading or is preparing SEC filings, you can bet it will have some very specific constraints on where its apps can run. It’s important to capture and codify the objective criteria that govern the placement of a department’s applications, which may not be as simple as “keep it in the U.S.” or “make sure it’s backed up.”
These are but a few examples of fit-for-purpose criteria that must be captured and embedded in every cloud-hosting decision, either app by app or as a blanket governance policy for a line of business. And because a primary cloud driver is agility, this capturing and embedding must be done in a way that doesn’t impede access to resources, so consumers can get the resources they need when they need them. In other words, it’s fundamental to cloud automation.
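To make this concrete, the sketch below shows one way such fit-for-purpose criteria could be codified as an automated placement check. It is a minimal illustration in Python; the data model, sizes and regions are hypothetical and deliberately simplified rather than any particular provider's API, and a real policy engine would evaluate far more criteria.

```python
# A minimal sketch of a policy-driven "hosting venue" decision, assuming a
# simplified model of application requirements and provider capabilities.
# All names, sizes and regions are hypothetical, not any vendor's API.

from dataclasses import dataclass, field

@dataclass
class AppRequirements:
    cpu_cores: int
    memory_gb: int
    data_residency: set              # jurisdictions the data may reside in, e.g. {"US"}
    max_latency_ms_to_onprem: int    # round-trip budget to on-prem systems of record
    required_software: set = field(default_factory=set)

@dataclass
class Provider:
    name: str
    regions: dict                    # region -> measured latency (ms) to the data center
    jurisdictions: set               # where the provider's regions physically reside
    catalog: list                    # (name, cores, memory_gb, hourly_cost) tuples
    software: set = field(default_factory=set)

def fit_for_purpose(app: AppRequirements, provider: Provider):
    """Return the cheapest catalog entry that satisfies every hard constraint,
    or None if the provider is simply not fit for this workload."""
    # Jurisdictional fit: the provider must offer an allowed residency.
    if not app.data_residency & provider.jurisdictions:
        return None
    # Proximity fit: at least one region must meet the latency budget.
    if min(provider.regions.values()) > app.max_latency_ms_to_onprem:
        return None
    # Software-stack fit: the required stack must be available without re-platforming.
    if not app.required_software <= provider.software:
        return None
    # Catalog fit: the smallest instance that is still big enough.
    candidates = [c for c in provider.catalog
                  if c[1] >= app.cpu_cores and c[2] >= app.memory_gb]
    return min(candidates, key=lambda c: c[3]) if candidates else None
```

The same check can be run against every candidate provider for an application, or applied as a blanket policy for a line of business, so that placement decisions remain automated rather than debated case by case.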
Of course, getting all of this right is not particularly useful if it comes at a huge cost. Many anecdotes describe organizations that successfully launched cloud-based applications only to have their bills be 2–3x what they anticipated. Underlying such anecdotes is a fundamental observation that should underpin every cloud decision: public clouds aren’t cheap.
Now, before every cloud provider gets up in arms, it’s important to explain this remark. Renting a medium-size cloud instance for 25 cents an hour is cheap, provided you only use it for an hour. But if you use it nonstop for a year, the cost adds up to $2,190, and it may be a lot less expensive to simply throw it into your on-premises virtual environment.
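The arithmetic behind that remark is worth making explicit, because it underpins every rent-versus-run decision. The following back-of-the-envelope sketch uses the hourly rate from the example above and a purely illustrative on-premises figure to show where the break-even point falls.

```python
# Back-of-the-envelope math behind the "public clouds aren't cheap" remark.
# The hourly rate comes from the example above; the on-premises figure is an
# illustrative assumption, not a benchmark.

HOURS_PER_YEAR = 24 * 365                 # 8,760 hours

on_demand_hourly = 0.25                   # the 25-cent medium instance
annual_on_demand = on_demand_hourly * HOURS_PER_YEAR
print(f"Always-on cloud instance: ${annual_on_demand:,.0f}/year")   # $2,190/year

# Hypothetical fully loaded annual cost of one more VM in an existing
# on-premises cluster (assumption for illustration only).
annual_on_prem_vm = 900
break_even_hours = annual_on_prem_vm / on_demand_hourly
print(f"Cloud is cheaper only below ~{break_even_hours:,.0f} hours/year "
      f"({break_even_hours / HOURS_PER_YEAR:.0%} duty cycle)")
```

The exact break-even point varies widely with the workload and the environment, but the pattern holds: the higher the duty cycle, the weaker the case for on-demand pricing.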
Understanding this situation, and automating hosting decisions on that basis, is hugely important to the financial-governance process. Several important considerations go into cost optimization:
- Right-sizing: Most organizations aren’t great at understanding their workload patterns and “right-sizing” VMs and cloud instances on the basis of what they actually need. In virtual environments, this shortcoming is understandable, as overcommitment allows the hypervisor to claw back virtual resources that aren’t being put to productive use. But in the cloud, this type of sloppiness can cost you dearly; if you’re paying for a 50-cent instance for a year when a 25-cent instance will do, you clearly aren’t being diligent enough, particularly considering that this is now a recurring monthly expense, not a sunk cost.
- Best execution venue: All the right-sizing in the world won’t help if you are running an application on the wrong type of infrastructure to begin with. Consider a workload with a high peak at 9:30 a.m., another at noon or later in the afternoon, and only minor activity in between. You may need to rent a large instance to meet those peaks, even though you will have a lot of “white space” and the resources will go unused most of the day. By contrast, running the same application in a bare-metal cloud using a hypervisor would allow the workload to dovetail with other apps that have complementary utilization patterns, soaking up that spare capacity and putting it to use. This is the fundamental purpose of overcommitment, and a recent study showed that typical enterprise workloads can cost less than half as much to run in bare-metal clouds as in “tee-shirt sizing” IaaS offerings; a simple sketch following this list illustrates the effect.
- Demand management and planning: Even though the new cloud world enables a shift away from the long-lead-time “pre-buy” model of capacity management, that shift may be short-lived. Many vendors now offer the ability to purchase reserved instances or bare-metal servers, and upfront commitment is rewarded with lower hosting costs, provided you can figure out what you need ahead of time. Properly understanding the existing body of applications and their workload patterns, as well as what’s coming down the pipe, enables the next level of financial optimization in the public cloud. If this situation sounds familiar, that’s because it is.
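To illustrate the right-sizing and dovetailing points above, the sketch below uses an entirely hypothetical catalog and utilization samples: sizing each workload against its own peak requires two large instances, while letting complementary peaks share capacity halves the hourly cost.

```python
# A minimal right-sizing and dovetailing sketch, assuming utilization samples
# are already being collected. Instance names and prices are hypothetical.

CATALOG = [            # (name, cpu_cores, hourly_cost) -- illustrative only
    ("small", 2, 0.125),
    ("medium", 4, 0.25),
    ("large", 8, 0.50),
]

def right_size(peak_cores_used: float, headroom: float = 1.2):
    """Cheapest catalog entry that covers the observed peak plus headroom."""
    needed = peak_cores_used * headroom
    fits = [c for c in CATALOG if c[1] >= needed]
    return min(fits, key=lambda c: c[2]) if fits else None

# Two workloads with complementary peaks (9:30 a.m. vs. late afternoon).
workload_a = [1, 1, 5, 2, 1, 1, 1, 1]     # cores used per sample period
workload_b = [1, 1, 1, 1, 2, 5, 1, 1]

# Sized separately, each workload needs a "large" to cover its own peak.
separate = sum(right_size(max(w))[2] for w in (workload_a, workload_b))

# Dovetailed on shared capacity, the combined peak is far lower than the sum
# of the individual peaks, so a single "large" can host both workloads.
combined_peak = max(a + b for a, b in zip(workload_a, workload_b))
shared = right_size(combined_peak)[2]

print(f"separately sized: ${separate:.2f}/hr, dovetailed: ${shared:.2f}/hr")
```

The same peak-versus-average analysis also feeds demand planning: once workload patterns are understood well enough to size instances correctly, they are understood well enough to decide which capacity is worth reserving up front.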
The ability to apply a scientific method to governing cloud-hosting decisions is critical to ensuring business services run properly, data is secure, new apps can be rapidly deployed and costs are kept to a minimum. Some organizations charge into the cloud and learn this lesson the hard way, but others think it through and try to strike the right balance the first time. Either way, the ability to minimize the pain and maximize the financial gain of moving to the hybrid cloud is truly the next frontier in infrastructure management.
About the Author
Andrew Hillier is CTO and cofounder of CiRBA. He has over 20 years of experience in the creation and implementation of mission-critical software for the world's largest financial institutions and utilities. A cofounder of CiRBA, he leads product strategy and defines the overall technology roadmap for the company.
Before CiRBA, Andrew pioneered a state-of-the-art systems-management solution that was acquired by Sun Microsystems and served as the foundation of that company’s flagship systems-management product, Sun Management Center. He has also led the development of solutions for major financial institutions, including fixed-income, equity, futures-and-options, and interest-rate-derivatives trading systems, as well as systems for covert military surveillance, advanced traffic and train control, and the robotic inspection and repair of nuclear reactors.
Andrew holds a Bachelor of Science degree in computer engineering from The University of New Brunswick.