Virtualization has provided IT with huge benefits, from saving on capital expenses and energy to faster application deployment and workload agility. It also introduces a multitude of process changes and complexity, and mastering a virtual data center requires comprehending a dizzying array of software and hardware configurations, settings and workflows.
A good IT pro can generally make a virtualized data center run, but only a great IT pro can deliver extraordinary performance and availability while minimizing operational expenditures.
The difference is proactive visibility and analytics. A good virtual administrator knows which configurations will allow an application to run, but a great one has a holistic understanding of the entire infrastructure and the optimal configuration for assuring performance while minimizing cost and maintenance.
Going from good to great usually requires years of experience, but it’s possible to get there faster by using best practices and the experience of experts. With that in mind, here are the top seven tips that will enable administrators to go from good to great when it comes to optimizing even the most complex virtual data center.
Optimization tip #1: Right-size VMs for optimal performance and maximum VM density.
Right-sizing virtual machines (VMs) can save companies thousands of dollars while eliminating a constant battle with performance issues. In the early days of virtualization, VMs were often created using a copy-and-paste process from a physical computer, and physical-to-virtual (P2V) technologies fork-lifted many data centers into virtualization by converting physical machines bit by bit into virtual ones.
This approach is rarely a good strategy today. Although wasted resources on overpowered physical servers are an unfortunate and accepted cost of doing business, wasted resources on overpowered virtual servers create contention. Even vSphere’s highly touted memory-sharing features lose their effectiveness when VMs are configured with resources that aren’t being used.
A good IT pro pays careful attention to CPU and memory usage and tunes the number of vCPUs and the amount of vRAM to meet demand based on a handful of assumptions. A great IT pro goes further, striking the right balance between over- and under-provisioning resources.
Now any IT pro can strike this perfect balance using virtualization-operations-management technology that employs sophisticated algorithms based on a combination of best-practice resource-allocation advice and actual workload demand to right-size virtual-machine configurations. Capacity-management tools for virtualization help administrators discover which VMs are experiencing performance issues and need to be moved, which VMs need more resources and which VMs are wasting resources. The technology solution should be capable of clearly outlining workload placement and should report on how many more virtual machines can be placed on a particular host.
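As a rough sketch of what right-sizing boils down to, the following Python fragment sizes a VM to its observed peak demand plus a headroom margin. The utilization samples, the 25 percent headroom figure, and the GB rounding are hypothetical illustrations, not any vendor's algorithm:

```python
# Minimal right-sizing sketch (illustrative only): recommend vCPUs and vRAM
# from observed peak utilization plus headroom. Real operations-management
# tools use far richer models than a simple peak-plus-margin rule.
import math

def right_size(cpu_util_pct, mem_used_mb, vcpus, vram_mb, headroom=0.25):
    """Suggest a configuration sized to observed peak demand plus headroom."""
    peak_cpu = max(cpu_util_pct) / 100 * vcpus       # peak demand in vCPUs
    peak_mem = max(mem_used_mb)                      # peak demand in MB
    rec_vcpus = max(1, math.ceil(peak_cpu * (1 + headroom)))
    rec_vram = math.ceil(peak_mem * (1 + headroom) / 1024) * 1024  # round up to GB
    return rec_vcpus, rec_vram

# A VM given 8 vCPUs and 16 GB vRAM that never exceeds 35% CPU or 4 GB RAM
# right-sizes to 4 vCPUs and 5 GB, freeing the rest for other workloads.
rec = right_size([22, 35, 18], [3500, 4096, 3900], 8, 16384)
```

Even this toy version shows why over-provisioned VMs waste pool capacity: the unused allocation is reclaimed without touching observed peak demand.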
Optimization tip #2: Keep it clean and control VM sprawl.
Wasted resources, such as abandoned and powered-down VMs, are challenging to identify in virtual data centers. They consume expensive software, hardware and storage resources and will continue to accumulate unless stopped. Eventually, VM sprawl and overcrowded VMs reduce performance and increase cost.
A good IT pro spends some time implementing policy controls, identifying wasted resources, and orchestrating their elimination. A great IT pro helps prevent VM sprawl by using VM lifecycle-management technology.
Automating policy controls and governance can deliver visibility into the lifecycle of VMs, eliminate error-prone and time-consuming manual and repetitive tasks, and support compliance with IT policies and standards. The technology tracks resources, facilitating better planning, and it flags resources that are ready for retirement, including:
- Abandoned VM images that have been removed from inventory but remain in the virtual environment, consuming resources.
- Powered-off VMs that have been off for an extended period but remain in inventory.
- Unused template images that were used to create VMs but are no longer useful.
- Unnecessary snapshots, especially those that haven’t been modified for an extended period.
- Zombie VMs, which are unused VMs typically generated in self-service private-cloud environments and eventually abandoned.
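The retirement categories above amount to a set of simple rules over inventory data. The sketch below encodes a few of them in Python; the field names, the 90-day staleness window, and the sample inventory are all hypothetical, not any specific vendor's API:

```python
# Hypothetical sprawl check: flag VMs and snapshots for review based on
# power state, inventory status, and age. Illustrative field names only.
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=90)   # assumed staleness window

def flag_for_retirement(vms, now):
    """Return (name, reason) pairs for resources ready for retirement."""
    flagged = []
    for vm in vms:
        if vm["state"] == "orphaned":                 # abandoned image
            flagged.append((vm["name"], "abandoned image"))
        elif vm["state"] == "poweredOff" and now - vm["last_power_on"] > STALE_AFTER:
            flagged.append((vm["name"], "long powered off"))
        elif vm.get("snapshot_age", timedelta(0)) > STALE_AFTER:
            flagged.append((vm["name"], "stale snapshot"))
    return flagged

now = datetime(2013, 6, 1)
vms = [
    {"name": "web01", "state": "poweredOn", "last_power_on": now},
    {"name": "test07", "state": "poweredOff",
     "last_power_on": now - timedelta(days=200)},
    {"name": "build02", "state": "orphaned", "last_power_on": now},
]
```

A lifecycle-management tool applies policies like these continuously and across the whole inventory, rather than as a one-off script.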
Optimization tip #3: Ensure optimal storage performance.
Physical servers rarely exceed a fraction of their available bandwidth. The same isn’t true for virtual hosts. When dozens of VMs are consolidated onto a single host, those individual fractions start adding up. Determining the right storage configuration can make or break infrastructure performance and availability, but gleaning relevant information out of vSphere storage-performance statistics remains a challenge.
A good IT pro will “unsqueeze” storage connections by adding networking and disk spindles and by rebalancing VM processing. Great IT pros know exactly how storage is affecting the virtual environment by ensuring they have complete visibility into critical storage subsystems.
These pros are either storage IOPS masters or, more likely, they rely on technology to get visibility into VMs down to the physical disk spindle. A unified and logical end-to-end view of the virtualization infrastructure provides the information needed to understand how separate systems affect overall VM performance.
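One small piece of that end-to-end view can be sketched in a few lines: rolling per-VM storage latency up to the shared datastore level to spot which back end is hurting its VMs. The sample readings and the 20 ms threshold below are assumptions for illustration:

```python
# Toy end-to-end storage view: aggregate per-VM latency readings by
# datastore and report the shared back ends exceeding a latency threshold.
from collections import defaultdict
from statistics import mean

def hot_datastores(samples, latency_ms_threshold=20):
    """samples: (vm_name, datastore, avg_latency_ms) tuples."""
    by_ds = defaultdict(list)
    for _vm, ds, lat in samples:
        by_ds[ds].append(lat)
    return {ds: round(mean(lats), 1) for ds, lats in by_ds.items()
            if mean(lats) > latency_ms_threshold}

samples = [("sql01", "ds-fast", 4.0), ("web01", "ds-slow", 35.0),
           ("web02", "ds-slow", 28.0), ("app01", "ds-fast", 6.0)]
```

The point is the rollup: a single slow VM may be noise, but a whole datastore averaging high latency points at the storage subsystem itself.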
Optimization tip #4: Take advantage of vSphere’s DRS load balancing.
Planning is critical for any vSphere configuration activity, but overusing certain configurations can actually create problems. One area where overuse creates unexpected results is with resource reservations, limits and shares.
Resource reservations, limits and shares can be applied to individual VMs as well as to the resource pools containing VMs. Well-meaning virtual administrators sometimes use reservations and limits to constrain or guarantee resource use. But applying them in multiple locations at once can significantly complicate resource calculations and the effectiveness of vSphere Distributed Resource Scheduler (DRS) load balancing.
Resources are divided at the resource-pool level first, and share-based constraints take effect only after contention occurs. These rules mean that the innocuous “test resource pool” (with four VMs and 1,000 of the available 3,000 shares) won’t be a problem until the “production resource pool” (with 50 VMs and the remaining 2,000 shares) experiences contention. When that happens, the four test VMs will share one-third of the vSphere cluster’s resources, while the 50 production VMs must share the other two-thirds.
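The arithmetic behind that example is worth making explicit. Under contention, each pool's fraction of the cluster is its shares divided by the total, split among that pool's VMs:

```python
# Shares arithmetic from the test/production pool example: pool fraction
# of cluster resources, divided evenly among the VMs in that pool.
def per_vm_fraction(pool_shares, total_shares, vm_count):
    """Fraction of cluster resources each VM in a pool gets under contention."""
    return pool_shares / total_shares / vm_count

test_vm = per_vm_fraction(1000, 3000, 4)    # each test VM gets 1/12 of the cluster
prod_vm = per_vm_fraction(2000, 3000, 50)   # each production VM gets only 1/75
```

Each low-priority test VM ends up with more than six times the resources of a production VM, which is exactly the kind of surprise that layered reservations, limits and shares can produce.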
In most cases, effective performance management requires a far lighter touch than vSphere’s hard constraints. It also requires a long-term view to understand the relationship between today’s activities and those in the past.
Good IT pros avoid reservations, limits and shares until they understand completely—and can continuously monitor—their effect on everything else. A great IT pro is an expert in high-availability best practices and focuses on cost-effectively increasing the baseline level of availability provided for all applications.
Understanding vSphere vMotion capabilities, as well as high availability (HA) and the DRS, helps eliminate downtime caused by hardware failures. A good performance-monitoring solution provides visibility across the IT stack and simplifies VM provisioning, resource allocation and load balancing, offering a significant shortcut to success with these complex virtualization technologies.
Optimization tip #5: Maximize energy savings.
With power costs draining data center budgets, figuring out ways to reduce unnecessary power usage is good for business. The typical data center is crowded with operating machinery, and traditional wisdom argues that servers must be powered on if they are to provide services. This isn’t necessarily true in many virtual data centers. VMs that aren’t providing services all the time don’t need to be powered on all the time. For example, desktop VMs don’t need to be operational when their users aren’t working.
Virtualization automation allows additional servers to be spun up and down as demand changes. Still, determining which VMs can be powered down, and when, is an important challenge, as is automatically powering them back up at just the right time.
A great virtual administrator thinks like a cloud administrator, using intelligent tools that facilitate cloud-like automation. With insight into the minimum number of host servers needed over time to safely run workloads, and estimates on the potential cost savings by powering down unneeded servers, a virtual data center can easily go green.
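The core of "minimum hosts needed over time" is a capacity calculation like the hypothetical sketch below, which sizes the active host count to forecast demand plus a spare-host safety margin; the figures and the one-host margin are assumptions for illustration:

```python
# Hypothetical host-count sizing: keep only enough hosts powered on to
# carry forecast demand plus a spare-host safety margin.
import math

def hosts_needed(demand_vcpus, vcpus_per_host, spare_hosts=1):
    """Minimum powered-on hosts for a given aggregate vCPU demand."""
    return math.ceil(demand_vcpus / vcpus_per_host) + spare_hosts

# Overnight demand of 50 vCPUs on 32-vCPU hosts: 3 hosts stay on,
# and the remainder of the cluster can be powered down until morning.
overnight = hosts_needed(50, 32)
```

Intelligent tools run this calculation continuously against workload forecasts, which is what makes cloud-like power automation safe rather than risky.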
Optimization tip #6: Proactively manage capacity.
vSphere clusters are the foundation of VM performance and high availability. Clusters facilitate vMotion, which, in cooperation with VMware HA and DRS, ensures efficient use of resources. Clusters create problems, however, when they are not planned and implemented correctly.
Protecting every VM against host failure requires reserving one host’s worth of resources for each failure the cluster should tolerate. These unused resources lie in wait for a cluster host failure, ready to run the VMs from the cluster’s lost node. In practice, some clusters are built without the necessary reserve, and many more lose their reserves to unexpected VM growth and unexpectedly nonexistent hardware budgets.
Disabling a vSphere cluster’s admission control might earn back resources for a last-minute request, but doing so can exacerbate downtime when hosts fail. Instead, the admission control policy should prioritize high-value workloads. If not every VM needs vSphere HA protection, a percentage-based policy can be used to balance spare capacity with production needs.
The percentage policy makes sense when some VMs can experience downtime in emergency situations. Yet vSphere’s percentage policy requires extra planning and a regular checkup because the percentage of cluster resources to be reserved as failover capacity will change as the cluster grows or shrinks. Maintaining the best reserve balance requires vigilance in a dynamic environment.
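The "regular checkup" reduces to simple arithmetic in the common case of a homogeneous cluster: tolerating one host failure means reserving roughly one host's share of total resources, so the right percentage shrinks as hosts are added. A sketch, assuming identically sized hosts:

```python
# Failover reserve percentage for a homogeneous cluster: reserve roughly
# one host's share of resources per host failure to be tolerated.
import math

def failover_reserve_pct(hosts, failures_to_tolerate=1):
    """Percentage of cluster resources to reserve as failover capacity."""
    return math.ceil(100 * failures_to_tolerate / hosts)

# A 4-host cluster needs 25% held in reserve; grow it to 8 hosts
# and 13% suffices -- which is why the setting needs periodic review.
```

Without that recalculation, a cluster that doubles in size quietly carries twice the reserve it needs, wasting capacity that could run workloads.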
A good IT pro takes reserves and cluster growth into consideration and makes changes as performance problems occur. A great IT pro knows that resource management in shared environments involves careful calculus that can be automated with proactive capacity-management tools and intelligent monitoring tools that deliver actionable alerts.
The right solution should also provide performance data and user-defined thresholds at all levels of the virtual infrastructure to assist in predicting how efficiently capacity is used. The solution should help in determining the remaining virtual machine capacity and should monitor capacity thresholds with cluster-based capacity utilization. Most important, the solution should provide single-click remediation of issues to save time and keep the environment running smoothly.
Optimization tip #7: Spend time modeling capacity scenarios and planning for the future.
Forecasting in IT has long been a “gut feeling” exercise, and when environments were simple, good IT pros could “feel out” future resource needs. Virtualization, and cloud computing in particular, has dramatically increased complexity. Today’s virtual environments have so many moving parts that they are becoming impossible to predict without assistance.
An operations-management solution should provide capacity trending, forecasting and alerting that will project time and resource consumption limits based on historical growth rates. The tool should also assist in analyzing “what if” scenarios by modeling both virtual and physical workload placement on available hosts to improve efficiency in virtual infrastructure or move workloads from physical to virtual infrastructure.
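A bare-bones version of trend-based forecasting fits a straight line to historical usage and projects when it crosses capacity. The monthly VM-count samples below are hypothetical, and real tools model seasonality and per-resource constraints rather than assuming linear growth:

```python
# Toy capacity forecast: estimate linear growth from equally spaced
# samples and project the time remaining until capacity is exhausted.
def months_until_full(usage_history, capacity):
    """Periods until capacity runs out, assuming linear growth; None if flat."""
    growth = (usage_history[-1] - usage_history[0]) / (len(usage_history) - 1)
    if growth <= 0:
        return None          # flat or shrinking usage: no projected exhaustion
    return (capacity - usage_history[-1]) / growth

# Capacity for 100 VMs; usage grew from 40 to 70 over six monthly samples,
# so roughly five months remain before the cluster is full.
remaining = months_until_full([40, 46, 52, 58, 64, 70], 100)
```

Even this crude projection beats gut feeling, because it turns "we're growing fast" into a date by which hardware must be ordered.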
Managing a virtual data center today is undeniably a complex task. It requires not only optimizing resource usage today, but also keeping an eye on growth to plan effectively for tomorrow. Fortunately, becoming an expert virtual data center pro doesn’t have to take years. By following these best practices and deploying the right tools, every IT administrator can fast-track their progress from good to great.
About the Author
John Maxwell is vice president of product management for Dell Software’s virtualization-management solutions. His team is responsible for defining and developing market-leading virtualization-management solutions that span physical, virtual and cloud environments. His career in the data storage and data management industry covers two decades and includes executive management positions with companies such as Sterling Software, Veritas Software, Sun Microsystems and MTI Technologies.