Many of us are grappling with the modern demands of digital business: developing new mobile apps, evaluating security in the face of IoT, moving to hybrid clouds, testing approaches to defining networks through software. It’s all part of the hard trend toward service-oriented IT, with a primary goal of delivering a premium user experience to all your users—internal, partner or customer—with the speed, quality and agility the business demands.
How do you meet these elevated expectations? As modern data centers evolve rapidly to tackle these agility demands, network and application architectures are becoming increasingly complex, complicating efforts to understand service quality from infrastructure and application monitoring alone. Virtualization can obscure critical performance visibility at the same time that complex service dependencies challenge even the best performance analysts and the most effective war rooms. Although this situation may read like a recipe for disaster, it also holds the secrets to success.
Service Quality Is in the Eye of the End User
Remember the adage “beauty is in the eye of the beholder”? The same idea applies here; service quality is in the eye of the user. It’s hard to argue with that sentiment, especially when we consider the user as the face of the business. So, of course, to understand service quality we should be measuring end-user experience (EUE), where EUE is defined as the end-user response time or “click to glass.” In fact, EUE visibility has become a critical success factor for IT service excellence, providing important context to more effectively interpret infrastructure performance metrics.
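To make "click to glass" concrete, here is a minimal, hypothetical sketch of how EUE might be summarized once per-transaction timestamps are available. The sample data, the nearest-rank percentile helper, and the capture mechanism are all assumptions for illustration; real EUE tooling would capture these timestamps from a browser agent or from wire data.

```python
from statistics import median

# Hypothetical EUE samples: (click_ts, glass_ts) pairs in seconds,
# e.g. captured by a browser agent or reconstructed from wire data.
samples = [
    (0.00, 1.20),
    (5.00, 5.85),
    (9.50, 12.10),  # one slow transaction
    (15.0, 15.95),
]

# End-user response time ("click to glass") for each transaction.
eue_times = [glass - click for click, glass in samples]

def percentile(values, pct):
    """Nearest-rank percentile over a small sample set."""
    ordered = sorted(values)
    rank = max(0, min(len(ordered) - 1, round(pct / 100 * len(ordered)) - 1))
    return ordered[rank]

print(f"median EUE: {median(eue_times):.2f} s")
print(f"95th percentile EUE: {percentile(eue_times, 95):.2f} s")
```

Percentiles matter more than averages here: a healthy median can hide a painful tail that some fraction of users experiences on every visit.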
In fact, you may already be measuring EUE. Some of your applications—particularly those based on Java and .NET—may already be instrumented with agent-based APM solutions. But there are many challenges to an agent-only approach to EUE, including the following:
- These agent-based solutions may be unavailable or unsuitable for operations teams
- Not all Java and .NET apps will be instrumented
- Some agent-based solutions do not measure EUE
- Some agent-based solutions only sample transaction performance (let’s call this some user experience, or SUE)
- Many application architectures don’t lend themselves to agent-based EUE monitoring
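The sampling concern in particular is easy to demonstrate: when an agent records only every Nth transaction, an intermittent slowdown can vanish from view entirely. A hypothetical sketch (the traffic pattern and sampling rate are invented for illustration):

```python
# Hypothetical transaction response times in seconds; every 10th
# transaction is slow (e.g., an intermittent back-end stall).
response_times = [0.4 if i % 10 != 9 else 6.0 for i in range(100)]

# Full EUE measurement sees every transaction, including the stalls.
worst_full = max(response_times)

# A sampling agent that records only every 4th transaction ("SUE")
# happens to skip all of the slow ones in this trace.
sampled = response_times[::4]
worst_sampled = max(sampled)

print(f"worst seen with full capture: {worst_full:.1f} s")      # 6.0 s
print(f"worst seen with 1-in-4 sampling: {worst_sampled:.1f} s")  # 0.4 s
```

The sampled view reports a healthy system while one in ten users waits six seconds, which is exactly the kind of blind spot that undermines an agent-only approach.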
An Important Lesson
For these and other reasons, IT operations teams have often focused on more approachable infrastructure monitoring—device, network, server, application and storage—with the implication that the whole is equal to the sum of its parts. The theory was (or still is) that by evaluating performance metrics from all of these components, one could assemble a reasonable understanding of service quality. The more ambitious IT teams combine metrics from many disparate monitoring solutions into a single console, perhaps with time-based correlation if not a programmed analysis of cause and effect. We might call such a system a manager of managers (MOM), or business service management (BSM). Some still serve us well, likely aided by a continual regimen of care and feeding; still more have faded from existence. But we have learned an important lesson along the way—namely, EUE measurements are critical for IT efficiency for many reasons, such as:
- Knowing when there is a problem that affects users
- Prioritizing responses to problems on the basis of business impact
- Avoiding chasing problems that don’t exist, or deprioritizing those that don’t affect users
- Troubleshooting with a problem definition that matches performance metrics
- Knowing when (or if) you’ve actually resolved a problem
Complexity Drives APM Evolution
Performance-monitoring capabilities continue to mature, evolving from real-time monitoring and historical reporting to more sophisticated fault-domain isolation and root-cause analysis, applying trending or more-sophisticated analytics to predict, prevent or even take action to correct problems.
One of the compelling drivers is the increasing complexity—of data center networks, application-delivery chains and application architectures. And with this complexity comes an increasing volume of monitoring data stressing, even threatening, current approaches to operational performance monitoring. It’s basically a big-data problem. And in response, IT operations analytics (ITOA) solutions are coming to market as an approach to derive insights into IT system behaviors—including but not limited to performance—by analyzing generally large volumes of data from multiple sources. The ITOA market insights from Gartner tell an interesting story: spending doubled from 2013 to 2014 to reach $1.6 billion, while estimates suggest that only about 10% of enterprises currently use ITOA solutions. That’s a lot of room for growth!
Everything Old Is New Again
The value of ITOA goes beyond our focus here on incident and problem management—for example, it can offer important value for change and configuration management. But let’s focus on the use of ITOA for performance management; data sources could include system logs, topology information, performance metrics, events and so on from servers, agents and probes. The information is stored, indexed and analyzed to accomplish important goals such as identifying trends, detecting anomalies, isolating fault domains, determining root cause and predicting behavior. Is it starting to sound familiar? The resemblance to earlier MOM-like efforts to combine disparate monitoring data is striking. That’s not to downplay the many capabilities and analytic promises that ITOA makes, such as machine learning, that should give it stronger legs; it’s simply to point out an obvious similarity. And in fact, ITOA is often talked about as the future of APM.
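As one concrete illustration of the anomaly-detection goal, an ITOA pipeline might baseline a response-time metric and flag points that deviate far from that baseline. This z-score sketch is a simplified assumption of how such a check could work; the metric values and threshold are hypothetical, and production ITOA systems apply far richer analytics:

```python
from statistics import mean, stdev

# Hypothetical response-time metric (seconds) pulled from a
# monitoring data store; the last point is an obvious outlier.
metric = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0, 4.5]

baseline = metric[:-1]          # trailing window serves as the baseline
mu, sigma = mean(baseline), stdev(baseline)

def is_anomaly(value, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    return abs(value - mu) / sigma > threshold

print(is_anomaly(metric[-1]))   # the 4.5 s sample stands out
print(is_anomaly(1.05))         # a normal sample does not
```

Even a detector this simple shows why the choice of input metric matters: run it against an EUE metric and it flags user-facing pain; run it against a component metric alone and it may flag noise no user ever feels.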
But Didn’t We Learn a Lesson?
When we consider application performance, even the most considered ITOA implementations will come up short if they lack end-user experience metrics. IT efficiency and business alignment, both critical for effective service orientation, require the context of the end user's experience; to ignore it is to skip a big step toward maturity.
Delivering a Superior End-User Experience
As a competitive provider of application services, how well does your organization understand end-user experience? As we forge ahead to take on the evolving data center and the growing complexity of network and application architectures, the key is to put performance metrics in the context of the end-user experience.
By prioritizing the problems that are directly affecting users, IT teams can help equip their organizations with the insights needed to deliver superior application performance to their end users. With the right solutions, teams can have a unified view of their service quality to enable effective collaboration across the broader organization.
For additional perspectives on the importance of end-user experience monitoring and the risks of ignoring it, download the e-book What Component-Centric Monitoring Won't Tell You.
About the Author
Gary Kaiser is a subject-matter expert in network performance analytics at Dynatrace and is responsible for DC RUM’s technical marketing programs. He is a co-inventor of multiple performance-analysis features and continues to champion the value of network performance analytics. He is the author of Network Application Performance Analysis (WalrusInk, 2014).