The average corporate turnover rate for employees is 15.1 percent across all industries, with some verticals running as high as 30 percent. For an organization with 10,000 employees, this amounts to roughly 1,500 to 3,000 departures annually (Compensation Force: 2013 Turnover Rates by Industry).
When an employee leaves an organization, the IT department will typically wipe or recycle their hard drive, which contains their digital files and email. However, IT often neglects to clean up and manage the former employee’s data on corporate networks and servers.
In this scenario, a company of 10,000 with a conservative annual turnover of 1,500 employees could easily see 60 TB of data abandoned in the data center each year, or about 40 GB per departing employee. Over 10 years this grows to 600 TB, more than half a petabyte.
Abandoned data is unstructured files, email and other data owned by ex-employees that languishes on networks and servers. Gartner estimates that the 2013 average Annual Storage Cost per Raw TB of capacity is $3,212 (Gartner: IT Key Metrics Data 2014: Key Infrastructure Measures: Storage Analysis: Current Year, Dec. 2013). At that rate, each 60 TB of abandoned data costs roughly $193,000 a year to store, and as the volume accumulates this can add up to millions of dollars in wasted expense.
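The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes the 60 TB per year and $3,212 per raw TB figures from the text, treats 1 TB as 1,000 GB, assumes no abandoned data is ever deleted, and applies the raw-capacity cost directly to the abandoned volume. The function and constant names are hypothetical.

```python
# Figures taken from the text; everything else is an assumption for illustration.
EMPLOYEES = 10_000
TURNOVER_RATE = 0.15          # conservative annual turnover (15.1 percent, rounded)
ABANDONED_TB_PER_YEAR = 60    # abandoned data created each year
COST_PER_RAW_TB = 3_212       # Gartner 2013 annual storage cost per raw TB, in USD


def abandoned_data_summary(years: int) -> dict:
    """Estimate abandoned volume and its yearly storage cost after `years`."""
    leavers_per_year = round(EMPLOYEES * TURNOVER_RATE)
    # Assumes 1 TB = 1,000 GB.
    gb_per_leaver = ABANDONED_TB_PER_YEAR * 1000 / leavers_per_year
    # Assumes nothing is cleaned up, so the volume grows linearly.
    total_tb = ABANDONED_TB_PER_YEAR * years
    annual_cost_usd = total_tb * COST_PER_RAW_TB
    return {
        "leavers_per_year": leavers_per_year,
        "gb_per_leaver": gb_per_leaver,
        "total_tb": total_tb,
        "annual_cost_usd": annual_cost_usd,
    }


if __name__ == "__main__":
    for years in (1, 10):
        print(years, abandoned_data_summary(years))
```

Under these assumptions, one year of abandoned data costs about $192,720 a year to keep, and after ten years of accumulation the 600 TB on hand costs about $1.9 million a year.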
Abandoned data consists largely of old working documents that have long outlived their business value: revisions of letters, old spreadsheets, presentations and aged email. However, a small percentage of this content can easily contain sensitive files and email. It is this small percentage of contracts, confidential email exchanges, client records and other similar documents that adds a level of risk and liability for the corporation.
The bulk of the data is typically what is known as redundant, outdated and trivial content, or ROT, that is simply taking up space and resulting in unnecessary management and data center costs.
The following are factors you will need to take into account in order to understand the cost impact of abandoned data:
Risk and Liability
The number one expense associated with abandoned data is the legal exposure created by leaving it unmanaged. The risk and liability inherent in sensitive data, including client records, personally identifiable information (PII), or records required for eDiscovery or compliance, can cost a company millions, along with unwanted negative press and exposure.
Managing sensitive records is always a challenge. However, managing this content when the owner of the data is no longer an employee and no one knows it exists is an even more complex challenge. Think of the CEO’s former admin creating a PST archive of their email and storing it on some obscure server. It is difficult to put a value on this exposure, but it is something that should be keeping your legal and compliance teams up at night.
In the example above, 60 TB of abandoned data can accumulate on corporate servers each year for a company of 10,000 employees. While this data is cluttering the data center, organizations are increasing their storage capacity at a rate of 40-60 percent annually. Most abandoned data could disappear tomorrow and no one would miss it, so cleaning it up and reclaiming the capacity is equivalent to getting free storage. Since most IT budgets are decreasing, this is an easy way to make every dollar count.
Backup and Disaster Recovery
One of the hidden costs of not managing and controlling abandoned data is in corporate disaster recovery. The cost and resources required to ensure all data is backed up and protected are among the more expensive line items on an IT budget.
Compressed backup windows, offsite storage costs and management of backup content all contribute to ever-growing data center resources. With abandoned data accounting for tens, even hundreds of terabytes, it has become a significant component of the expenses associated with disaster recovery. Assuming that, conservatively, 15 percent of the data that is backed up loses its business value each year and should be moved offline or even remediated, this can easily reduce disaster recovery costs by up to 50 percent on a server over five years old.
Data is constantly migrated to new platforms, or consolidated in order to streamline operations. Migrating and consolidating data is a constant and painful operation. It becomes even more painful when you know that much of the data no longer has value. If 30-50 percent of the data being migrated from a five-year-old storage platform to a new storage platform, or even the cloud, is owned by ex-employees, much of this effort is wasted.
Beyond a migration of data, day-to-day management of servers is a key task in any corporate data center. Reducing the volume of data under management will have a lasting impact on budgets and resources required to support the explosive growth of unstructured user data.
Data profiling, also known as file analysis, uncovers abandoned data so it can be managed. Understanding what abandoned data exists is the first step in defining a data policy that can reclaim wasted expense and control long-term risk and liability of this unknown and unmanaged content.
In “Market Guide for File Analysis Software”, published September 23, 2014, Gartner recommends profiling data to gain a better understanding of the unstructured data environment and ROT including abandoned data, stating:
“Data visualization maps created by file analysis can be presented to other parts of the organization and be used to better identify the value and risk of the data, enabling IT, line of business, compliance, etc., to make more-informed decisions regarding classification, information governance, storage management and content migration. Once known, redundant, outdated and trivial data can be defensibly deleted, and retention policies can be applied to other data.”
Data profiling works by processing all forms of unstructured files and document types, creating a searchable index of what exists, where it is located, who owns it, when it was last accessed and, optionally, what key terms are in it.
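The indexing step described above can be sketched roughly as follows. This is a minimal illustration, not how any particular file analysis product works: it walks a directory tree with the Python standard library, records each file's path, numeric owner ID, size and last-access time, and then totals the bytes owned by a set of former-employee IDs. The names `FileRecord`, `profile_tree` and `abandoned_bytes_by_owner` are hypothetical, the list of former-employee IDs is assumed to come from elsewhere (for example an HR system), and key-term indexing is left out. Owner IDs from `st_uid` are meaningful on POSIX systems; on Windows they are typically 0, and last-access times may be unreliable on volumes mounted with access-time updates disabled.

```python
import os
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FileRecord:
    path: str             # where the file is located
    owner_uid: int        # who owns it (numeric user ID)
    size_bytes: int       # how much space it takes
    last_accessed: float  # last access time, in seconds since the epoch


def profile_tree(root: str) -> list:
    """Walk `root` and build a simple index of every regular file entry."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # skip files that vanish or cannot be read
            records.append(FileRecord(path, st.st_uid, st.st_size, st.st_atime))
    return records


def abandoned_bytes_by_owner(records, former_uids) -> dict:
    """Total bytes per owner, restricted to owners in `former_uids`."""
    totals = defaultdict(int)
    for rec in records:
        if rec.owner_uid in former_uids:
            totals[rec.owner_uid] += rec.size_bytes
    return dict(totals)
```

A caller might run `profile_tree` against a file share and pass the result to `abandoned_bytes_by_owner` along with the IDs of departed staff to get a first estimate of how much capacity abandoned data is consuming.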
High-level summary reports give instant insight into enterprise storage, providing previously unavailable knowledge of data assets. Through this process, mystery data can be managed and classified, including content that has outlived its business value or that is owned by ex-employees and is now abandoned on the network.
This simple and analyst-recommended process helps organizations reclaim up to 40% of active data capacity and mitigates legal and compliance risks associated with unmanaged data.