The era of big data is upon us. IBM estimates that 90% of the data in the world today was created in the last two years alone. By 2020, the International Data Corporation (IDC) predicts that the amount of digital information created and replicated in the world will grow to almost 40 zettabytes (ZB)—more than 50 times what existed in 2010 and amounting to 5,247 gigabytes for every person on the planet (see Figure 1). Much of this burgeoning information resides in data centers managed either by businesses or by the many external storage providers offering cloud-computing environments.
Chances are, if you pick up or log on to any business periodical today, you will find an article about using information in the big data evolution—how to process it, how to analyze it, how to act on it and even how to establish trust in it. But data centers also must focus on how to dispose of big data once it has fulfilled its useful life. A growing number of industry standards and government regulations must be met, and new drivers like customer demands for sustainability are putting a premium on data erasure so hardware is “sanitized” before it is reused or resold.
With the average cost per compromised record in a data breach at $194, the stakes are high, especially when considering damages to reputation or potential fines for companies that must comply with regulations like the Health Insurance Portability and Accountability Act (HIPAA) and industry standards like the Payment Card Industry Data Security Standard (PCI DSS).
Advantages of Advanced Data Erasure
To ensure complete data security or to procure work in regulated and audited industries such as health care, finance and retail, data centers should thoroughly sanitize data from hardware slated for reuse, resale or disposal as a best practice. This good-housekeeping practice should extend to targeted data on active systems in some cases. What other critical issues must data centers consider? How do you ensure your data destruction plan is on target—not just secure and compliant but also efficient in its use of your resources and time? How can you be confident that your data erasure processes are thorough enough to meet the toughest requirements, with nothing “falling through the cracks”?
First, a definition: data erasure sanitizes hardware using a software- or firmware-based process that overwrites the data with a pattern of ones and zeros, so the data can no longer be retrieved. But not all erasure tools are created equal. More data centers today are turning to advanced data erasure software as a best practice for three key reasons:
The advanced solutions not only automate the data erasure process, but they also work in a variety of common mass-storage hardware and configurations. They can target specific files or logical units (LUNs) for erasure in an active working environment without causing any downtime.
Advanced data erasure technology is certified to all major international erasure standards, protecting sensitive customer data while enabling compliance with standards and regulations. Advanced erasure tools available today support compliance with industry regulations like PCI DSS and HIPAA, as well as technical erasure standards like those from the U.S. Department of Defense (DoD) and others.
Advanced data erasure tools automate and centralize housekeeping tasks with minimal impact on data center staff productivity, including the automatic production of auditable erasure reports for regulatory purposes.
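To make the definition above concrete, here is a minimal Python sketch of overwriting a single file in place with fixed bit patterns. It is an illustration only, not how any particular erasure product works: the function name `overwrite_file`, the two single-byte patterns and the chunk size are assumptions chosen for the example. Note also that overwriting through the file system does not guarantee that every physical copy is gone on SSDs or on journaling and copy-on-write file systems, which is one reason dedicated tools work at a lower level.

```python
import os


def overwrite_file(path, patterns=(b"\x00", b"\xff"), chunk_size=64 * 1024):
    """Overwrite every byte of a file in place, once per pattern.

    Each entry in `patterns` must be a single byte (an assumption of this
    sketch). Returns the number of bytes covered by each pass.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for pattern in patterns:
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(pattern * n)  # n copies of the one-byte pattern
                remaining -= n
            # Push this pass out of Python and OS buffers before the next one.
            f.flush()
            os.fsync(f.fileno())
    return size
```

After the call, the file keeps its original length but holds only the last pattern written.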
Meeting the Challenges of “Big Data” Center Erasure
A key aspect of compliance is the auditable erasure report, which documents thorough removal of data at critical transaction points that indicate “data end-of-life.” Data end-of-life can occur because of hardware reassignment or resale, disaster or backup recovery tests, facility relocation, planned data migration, or removal of customer data at the end of a contract, for example.
The erasure report should provide specific hardware details, including serial number, number of server drives, size and speed, as well as information about the erasure process, such as how long it took and who performed it. These reports are especially critical to proving that data was removed from equipment slated for retirement or transfer.
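As a rough illustration of the report contents described above, the following Python sketch models an erasure report as a small data structure that can be serialized for auditors. The class names (`DriveRecord`, `ErasureReport`), the field names and the JSON layout are all hypothetical; commercial tools define their own report formats.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class DriveRecord:
    # Hardware details for one erased drive (field names are assumptions).
    serial_number: str
    size_gb: int
    speed_rpm: int


@dataclass
class ErasureReport:
    server_serial: str
    operator: str            # who performed the erasure
    duration_seconds: float  # how long the erasure took
    drives: list = field(default_factory=list)
    completed_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    @property
    def drive_count(self):
        return len(self.drives)

    def to_json(self):
        """Serialize the report, including the derived drive count."""
        data = asdict(self)
        data["drive_count"] = self.drive_count
        return json.dumps(data, indent=2, sort_keys=True)
```

A report built this way for a server with two drives would carry both drive records plus `drive_count` set to 2, along with the operator and duration fields.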
Advanced data erasure tools on the market today provide these thorough reports. In addition, because they take a five-tier approach to data erasure, they offer “blanket coverage” for the most common varieties of mass storage hardware and configurations. Here are examples of how these five tiers—for files, LUNs, disks, servers and storage arrays—work to meet today’s challenges in the typical data center environment.
Because advanced data erasure tools can effectively destroy individual files on a time- or event-driven basis, they are ideal for compliance with data security standards like PCI DSS that require deletion of file-level data at specific intervals—all while the host system remains active. The tools also erase files flagged by users or systems administrators, who select what rules and storage areas apply from a central interface. No temporary files are left behind as a source of potential data loss. The solutions can be monitored as they do their work, and all file destruction operations are logged.
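The time-driven case described above can be sketched as a simple retention rule: find every file under a storage area whose last modification is older than a set interval, then hand that list to the erasure step. The function name `files_past_retention` and the use of modification time as the trigger are assumptions for this example; a real policy engine would apply whatever rules the administrator configures.

```python
import os
import time


def files_past_retention(root, max_age_days, now=None):
    """Return a sorted list of files under `root` older than `max_age_days`.

    Age is measured from the file's last-modified time (an assumption of
    this sketch). `now` may be passed in to make the check repeatable.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    expired = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                expired.append(path)
    return sorted(expired)
```

Running the function on a schedule and logging each returned path before erasure would give the kind of record the paragraph above describes.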
Of course, data centers with high availability requirements save multiple copies of the same data file for redundancy purposes. Administrators need a centralized way to remotely erase targeted and duplicate files, as well as folders on servers and in storage areas across the network. Advanced data erasure tools safely and securely provide this centralization.
In today’s cloud-computing environment, data centers need secure, cost-effective options for reusing enterprise storage system configurations without rebuilding them. To achieve this safely, administrators need a centralized tool that can erase logical drives like LUNs in an active storage environment where the host computer cannot be rebooted. The tool should be able to support remote management of LUN erasure from the application server and should support simultaneous shredding of multiple units.
Advanced data erasure tools work well to meet end-of-hosting subscription requirements, when erasure is necessary for LUN reuse in a hosted environment because a current customer leaves and a new user is assigned to an existing LUN. In addition, the tools can help support sound practices after disaster recovery tests, when multiple copies of LUN data exist and must be erased for security reasons, and also after backup recovery tests to make sure the hardware is secure for the next client.
For convenience, advanced data erasure tools offer LUN versions that support simultaneous data destruction on multiple units by starting parallel instances of the software. The software can erase any unit (physical or logical) that a Linux, Unix, or Windows system can detect by overwriting the entire writeable area, sector by sector, on the logical disk or drive.
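The idea of erasing several units at once, each overwritten sector by sector, can be sketched in Python as follows. This is a simplified model under stated assumptions: ordinary files stand in for logical units, a 512-byte sector is assumed, and a thread pool stands in for the parallel software instances mentioned above. The names `erase_unit` and `erase_units_in_parallel` are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor

SECTOR_SIZE = 512  # assumed sector size for this sketch


def erase_unit(path, sector_size=SECTOR_SIZE):
    """Overwrite one file-backed unit with zeros, one sector at a time."""
    size = os.path.getsize(path)
    zero_sector = bytes(sector_size)
    written = 0
    with open(path, "r+b") as unit:
        while written < size:
            n = min(sector_size, size - written)
            unit.write(zero_sector[:n])
            written += n
        unit.flush()
        os.fsync(unit.fileno())
    return path, written


def erase_units_in_parallel(paths, max_workers=4):
    """Erase several units concurrently; returns {path: bytes_written}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(erase_unit, paths))
```

The returned mapping of unit to bytes written is the kind of per-unit result that could feed an erasure report.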
Disk-level erasure is necessary for sanitizing hard disks outside the original host (Figure 2), as with loose drives from storage-area network (SAN) servers. Many of these are return-material-authorization (RMA) drives that need erasure before returning to the original equipment manufacturer (OEM) under warranty. Because of handling requirements and chain-of-custody concerns, local erasure of disks is necessary, and advanced data erasure tools support it.
Figure 2: Advanced data erasure tools centralize erasure of loose hard disks outside the original host.
Similar to full array erasure, erasing loose drives requires an external host/boot device and the correct connectivity between the host and the drives to be erased. Once erasure is in progress, an erasure tool should support monitoring and final erasure reporting across the network, when network connectivity can be used.
Advanced data erasure tools also perform well in other individual disk erasure scenarios. In replacing RMA warranty drives, on-site erasure of “failed” disks removes the disk content so that the drive can be transported risk-free to the OEM for warranty replacement, avoiding costly disk retention fees.
If secure end-of-life erasure processes were not used in the past, advanced data erasure tools can help a data center erase a backlog of drives with sensitive data. They can also help data center asset managers avoid headaches by erasing unsecured data on loose drives in cases where swapping—using loose drives as replacements—is a common process to expedite retirement of a server.
Full server erasure involves erasing all connected drives, internal or external, using the server itself as the erasure platform. Advanced data erasure tools perform server-level erasure either locally or remotely. For example, remote erasure is easily implemented using a virtual CD drive for servers with iLO/IPMI/DRAC capabilities.
For complete security, data centers need erasure tools that detect hidden and remapped sectors during the server erasure process, flagging those that cannot be erased. Depending on policy and risk tolerance, data centers may refurbish or resell servers with only a few bad sectors, or even choose physical destruction. Either way, data erasure must occur before a server leaves the premises.
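The flagging step described above can be illustrated with a simple read-back check: after an overwrite pass, scan the unit block by block and record every block that does not contain the expected pattern. This sketch covers only that verification idea, using an ordinary file as a stand-in. Hidden and remapped sectors are not visible through ordinary reads, so detecting them requires drive-level commands that are outside the scope of this example. The function name `find_unerased_blocks` and the 512-byte block size are assumptions.

```python
def find_unerased_blocks(path, expected_byte=0, block_size=512):
    """Return the indexes of blocks that do not hold the expected pattern."""
    expected = bytes([expected_byte]) * block_size
    flagged = []
    with open(path, "rb") as f:
        index = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            # The final block may be shorter than block_size.
            if block != expected[:len(block)]:
                flagged.append(index)
            index += 1
    return flagged
```

An empty result means every readable block matched the pattern; any flagged indexes would be listed in the erasure report so that policy can decide between reuse and physical destruction.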
Example scenarios for erasing entire servers include the end of a hardware refresh cycle, when data centers must securely erase all information on servers to comply with regulations and protect customers. This approach allows resale and recycling of healthy disks, while creating a green data center environment and profit streams. Erasure is also necessary for server reuse in a hosted environment when an existing customer terminates hosting services.
In addition, data centers frequently move or expand, requiring relocation of servers that, if not securely erased, could result in data loss during transport. As with disk-level erasure, advanced data erasure tools are available to perform all these functions. They can erase x86 and x64 servers, as well as RAID and non-RAID servers.
It is important to note that today’s computing environments build on virtual machines to a large extent, supported by VMware and Microsoft. In addition to erasing physical machines, the ability to erase stored data associated with virtual machines in these environments is an important feature of advanced data erasure solutions at the server level.
Data centers work with a broad range of complex storage configurations that can yield revenue upon retirement. Decommissioned SAN disks and other mass storage devices can potentially be sold if data is securely removed. To eliminate the need for multiple erasure products, data centers with high-end server and SAN environments need a tool that erases a broad range of hardware, such as Serial ATA, SAS, SCSI, and Fibre Channel disks. Owing to the scale of data centers, simultaneous erasure of multiple disks is a necessity. Advanced data erasure software offers 100% secure data destruction for high-end storage arrays.
It can prove valuable in many data center scenarios, including at the end of a hardware refresh cycle, when data must be erased before transporting storage systems back to the leasing company. Keeping the drives is cost prohibitive, as is physical destruction, because of heavy lease-settlement fees if equipment is retained. The data center, not the OEM, owns the data and is responsible for its erasure to prevent data leaks. Advanced data erasure tools also perform in cases where data centers move or expand, requiring relocation of storage systems that, if not securely erased, could result in data loss or breach during transport.
Capable of simultaneous erasure of 200+ hard drives, data center versions of advanced data erasure software provide fast data erasure for ATA, SATA, SCSI, Fibre Channel and SAS hard drives, and they can function with or without the LAN. The software can erase hard drives with 512-, 520-, 524- and 528-byte sector sizes, and it provides extensive SCSI and Fibre Channel host-bus-adapter support for connecting external hard-drive enclosures for direct erasure.
Reliable Erasure Tools for Big Data Environments
As the scenarios show, data centers are complex hardware environments with a variety of data erasure needs that will only grow more complex in the era of big data. Granted, big data offers exciting opportunities for every data center. But there are risks, as well. Savvy asset managers understand those risks and mitigate them using solutions and processes for sound, secure and compliant data erasure.
About the Author
Markku Willgren is president, US operations, for Blancco, a global leader in data erasure and computer reuse solutions.
Leading article photo courtesy of Christian Jansky
 IBM, retrieved from http://www-01.ibm.com/software/data/bigdata/, April 7, 2013.