Data is growing at an exponential rate. As one expert recently put it, its growth is like placing a single grain of rice on the first square of a chessboard, two grains on the second square, and so on. By the 64th and final square, the board would hold a mound of rice equal to a thousand times the annual rice production of the entire world.
As companies in every sector, from health care and finance to retail and government, tackle the problem of how best to monetize this mountain of data, vendors are designing and delivering applications that make it ever easier to tap into big data and put it to use across a spectrum of departments and business functions. The result is a rapid scaling of big data usage across the economy.
Yet as big data apps connect with data silos throughout the network environment, it’s clear that the security measures meant to protect companies’ most sensitive information are not receiving the same priority. To ensure compliance and cut the risk of theft by hackers, companies should take steps to protect the integrity of their data assets as they move to take advantage of big data.
To derive analytics from big data, large data sets are divided into smaller pieces, processed individually across a Hadoop cluster and then regrouped to produce usable information. The process is almost entirely automated, requiring a great deal of machine-to-machine (M2M) communication across the network.
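The split-process-regroup pattern described above can be sketched in miniature. The example below is an illustrative word count (the classic MapReduce exercise) rather than actual Hadoop code; a real job would distribute the map and reduce phases across the cluster.

```python
# Sketch of the split/process/regroup pattern, in miniature.
# A real Hadoop job runs these phases in parallel across a cluster.
from collections import Counter

def map_phase(chunk: str) -> Counter:
    """Process one piece of the data set independently."""
    return Counter(chunk.split())

def reduce_phase(partials) -> Counter:
    """Regroup the per-piece results into usable information."""
    total = Counter()
    for partial in partials:
        total += partial
    return total

chunks = ["big data big", "data value", "big value"]   # split the data set
partials = [map_phase(c) for c in chunks]              # process individually
result = reduce_phase(partials)                        # regroup
print(result["big"])  # → 3
```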
Several levels of authorization are required in a Hadoop infrastructure; specifically:
- Access to the Hadoop cluster
- Intercluster communications
- Cluster access to the data sources
These authorizations are often based on Secure Shell keys, which are well suited to Hadoop because of their strong security and support for automated M2M communications. Many popular cloud-based Hadoop services also use Secure Shell to authenticate access to the Hadoop cluster. Securing the identities that grant access to the big data environment should be a high priority, but it can be challenging. Anyone looking to use big data analytics should consider the following questions:
- Whose responsibility is it to set up the authorizations for big data analytics?
- How are these authorizations managed correctly?
- Who can access these specific authorizations?
- What happens to the authorizations if the initial creator leaves the organization?
- Does the “need to know” security rule directly affect the level of access the authorizations provide?
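One way the “need to know” rule translates into practice is by restricting what a Secure Shell key may do. The Python sketch below composes a locked-down authorized_keys entry for an automated M2M account; the key material, host address and command path are placeholder assumptions, not values from the article.

```python
# Sketch: composing a least-privilege authorized_keys entry for an
# automated M2M account. Key, host and command are placeholders.

def restricted_entry(pubkey: str, source_host: str, command: str) -> str:
    """Return an authorized_keys line that pins the key to one source
    host and one forced command, applying the 'need to know' rule."""
    options = ",".join([
        f'from="{source_host}"',   # only this host may use the key
        f'command="{command}"',    # only this command may run
        "no-port-forwarding",
        "no-agent-forwarding",
        "no-pty",
    ])
    return f"{options} {pubkey}"

entry = restricted_entry(
    "ssh-ed25519 AAAAC3Nza...example etl@node01",  # placeholder key
    "10.0.5.17",
    "/opt/etl/ingest.sh",
)
print(entry)
```

The resulting line grants exactly one host the right to run exactly one job, rather than an open interactive login.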
These questions aren’t unique to big data; in fact, they grow more important as automated business processes proliferate across data centers. Automated M2M transactions account for more than 80 percent of all data center communications, yet most administrators focus on the 20 percent associated with human accounts. Industries that rely heavily on data, such as financial and cloud-based services, typically have a four-to-one ratio of machine-based identities to interactive human ones. So why is this larger set of identities being ignored? With the volume of big data only rising, a heightened sense of urgency is clearly warranted in managing M2M identities.
Business Risks Abound
The risks of ignoring M2M identities are very real, and mismanagement of these authorizations can lead to severe data breaches. Although securely managing end-user identities has seen significant progress, machine-based ones are sorely neglected, resulting in a far-reaching attack vector in the IT environment.
This neglect can be attributed partially to the challenges of implementing change on a running system. Bringing central identity and access management to thousands, if not millions, of machine-based identities is a complicated endeavor. As the risk of inaction continues to rise, new tools and processes to combat this risk are needed.
Currently, IT administrators track the authentication keys used to secure M2M transactions by hand. Antiquated methods such as spreadsheets or homegrown scripts are popular choices for monitoring, distributing and inventorying keys. As with all manual management, this approach is prone to human error, and many deployed keys fall through the cracks. It also usually lacks regular scanning, so backdoors can be inserted without the system administrator’s knowledge.
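To make this concrete, here is a minimal Python sketch of the kind of homegrown inventory script described above: it parses authorized_keys content and lists the keys found. The sample entries are illustrative; a real deployment would have to gather and reconcile these files from thousands of hosts, which is where the manual approach breaks down.

```python
# Sketch: a minimal key inventory pass over authorized_keys content,
# the kind of homegrown script the article describes. Sample data is
# illustrative only.

def inventory(authorized_keys_text: str):
    """Return (key_type, comment) for each key entry, skipping blank
    lines, comments and any option prefixes before the key type."""
    keys = []
    for line in authorized_keys_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Options (e.g. from="...") may precede the key type; find it.
        for i, field in enumerate(fields):
            if field.startswith(("ssh-", "ecdsa-")):
                comment = " ".join(fields[i + 2:]) or "(no comment)"
                keys.append((field, comment))
                break
    return keys

sample = """# backup job
ssh-ed25519 AAAAC3Nza...one backup@host
from="10.0.0.2" ssh-rsa AAAAB3Nza...two etl@host
"""
for key_type, comment in inventory(sample):
    print(key_type, comment)
```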
Compliance Means Business
Compliance standards across a variety of industries mandate central control over these authentication keys. Failing to maintain that control can expose the organization to hefty fines and reputation damage should noncompliance be identified. For example, the recently enhanced PCI standards require that any organization that accepts payment cards maintain strict control over who has access to sensitive financial information. This requirement affects banks, restaurants, retail and health care, and many verticals are moving quickly to strengthen their security posture and minimize the risk of a breach or noncompliance.
Steps to Heighten M2M Network Security
To combat these risks, the following steps are recommended best practices:
- Discover: Data center administrators, security teams or identity- and access-management staff rarely have visibility into where identities are stored, what information those identities permit access to and which business processes they support. An important first step of the remediation process is passive, non-invasive discovery.
- Remediate: With visibility and control established, identities that violate policy can be updated without disrupting ongoing business processes. For example, a machine identity could be supporting an active process but carry a higher level of privilege than it needs. Using central management, the privilege level assigned to that identity can be remediated.
- Monitor: Constant monitoring of the network environment is needed to determine which identities are in active use and which are associated with inactive users or processes. The upside is that in many enterprises, unused (and therefore unneeded) identities often comprise the majority. Once these unused identities are located and removed, the scope of the process is significantly reduced.
- Manage: Central control over adding, changing and removing machine identities should be implemented. This approach enables policy-based governance over how the identities are used, ensures no more unmanaged identities can be added and provides verifiable proof of compliance.
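As a small illustration of the remediate step above, the Python sketch below flags authorized_keys entries that lack a source restriction (a `from=` option) as candidates for follow-up. The policy rule and sample entries are illustrative assumptions; a real policy engine would also check privilege levels, forced commands, key age and usage.

```python
# Sketch: a simple policy check over authorized_keys entries, flagging
# keys with no source restriction for remediation. The rule and sample
# entries are illustrative assumptions.

def needs_remediation(entry: str) -> bool:
    """True if the entry has no from="..." source restriction."""
    first = entry.split()[0]
    if first.startswith(("ssh-", "ecdsa-")):
        return True  # bare key: no restrictions at all
    # Otherwise the first field is a comma-separated option list.
    return not any(opt.startswith("from=") for opt in first.split(","))

entries = [
    'ssh-ed25519 AAA backup@host',                    # unrestricted
    'from="10.0.0.2",no-pty ssh-rsa BBB etl@host',    # pinned to a host
    'no-pty ssh-rsa CCC job@host',                    # restricted, but mobile
]
for e in entries:
    if needs_remediation(e):
        print("remediate:", e.split()[-1])
```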
Of course, Rome wasn’t built in a day, and complete security cannot be achieved all at once either. The tools and processes that apply these best practices every day provide proactive risk mitigation and ongoing compliance.
Big data has opened up endless possibilities when it comes to deriving value from information, yet as is the case with all new technologies, there comes a need for an approach to security that keeps pace with current threats. Automating the management of M2M identities can result in significant business gains, such as time and cost savings. In addition to the immediate benefits, organizations also realize long-term ones in compliance and enhanced corporate reputation. Organizations looking to take advantage of all that big data is capable of, while simultaneously ensuring the highest level of security, should follow the best practices outlined above.
About the Author
Jonathan Lewis is director of product marketing for SSH Communications Security, where he is responsible for communicating the value and importance of effective Secure Shell access governance. Jonathan has diverse experience in the network and security industry including technical and business management roles at companies ranging from startups to global enterprises. His technology expertise includes VPNs, firewalls, SSL, SSH and DDoS mitigation. Jonathan holds BS and MS degrees from McGill University and an MBA from Bentley College.