The NSA scandal has brought metadata into the public eye. Suspicion has abounded that metadata, particularly when combined with big data analytics, provides an especially invasive window into the lives of individuals. A recent study by researchers at Stanford has provided more evidence supporting these suspicions. But the implications go beyond government surveillance; they also touch companies that wish to protect their customers.
NSA Spying Brings Fame to Metadata
As soon as Edward Snowden’s revelations about government spying began leaking, politicians began claiming that metadata is innocuous information that poses little or no threat to the privacy of innocent citizens. President Barack Obama rather famously said that “nobody is listening to your telephone calls.” He added that “the intelligence community is…looking at phone numbers and durations of calls. They are not looking at people’s names, and they’re not looking at content.” Others in the government likewise downplayed the matter, implying that because they didn’t hear the content of your call to a known bookmaker, they don’t know you’re a heavy gambler.
Even common sense, however, suggests that metadata can provide intimate details about a person’s life. Imagine, for instance, what information an IP address and a list of Google search terms might provide. An enterprising investigator could associate that IP address with a location (maybe a dwelling) and potentially identify the individual doing the searches and what that individual likes, thinks about, wants, believes and so on. Telephone metadata may be somewhat less invasive simply because it offers less information on the surface, but the potential for abuse is no less breathtaking.
You Are Your Metadata?
The researchers at Stanford collected telephone metadata from a number of volunteers using a smartphone app, which returned “device logs and social network information for analysis.” They concluded that “phone metadata is unambiguously sensitive, even in a small population and over a short time window. We were able to infer medical conditions, firearm ownership, and more, using solely phone metadata.”
In one case, the researchers noted, “In a span of three weeks, Participant D contacted a home improvement store, locksmiths, a hydroponics dealer, and a head shop.” Anyone familiar with the growing (and disturbing) use of so-called no-knock raids could easily recognize the danger of this kind of information in the hands of an overzealous law-enforcement agency and a judge too jaded to consider whether such a warrant is really justified.
Often, the identity of whoever an individual is calling is enough to roughly deduce the purpose of the call. For instance, you probably aren’t calling a lawyer for a friendly chat, or a strip joint to ask about good restaurants in town. The frequency and duration of calls also indicate, to some extent, the relationship between the participants. This is the sort of information that the government often touts indirectly as a means to identify potential terrorists.
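To make the point about frequency and duration concrete, here is a minimal sketch of how little work such a summary takes. It is not the Stanford researchers' method; the record layout, the names and the durations are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical call records: (caller, callee, duration in seconds).
# Every value here is invented for illustration.
calls = [
    ("alice", "locksmith", 120),
    ("alice", "bob", 900),
    ("alice", "bob", 1500),
    ("alice", "law_office", 300),
    ("alice", "bob", 600),
]


def summarize_contacts(records):
    """Return {(caller, callee): (call_count, total_seconds)}."""
    totals = defaultdict(lambda: [0, 0])
    for caller, callee, seconds in records:
        entry = totals[(caller, callee)]
        entry[0] += 1          # one more call between this pair
        entry[1] += seconds    # accumulate talk time
    return {pair: tuple(stats) for pair, stats in totals.items()}


summary = summarize_contacts(calls)

# Rank contacts by total talk time, longest first.
ranked = sorted(summary.items(), key=lambda item: item[1][1], reverse=True)
```

A few lines of grouping and sorting are enough to surface the closest contact in this toy data set, with no access to what was actually said.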
Combined with other publicly available information and, particularly, other forms of metadata (GPS data, web searches, sites visited, email information and so on), this kind of information can make the actual content of communications all but superfluous. And when metadata is being collected on everyone in a population, those who either avoid electronic communications or exhibit seemingly random patterns become conspicuous targets in their own right, making any form of “opting out” impossible.
Metadata and Security
Of course, the study by the Stanford researchers, which simply gives a little more evidentiary weight to what was already a reasonable conclusion, points to the dangers of metadata in the hands of an already unscrupulous government. But metadata can also be dangerous in the hands of private organizations. These organizations may not have the same nefarious purposes as governments (after all, the NSA still collects phone metadata despite admitting it’s all but useless for preventing terrorism), but if they collect it, they create a target for hackers.
Metadata analysis need not involve sophisticated software; in some cases a malicious individual could sift the information essentially by hand to find potential marks for crime. Thus, companies that maintain metadata repositories must beware of more than just well-funded groups. Securing metadata should take a priority at least as high as that given to other kinds of private customer data. Protections could include encryption, isolating databases from other parts of the network and limiting employee access.
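As one illustration of protecting stored metadata, the sketch below replaces phone numbers with keyed hashes (HMAC-SHA-256) before they are written to a repository. This is pseudonymization, a complementary measure rather than one of the protections listed above, and the key and phone numbers are hypothetical. It does not stop inferences drawn from calling patterns; it only makes a stolen table harder to tie back to real numbers without the key.

```python
import hashlib
import hmac


def pseudonymize(phone_number: str, secret_key: bytes) -> str:
    """Replace a phone number with a keyed SHA-256 digest (HMAC)."""
    digest = hmac.new(secret_key, phone_number.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key for illustration only; a real key would be generated
# randomly and kept apart from the database it protects.
key = b"example-key-not-for-production"

token = pseudonymize("+1-555-0100", key)
```

The same number always maps to the same token under a given key, so records can still be grouped and counted, while anyone holding only the table cannot simply hash candidate numbers to reverse it.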
Divorced from its wider context, metadata can be deceiving. Was that brief phone call to a suspected terrorist a signal to hatch a plot, or was it simply a “butt dial”? Is that change in calling habits the result of a psychotic episode or simply a stolen smartphone? Additional information can often clarify the situation, but in some cases gathering it means more invasion of privacy.
Inferences from metadata (or any data set) can also be inappropriate, which is the more likely difficulty that private companies will run into when analyzing data. Whether the results of the analysis are erroneous or inappropriate, however, the consequences for individuals can be devastating, particularly when governments are involved.
Metadata is sensitive; both common sense and empirical evidence point to this conclusion. In the context of government surveillance, objections by politicians that they are “only” collecting metadata are disingenuous. Not only does metadata enable reasonable inferences regarding intimate details of an individual’s life, it is also compact. Consider which would be easier: converting the content of millions of hours of conversations into text, analyzing that text for key words, combining those key words into recognizable patterns and then making inferences, or simply comparing data sets with just a few columns containing numbers. Metadata is structured, whereas content is often unstructured. If metadata provides anything near the insight of the actual content (and it does), the computational savings are tremendous. In other words, the NSA has a greater ability to examine your life by looking at your metadata than by looking at the content of your phone calls.
Reassurances from politicians regarding citizen metadata notwithstanding, companies should treat their metadata as being just as sensitive as other kinds of data, ensuring adequate protection for customers. They should also recognize the limits of metadata, particularly that it can lead to incorrect or inappropriate inferences that can harm customers and, thus, the reputation of the company.