By Adam Gallagher, Media Analyst
The intelligence community hopes to start using today’s tweets to predict tomorrow’s threats. A recent NPR story drew attention to the efforts of private companies and government agencies to analyze social media content, and to what happens when that analysis relies too heavily on automation.
The intelligence field is adapting to a world changed by social media, a world in which Big Data is even bigger thanks to the user-generated content social media enables. In this ever-growing mountain of information, intelligence leaders see an opportunity to detect patterns that could reveal life-saving information. By analyzing trends in the data, some believe events like the Arab Spring can be anticipated. Indeed, the NPR story reported that the 2011 Yemeni Revolution was foreseen through social media analysis.
However, the analysis remains imperfect. One problem the intelligence community has encountered is common to all media analysts: the algorithms designed to sift through the data lack the sophistication to determine its context, nuance and depth. Human researchers, like the ones we use here at CARMA, can solve this problem, using their judgment to discern exactly what a tweet, post or article is conveying. Methods that exclude the human element risk losing the accurate insights social media analysis can bring.
For instance, the NPR article described how one company predicted that the attack on the U.S. consulate in Libya would be traced back to a group in Yemen. We now know the prediction was wrong, although the underlying data was sound. The problem was that the Yemeni group shares its name with a group in Libya, and the algorithm, unable to tell them apart, lumped the two together.
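This name-collision failure can be illustrated with a minimal, hypothetical sketch. All group names, posts and fields below are invented for illustration; this is not the actual system NPR described, only the general shape of the error:

```python
# Hypothetical posts mentioning two DIFFERENT groups that share one name.
posts = [
    {"text": "Rally held by Ansar Group in Sanaa today", "country": "Yemen"},
    {"text": "Ansar Group spokesman speaks in Benghazi", "country": "Libya"},
]

def attribute_by_name(posts, name):
    """Naive matching: every mention of the name is treated as one group."""
    return [p for p in posts if name in p["text"]]

def attribute_with_context(posts, name, country):
    """Context-aware matching: require a location cue, as a human reader would."""
    return [p for p in posts if name in p["text"] and p["country"] == country]

# The naive matcher lumps both groups together...
print(len(attribute_by_name(posts, "Ansar Group")))                # 2
# ...while adding context separates them.
print(len(attribute_with_context(posts, "Ansar Group", "Libya")))  # 1
```

Real systems use far richer signals than a single country field, but the underlying point is the same: a name alone is not an identity, and an algorithm that matches on names alone will merge distinct actors.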
This instance is only the most obvious example of the limits of automated research. Others include the inability to recognize tone, sarcasm or metaphor. And as anyone who has ever read a heated exchange on social media knows, sarcasm is prevalent, tone is nearly omnipresent, and a cap of 140 characters leaves far more meant than written. Analyzing tweets strictly by what is written, rather than by what is meant, can lead to highly inaccurate results.
These limitations are why we believe so strongly in human-based research. A human researcher can distinguish the two groups from the context of the tweet; the president of the company in NPR’s article admitted as much and assured the reporter that a human element will be involved in the process. By using human researchers to evaluate content and sophisticated technology to aid efficiency and quality, CARMA is able to provide accurate insights into important data. It is a new and exciting time for media analysts, but human evaluation proves itself, time and again, a valuable means of ensuring high-quality analysis.