In the 21st century, the most important weapon is no longer a missile, a tank or a fighter jet. It is information. Present-day technologies make it possible to forecast, with considerable accuracy, the probability of an attack or the outbreak of war.
This photograph (“Situation Room”) went around the world in May 2011. The room was full of American officials: serious faces, all eyes riveted on a screen invisible to the viewer. In the background sits President Barack Obama, leaning slightly forward; on his left, Vice President Joe Biden; on his right, Secretary of State Hillary Clinton, covering her mouth with her hand in a gesture of emotional strain. They are all watching live updates from an operation conducted thousands of kilometers away, in Pakistani Abbottabad, by the elite Navy SEALs: the moment they enter Osama bin Laden’s hideout and, after an exchange of fire, kill the most wanted terrorist in the world. The image is streamed live to the White House from cameras on the soldiers’ helmets.
The Americans had Osama in their sights as far back as the end of the 1990s. At that time, the Al-Qaeda leader organized the terrorist attacks on the US embassies in Kenya and Tanzania. After the 9/11 attacks in 2001, President George W. Bush said: “I want him dead or alive.” CIA agents spent over a dozen years collecting pieces of information and clues. Finally, they got on the trail of a courier who led them straight to Osama. Several years ago, Cynthia Storer and Nada Bakos, members of the special team tracking the connections of the Al-Qaeda leader, openly admitted that capturing bin Laden would not have been possible without Big Data analytics. Reportedly, the CIA was at the time using the Gotham tool, created by Palantir Technologies. Gotham made it possible to fish crucial information out of the sea of data flooding the global network.
Many details of this operation will probably never be revealed. However, we can safely assume that the Americans used intelligence of every possible kind. Data in the cyber human intelligence category, acquired by analyzing human activity in social media and on various forums, were most certainly of considerable value. From there it is only one step to open-source intelligence (OSINT), i.e. collecting and analyzing data from publicly available sources.
Signals intelligence (SIGINT) could not be omitted either, as it registers the activity of communication assets such as radar stations or phones; nor could geospatial intelligence (GEOINT), which deals with images and recordings from drones or spy satellites. Information from this category could have been of considerable significance in the final phase of the operation.
The residence where Osama bin Laden was hiding was not connected to the Internet, and its residents tried not to use their phones. A broad analysis of the available material, however, revealed a certain pattern: nobody ever took the trash out of the high-walled property. Osama’s people would always burn it in the backyard, so that not even the smallest item could be used to identify the main resident…
“In matters related to security, Big Data analytics is used at the tactical and operational level. Capturing Osama bin Laden is a perfect example here,” emphasizes Krzysztof Liedel, PhD, a Collegium Civitas expert on terrorism and data analysis. “But Big Data is also expected to help in forecasting the future, or even in making strategic plans.”
Zettabyte Era
As a result of technological advancement, information has become the most important, and at the same time the most common, good in the contemporary world. Modern means of communication produce enormous quantities of data. “When in 1997 I bought a modem to connect to the Internet, the entire resources of the network were no bigger than 30 GB. Twenty years later, that quantity of data was generated in a second,” says LtCol Rafał Kasprzyk, PhD in Eng., of the Faculty of Cybernetics at the Military University of Technology. Every day, on average, 4.3 billion posts and 350 million photos are published on Facebook, and about 656 million tweets on Twitter. It is estimated that today an average user generates 1.7 MB of data in every second of his activity.
Such enormous quantities of data have to be described with ever larger units. In September 2016, the American IT company Cisco Systems declared the beginning of the Zettabyte Era. To make it clear: a zettabyte is a unit that equals a sextillion (10^21) bytes, which is a trillion gigabytes, or a thousand exabytes. What is the scale of magnitude? The Cisco specialists explain it this way: one exabyte can hold high-definition video whose playback would take 36 thousand years. It would also be enough to stream the entire Netflix catalogue over three thousand times.
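For readers who want to check the arithmetic, here is a back-of-the-envelope calculation in Python (the 7 Mbit/s bitrate assumed below for HD video is an illustrative figure, not one quoted by Cisco):

# Back-of-the-envelope check of the storage-unit arithmetic (decimal SI units).
GB = 10**9          # gigabyte
EB = 10**18         # exabyte
ZB = 10**21         # zettabyte

print(ZB / GB)      # 1e12 -> a zettabyte is a trillion gigabytes
print(ZB / EB)      # 1e3  -> ... or a thousand exabytes

# How many years of HD video fit into one exabyte?
# Assumed bitrate of ~7 Mbit/s for HD video (an assumption for illustration).
hd_bitrate = 7e6                         # bits per second
seconds = (EB * 8) / hd_bitrate          # playback time in seconds
years = seconds / (365.25 * 24 * 3600)
print(round(years))                      # roughly 36,000 years, matching the claim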
Where does this massive amount of data come from? The main source is of course human online activity. The so-called Internet of Things is also flourishing; it includes almost any device with an Internet connection and the ability to transfer data, such as networked refrigerators, TVs, smartwatches or even smart clothing packed with electronics. Going further, we come across so-called smart buildings, with remotely controlled appliances, lighting, central heating or alarm systems. There are also login signals of GSM devices, geolocation data from various GPS receivers and transmitters, debit card transactions, as well as sensors used for weather forecasting.
Data Warehouse
Processing such enormous volumes of data requires state-of-the-art methods and tools. Above all, before it can be analyzed, all the collected data must first be prepared. This is the role of data warehouses. They have existed on the civilian market for years, serving, for example, business, industry, the insurance market or medicine. They allow for effective data exploration: from a veritable ocean of information they fish out dependencies, tendencies and regularities, or quite the contrary – irregularities.
Such analytics requires machines of enormous performance, capable of dealing with these masses of data within a reasonable time. As data volumes grow immensely, computing power must grow with them. Apart from that, so-called quantum computers are now being developed, whose performance is expected to be incomparably better than that of traditional computers, and whose main role is to be Big Data processing.
“A human being is not able to handle such masses of data on his own. He must get support from advanced analytical tools. It’s obvious. The question is, to what extent can we become dependent on algorithms? How much can we trust them?” wonders LtCol Kasprzyk. Present-day computer programs are increasingly capable of self-learning. To put it simply: when facing new challenges, they find solutions and then generalize them for future use. “The mechanisms governing this process are getting harder for us to interpret, or sometimes they are simply not interpretable,” admits LtCol Kasprzyk.
There are tools which have been created to make these masses of data accessible and usable. One of them is the heat map, a graphic representation of data in which the intensity of individual values is rendered as color, just like in thermal imaging. The point is to turn a specific situation into an image that can later be interpreted. Heat maps are used to visualize, for instance, business operations or public transport bottlenecks; there are even currency heat maps.
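As an illustration of the idea, a minimal heat map can be produced in a few lines of Python; the passenger counts below are invented for the example, not data from any system mentioned here:

import numpy as np
import matplotlib.pyplot as plt

# Made-up example: passengers boarding at 7 bus stops across 24 hours of a day.
rng = np.random.default_rng(seed=0)
boardings = rng.poisson(lam=30, size=(7, 24))

fig, ax = plt.subplots()
im = ax.imshow(boardings, cmap="hot", aspect="auto")  # intensity rendered as color
ax.set_xlabel("hour of day")
ax.set_ylabel("stop number")
fig.colorbar(im, ax=ax, label="passengers boarding")
plt.show()

The brightest cells immediately show where and when the network is under the heaviest load – which is exactly the kind of bottleneck picture described above.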
It seems obvious that this ocean of data is used in multiple ways, for instance to manage municipal space. Based on the millions of images which Internet users post on their Facebook accounts, it can be determined very precisely which areas, squares, streets or monuments are visited most frequently. This, in turn, may help to identify locations where, for example, public transport stops are most needed. It also becomes possible to control city traffic so that it flows smoothly and without congestion, or to manage energy so that street lights do not illuminate empty squares at night. The result is the so-called smart city, where public space is adjusted to the needs of its residents.
We Know Everything About You
It’s no surprise that Big Data also arouses vivid interest among the services responsible for data collection. “Do you remember the Minority Report movie?” asks LtCol Rafał Kasprzyk (WAT). In Steven Spielberg’s film, Tom Cruise stars as John Anderton, the head of the PreCrime police unit, which uses precognition to predict crime. Anderton always appears where a crime is just about to be committed, and he prevents it. “Today, this vision is becoming reality,” says LtCol Kasprzyk, although of course the scientists trying to predict events use not parapsychology but hard data. “The Police Department in Memphis, for instance, implemented a pilot program several years ago based on special IBM tools. Using data on crimes committed in the past, the IBM Predictive Analytics Software points to potentially threatened places, which the police then patrol. Crime rates in the city were reportedly reduced by about 30%,” explains LtCol Kasprzyk.
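A heavily simplified sketch of that hotspot idea – not the IBM software itself – might look like the following; it assumes a hypothetical incidents.csv file listing past incidents with latitude and longitude columns:

import pandas as pd

# Hypothetical file of past incidents with "lat" and "lon" columns.
incidents = pd.read_csv("incidents.csv")

# Bin coordinates into a coarse grid (roughly 1 km cells, a rough assumption).
cell = 0.01
incidents["cell"] = list(zip((incidents["lat"] // cell) * cell,
                             (incidents["lon"] // cell) * cell))

# Count past incidents per cell and flag the ten densest cells as hotspots.
hotspots = (incidents.groupby("cell").size()
            .sort_values(ascending=False)
            .head(10))
print(hotspots)   # candidate areas for extra patrols

Real predictive-policing systems add time of day, seasonality and many other variables, but the core step – turning historical records into a map of where to look next – is the same.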
Big Data analytics also represents the future of global security. In this area, however, there are many traps. “Intelligence services face a challenge that is referred to in short as the 4Vs: velocity, variety, volume and veracity,” says LtCol Kasprzyk.
The Americans used Big Data during military missions in Iraq and Afghanistan. Eli Berman, Joseph Felter and Jacob Shapiro, the authors of the book Small Wars, Big Data: The Information Revolution in Modern Conflict, write that US Army analysts recorded every event related to terrorist attacks on soldiers, and did so with extreme precision. Initially, the collected data were to be compared against economic indicators in order to draw conclusions about the detailed effects of military interventions. Soon, however, a special team was formed in the army which, based on the collected information, tried to predict the likelihood of attacks on patrols in particular locations.
Big Data analytics is also meant to prevent problems on a macro scale, thanks to tools such as the Global Database of Events, Language, and Tone (GDELT). This powerful system, created by a team of scientists led by Kalev Leetaru at Georgetown University, collects all kinds of news on conflicts, demonstrations, wars or uprisings, going back to 1979. To do that, it tracks thousands of news websites. The collected data are then processed with special algorithms and published as reports. As the creator of the tool explains, many wars ripen in a similar way, accompanied by very similar emotions and occurrences. Out of the thick of information, some signals can indeed be fished out that are worth a closer look. GDELT is not the only such system. For some time now, DARPA, the US government agency in charge of developing military technology, has been testing the Integrated Crisis Early Warning System (ICEWS).
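A minimal sketch of the kind of signal-spotting described above could work on a locally downloaded export of GDELT-style event records; the file name, column names and the spike rule below are illustrative assumptions, not GDELT’s own methodology:

import pandas as pd

# Assumed local export of GDELT-style event records with a date column
# ("SQLDATE", YYYYMMDD) and a CAMEO root code column ("EventRootCode").
events = pd.read_csv("events.csv", dtype={"SQLDATE": str, "EventRootCode": str})

# CAMEO root codes 18-20 roughly cover assaults, fights and mass violence.
violent = events[events["EventRootCode"].isin(["18", "19", "20"])]

# Daily counts of violent events and a 30-day rolling baseline.
daily = violent.groupby("SQLDATE").size().sort_index()
baseline = daily.rolling(window=30, min_periods=7).mean()

# Flag days where violent-event volume jumps well above the recent baseline.
spikes = daily[daily > 2 * baseline]
print(spikes)   # candidate dates for a closer analytical look

Real early-warning systems use far more refined baselines and models, but the principle – watching for deviations from an established pattern – is the same.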
The Big Data revolution has been appreciated by services in other countries, too. Aharon Ze’evi-Farkash, a former head of the Israeli Military Intelligence Directorate (Aman), said in May 2018 that each day the agency must deal with three billion bytes of information. Behind those numbers are data which are very helpful in neutralizing potential threats and increasing state security.
One of the issues crucial to the services is the identification of people, locations or objects. Specialized programs for recognizing license plates from various countries, for instance, are in common use. Photographs posted online are used more and more frequently to establish identity. With millions of images at their disposal, facial recognition systems make it possible to identify criminals, terrorists or other dangerous people, e.g. enemy agents. Such systems are to be (or reportedly already are) installed in autonomous drones which eliminate targets identified by artificial intelligence.
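To give a flavor of the basic building block, here is a minimal face-detection sketch using OpenCV’s bundled Haar-cascade model; it only locates faces in a single placeholder image ("photo.jpg" is an assumed file name), whereas full identification would additionally require matching each face against a reference database:

import cv2

# Load OpenCV's pre-trained frontal-face Haar cascade.
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
detector = cv2.CascadeClassifier(cascade_path)

# "photo.jpg" is a placeholder input image.
image = cv2.imread("photo.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Detect face bounding boxes; each box could then be compared
# against a reference database to establish identity.
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("faces_marked.jpg", image)
print(f"Detected {len(faces)} face(s)")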
Personal Profiles
Social media are enormously popular throughout the world, with users constantly staying in touch, chatting, recording videos or sharing images. The average Internet user quickly forgets about the likes he has clicked and the comments or posts he has published. These, however, never disappear – just like all photos and videos posted online, they are recorded in the system, catalogued and archived, forming a so-called digital footprint.
Based on data shared online, it has become possible to create a psychological portrait of each user. Michał Kosiński, a computer scientist and one of the authors of an algorithm which builds profile models of social media users, said in one of his interviews: “Based on 70 to 100 likes, my algorithm is capable of learning as much about the user as his family knows about him. With 250 likes, the system will know him better than his spouse.” Alexander Nix, a former CEO of Cambridge Analytica, claimed at one point that his company was able to precisely define the personality of every adult in the country.
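The published research behind such claims framed the task as supervised learning: predict a trait from a user-by-page matrix of likes. The sketch below uses made-up data and a plain logistic regression; it is an illustration of the general approach, not the original algorithm:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up data: rows are users, columns are pages; 1 = the user liked the page.
rng = np.random.default_rng(seed=0)
likes = rng.integers(0, 2, size=(1000, 300))

# Made-up binary trait label per user (e.g. scoring high on some personality
# dimension), loosely correlated with a subset of likes for the demo.
trait = (likes[:, :20].sum(axis=1) + rng.normal(0, 2, 1000) > 10).astype(int)

# Fit a simple linear classifier: the trait is predicted from the like vector.
model = LogisticRegression(max_iter=1000)
model.fit(likes[:800], trait[:800])

accuracy = model.score(likes[800:], trait[800:])
print(f"Held-out accuracy: {accuracy:.2f}")

With real like data and real personality questionnaires as labels, the same pipeline is what allows a model to “know” a user from a few hundred clicks.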
This can lead to abuses and manipulation. In January 2012, Facebook ran an experiment on almost 700 thousand unaware users, a fact which was later recognized as unethical. In order to study its users’ responses, Facebook showed them news feeds that had been modified in a certain way: via algorithms, one group of users received content with a positive emotional tone, the other group a negative one. The experiment revealed a dependency: by feeding users news of a certain emotional tone, it was possible to influence the tone of the posts and comments they wrote afterwards. External commentators were of the opinion at the time that Facebook had committed forbidden emotional manipulation.
Cambridge Analytica, in turn, was accused of manipulation on a wide scale: the company had been engaged in delivering propagandist and manipulated content to online users, allegedly commissioned by Russian authorities. It is believed that this affected, among other things, the presidential election in the United States and influenced the final outcome of the Brexit referendum in Great Britain.
Skeptics point out that with the rise of the Internet, intelligence services received an almost perfect tool for surveillance and manipulation. Collecting and analyzing the greatest possible quantities of data arouses opposition in democratic societies. In 2013, the New York Times published a widely echoed report that the US National Security Agency (NSA) had been tracking citizens through social media. Edward Snowden revealed documents showing the scale of online surveillance carried out through numerous programs supervised by the NSA. One of them is Prism, which gives American intelligence services access to content stored in the server rooms of the largest American communication companies, such as Microsoft, Yahoo, Google, Facebook, YouTube, Skype, AOL and Apple. In addition, there are other programs, such as Upstream, which intercept electronic communication. “We do what we have to in order to protect America from terrorist attacks,” the agency’s representatives said at the time.
The dynamics of world conflicts are changing before our very eyes. “It seems that the epoch of classic wars, with two armies clashing on a battlefield, is slowly fading into the past. Today, hybrid activity is much more advantageous, because from the point of view of the aggressor the risk of potential losses is much lower,” he adds. Here a question arises: are analytical systems capable of fishing out information that would warn us that we are about to become victims of a brand-new type of attack?
“The significance of Big Data for security will keep growing. This is related to the advancement of technology and the changing face of the world, including the dynamics of present-day conflicts,” claims Krzysztof Liedel, PhD. “The skillful analysis of so-called open sources will be particularly crucial. The rule of the inverted triangle applies here: 70%, or even 80%, of useful information is data that is publicly shared and easily accessible.”
Photo credits: Graphics Department MOPIC / FOTOLIA