Big Tech and the Colonisation of Data

When you start a business, you start by solving a problem. If your product or service can meet your customer’s most pressing needs in a unique and effective way, then the business that you build around this premise is all the more likely to succeed. And this idea, to start a business by solving a problem, was just as pertinent more than 150 years ago, when in 1854 a French entrepreneur dreamt up a wildly ambitious business idea that he believed could change the world. The businessman was Ferdinand de Lesseps, and the problem he had identified was that for European ships to reach the Far East to trade with strategically important and resource-rich nations such as India, they had to make the long and treacherous journey around the southernmost tip of Africa and back up again past the Horn of Africa. The solution he proposed was to cut a path of nearly 200 kilometres through the Egyptian desert to create a waterway between the Mediterranean and the Red Sea. This waterway was to be the Suez Canal.

Having convinced the ruler of Egypt at the time, Sa`id Pasha, to grant a concession for the Canal to be built, Ferdinand de Lesseps successfully negotiated the privatisation of a critical passageway from Europe to Asia that would change the shape of the global economy forever. On 17 November 1869, the Suez Canal opened under the control of the Compagnie Universelle du Canal Maritime de Suez – the Suez Canal Company. In conjunction with the completion of the American Transcontinental Railroad just six months earlier, the opening of the Canal had an immediate and dramatic effect on world trade, facilitating the movement of goods around the globe in timeframes never before realised. What this meant was that whoever controlled the Suez Canal effectively became the gatekeeper of intercontinental trade, with that single waterway yielding untold power to its possessor.

Much in the same way, the flow of data in today’s digital age has become the critical waterway that facilitates the mode of modern business. And the contemporary gatekeepers of this new data canal are “The Four”, as Scott Galloway calls them in his book The Four: The Hidden DNA of Amazon, Apple, Facebook, and Google (2017). The global dominance of these tech giants has been growing for years, but only recently have the public and lawmakers started taking notice of the near-unchallengeable nature of these companies and the centralised control they have secured over the world’s digital data. The above-the-law and unapologetic attitude of Big Tech was perhaps best captured in the rather soulless and disturbingly robotic display of Facebook’s CEO, Mark Zuckerberg, when appearing before Congress in April of 2018 to answer for his company’s massive data leak of millions of users’ information to the nefarious and politically manipulative data analytics firm Cambridge Analytica.  

To grasp the extent to which these giant tech firms have dominated their industries and centralised power, Galloway notes that together Facebook, Google, Amazon and Apple have a collective market capitalisation equivalent to India’s GDP, yet a combined workforce only about the size of the population of the Lower East Side of Manhattan. Even more worryingly, many experts believe that the burgeoning field of artificial intelligence will only serve to further concentrate power in the hands of these four tech giants. And the enormous significance of AI for every conceivable industry is something that is certainly not lost on the leaders of “The Four”, with Google’s CEO Sundar Pichai believing that “AI is one of the most important things humanity is working on. It is more profound than, I dunno, electricity or fire.”

Recognising that AI will result in sweeping changes to the way the world works in the next few decades was the first step that Big Tech needed to take to consolidate its dominance in the digital world. From that point on, it was inevitable that the best minds in AI would flock to these hubs of power, thanks to several factors that are essential to the development of the most advanced artificial intelligence technologies. The first of these factors is the scarcity of true AI experts across the world, a fact that has made these experts extremely valuable. Establishing a team of these specialist computer scientists is consequently near-impossible in financial terms for any corporation aside from the very biggest of the tech firms.

The other critically important factors that feed this concentration of power are the necessity for extremely powerful computing resources and the very large data sets required by the most promising artificial intelligence methods. One such method that requires an inordinate amount of resources in terms of data and processing power, for example, is the use of unsupervised learning in neural networks. As opposed to most supervised methods, which can use smaller sets of training data that prescribe the response variable as the correct outcome for the network’s output, an unsupervised neural network does not use a predefined training set as a guide, but rather relies on large volumes of data to refine its own outcomes through the continuous refinement of predictions.  
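The distinction between the two approaches can be made concrete with a toy sketch. The example below is an illustrative assumption rather than anything Big Tech actually runs: it swaps the neural network for two far simpler stand-ins (a nearest-centroid classifier and a two-cluster k-means), and the data points and the names `fit_supervised`, `fit_unsupervised` and `predict` are invented for the illustration. The supervised learner is handed the correct label for every point, while the unsupervised one sees only the raw values and has to refine its own grouping.

```python
# Toy data: one feature per point, two obvious groups (values are made up).
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
# Only the supervised learner is allowed to see these labels.
labels = ["low", "low", "low", "high", "high", "high"]


def fit_supervised(xs, ys):
    """Learn one centroid per label; the labels prescribe the correct outcome."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {y: sum(values) / len(values) for y, values in groups.items()}


def predict(centroids, x):
    """Return the label whose centroid lies closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def fit_unsupervised(xs, iterations=10):
    """Two-cluster k-means: no labels, centroids are refined from the data alone."""
    centroids = [min(xs), max(xs)]  # crude starting guess (an assumption)
    for _ in range(iterations):
        clusters = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centroids[0]) <= abs(x - centroids[1]) else 1
            clusters[nearest].append(x)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids


supervised_model = fit_supervised(points, labels)
unsupervised_centroids = fit_unsupervised(points)

print(predict(supervised_model, 1.1))
print(unsupervised_centroids)
```

With only six points both approaches find the same two groups, but the unsupervised version has nothing except the data to guide it, which is why, at realistic scale, the volume and variety of data matter so much.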

In fact, it is this flood of new, rich data that has been created in recent years – and not just the improvements in processing power – that has allowed AI deep learning processes to experience a new dawn. Without the pervasive yet extremely effective information-gathering techniques used by Big Tech, the unsupervised deep learning systems that require this extreme depth and breadth of data to prune and perfect their neural networks – much as the human brain strengthens useful synaptic connections while pruning unnecessary ones – would not be able to function optimally. And whilst this growing body of personal and public information has effectively facilitated very promising advancements in AI, the problem is that in capturing and storing this immense data, a few giant corporations effectively control almost all of the data enabling advancements in deep learning.  

Each of “The Four” captures this critical resource of raw data that feeds its continued growth in quite a different way. Galloway breaks down these data gathering methods using a metaphor relating to cravings of the human body, namely Google as appealing to the mind, Facebook appealing to the heart, Amazon appealing to the stomach, and Apple to the genitals – or as he puts it, “Google is God, Facebook is love, Amazon is consumption, and Apple is sex.” And when appealing to the most basic human desires, Galloway notes that these companies do not need to drag the data out of their customers, but rather it is freely handed over in almost every action made by their adoring and overly-trusting users.

Let us contemplate Google for a moment. In every passing second, Google processes an average of 40 000 searches, totalling 3.5 billion searches per day, capturing a share of over 80% of searches worldwide. To envisage how much data this company has collected since its inception is disturbing. In fact, from the outside, market participants and lawmakers are unable to know for certain exactly how much data Google manages to gather, given the company’s very secretive nature towards disclosing the amounts of data it actually stores in its data farms. One approximation calculated by Randall Munroe, using conservative estimates garnered from publicly disclosed information such as electricity usage, square footage of data centres, and various expense reports, totalled Google’s stored data at around 15 exabytes – or 15 billion gigabytes – an estimate that is many times larger than any other company’s data footprint, even larger than that of governmental organisations such as the NSA.
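The headline figures above can be sanity-checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes the 40 000 searches per second is a steady average over a full day, and it uses decimal units (one gigabyte as 10^9 bytes, one exabyte as 10^18 bytes); the variable names are illustrative.

```python
# Searches per day implied by an average of 40 000 searches per second.
SECONDS_PER_DAY = 24 * 60 * 60
searches_per_second = 40_000
searches_per_day = searches_per_second * SECONDS_PER_DAY
print(f"{searches_per_day:,} searches per day")  # 3,456,000,000, i.e. roughly 3.5 billion

# Converting 15 exabytes to gigabytes, using decimal (SI) units.
BYTES_PER_GB = 10**9
BYTES_PER_EB = 10**18
gigabytes = 15 * BYTES_PER_EB // BYTES_PER_GB
print(f"{gigabytes:,} GB")  # 15,000,000,000, i.e. 15 billion gigabytes
```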

It is not only the sheer incomprehensible scale of this mountain of data that makes it so incredible, however, but it is also the level of detail and sensitivity of the information that makes this data so unique. Of those 40 000 searches per second, for example, imagine how many are of an extremely personal nature, such as “How do I get over a breakup?” or “What is the best cream for a foot fungus?” Like a digital deity, Google hears every prayer and request, whether it be good, bad or ugly, and catalogues each according to place, time, frequency and hundreds of other possible categories that will increase the value of each and every bit of information gathered.  

For Facebook, despite its recent security faux pas, our need to feel and show emotions such as love, anger and sadness on a public forum sees the platform attracting 2.1 billion monthly active users, all sharing their most private and intimate feelings across the network. Simultaneously, Facebook’s baby sister Instagram gathers data by the fistful in the form of ever more analysable images of ourselves and our friends, which aid the company’s quest for the best facial recognition software. Amazon, in the meantime, as the stomach of the tech body, knows consumers’ every insatiable desire, including when, where, and how often they need their fix. And Apple, the first ever company to surpass the trillion-dollar market capitalisation threshold, holds its data so dear that even when pressed by the FBI for help in a terrorist investigation, the tech giant still refused to unlock the iPhone in question.

As with the early success of the Suez Canal Company in the late 1800s, Big Tech has in the last decade or so exploited the underdeveloped, fragmented and often wholly inadequate data and digital privacy legislation that has existed until recently. In fact, the Suez Canal Company benefitted tremendously from the loose, colonial-style agreements which allowed it to monopolise the critical waterway for many years. Whilst the public and the State have recently awoken to the fact that Big Tech corporations are hoarding massive amounts of user data on their servers, there is no way to wind back the years of plunder that have already occurred. The act-now-and-ask-permission-later mantra of these giant firms has effectively put them in a position where any competitors attempting to enter the market will start with a handicap the size of a decade’s worth of data.

Whilst some may see these companies as brilliant pioneers in their industries, one could also make the argument that they are robber barons disguised in the jeans-and-t-shirt uniform synonymous with Silicon Valley. Taking advantage of the lax operating environment they have found themselves in, these companies can be seen as modern-day colonists of the digital wilderness, planting a flag in every territory where they find valuable resources and drawing up imaginary borders to divide and conquer the somewhat bewildered populations that reside in the newly colonised lands. But as history suggests, no such monopolistic dominance of economically critical resources exists for long without a violent rebuttal.  

In the case of the Suez Canal, it is estimated that at any given time during the 10-year construction period there were 30 000 labourers working on the Canal, with more than 1.5 million people from across the world working on the project in total. Most of these workers, however, were corvée – unfree, unpaid Egyptian labourers who would never see a penny of the wealth that their work would create. The history of the canal’s contentious construction was further aggravated in 1956, when conflict erupted after Egyptian President Gamal Abdel Nasser nationalised the Suez Canal, in what was later seen to be a brash and irrational decision by the hot-blooded ruler. In response to the nationalisation and the subsequent blockade against their ships using the canal, Israeli forces invaded the Sinai Peninsula. This action then prompted Anglo-French troops to step in, believing a full-scale war between Egypt and Israel must be avoided at all costs. The resulting conflict, together with Nasser’s deliberate sinking of ships in the Canal to block it, became known as the Suez Canal Crisis and kept the waterway closed until April of 1957. With the crisis promising to escalate even further and world trade already thoroughly disrupted, the decision was made to set up the first ever international peacekeeping force – UNEF – under control of the United Nations, to ensure stability and safety for all using this globally strategic waterway.

The comparisons that can be drawn between today’s data domination by Big Tech and the turbulent history of the Suez Canal should serve as an important historical lesson, albeit one that played out over 150 years ago. The monopolistic control of unfathomable amounts of public and private data of everyday citizens is problematic in many ways, not least of which is the power that this information holds in a world that will soon be radically changed by the rapid advancement of artificial intelligence in every sphere of our lives. The fact is that these companies will not willingly release their mighty hold on this data, which means that at some point their iron grip may have to be forcefully broken. Whether this will happen peacefully or violently is hard to tell, but if history is to be the prophet of the future – as it so often is – then Big Tech may soon experience its own form of Suez Canal crisis.

This article is from the Monocle Quarterly Journal, Deep Learning.