
The Need for a Philosophy of Data

Yuri Zalevski

We live in a world where our explicit and theoretical knowledge is valued and compensated. The data scientists who develop more complex algorithms are rewarded handsomely for their work, academics are respected for their research, and entrepreneurs enjoy the fruits of their enterprises. But the vast majority of knowledge—all this tacit knowledge that people have—since it is dispersed and difficult to transmit, does not garner the same respect and reward. There are unrecognized masses who have helped to build the more productive technologies that have contributed to superstar firms far exceeding the returns of most conventional companies.

The existing model of data ownership therefore seems unsustainable. We see this in the backlash against tech giants in the wake of the Cambridge Analytica scandal, which has shaken the public’s faith not only in Facebook, but in the various companies to whom we hand over data. These companies were entrusted with the impossible task of being stewards of all the information they were provided, while users had no incentive to understand how their online lives functioned. Mark Zuckerberg and others in Silicon Valley are learning that it is far harder to regain someone’s trust than to lose it.

There already exist significant movements in Europe and in the United States that want to regulate, break apart, or even nationalize social media and search platforms. None of these policy-oriented solutions reflects a full understanding of what caused the problems in the first place. All of them, however, would be massive blows to internet freedom and innovation. As our online lives become more and more central to our professional and personal lives, a sound philosophy of data needs to precede any political or legal action.

One proposed philosophical shift that may help remedy the existing informational asymmetries and incentive problems is to think of data not as capital, but as labor. This would in effect liberate individuals from losing the data product of their labor, the way that ending feudalism liberated individuals from losing the agricultural product of their labor. This radical proposal, made in a forthcoming American Economic Association paper by Imanol Arrieta Ibarra, Leonard Goff, Diego Jiménez Hernández, Jaron Lanier, and E. Glen Weyl, provides a deeper understanding of where digital anxieties come from. To borrow from Marx, people have become alienated from what they produce, since the terminology surrounding data fails to acknowledge it as a process of production.

Fears over automation, data privacy, and the rise of a new class of technological elite are inextricably linked to this alienation. The online world is assumed to be owned by venture capitalists and entrepreneurs because the average individual sees himself as being sold a product, rather than being a digital citizen.

A change in thinking so that data is seen as labor would change the incentives of individuals and firms for the better. First, it would make the data produced online the property of the user who generated it, rather than of a firm. Instead of being seen as something exchanged for a service, data becomes something produced in an “online workplace”. This reorientation blurs the line between our online and offline lives. Currently, we think in a way that strictly separates our time spent online from the offline world, where we are meant to find dignity and leisure. Softening this distinction makes it clear that one’s productivity, dignity, and leisure do not need to be separated depending on whether bits are involved.

This cultural reframing differs significantly from existing European Union policies such as the Right to Be Forgotten, which is more a cosmetic change to the end-user experience, and the data portability rights coming into full effect in the EU this year, which grant you the right to move your data from a company’s servers but do not let you truly own it.

Unlike the more limited changes currently being proposed, which could raise compliance costs while being ineffective at best, a framework of data as labor would create opportunities for new institutions to arise. Whereas currently the data of individuals is acknowledged as being worth very little, a data-as-labor world would allow for the creation of “data labor unions”, which would be able to bargain to free up some of the consumer surplus that companies have been able to capture.

Giving individuals more direct control of their online product would enable more effective remedies to collective action problems. There is a belief that network effects, the idea that a service becomes more valuable the more others use it, lead to lock-in, preventing people from abandoning a service. While the empirical data does not support this belief, it is possible that lock-in prevents change unless something egregious occurs. By making data a product of labor, rather than a good to be exchanged, users of social media sites could form collectives that act as online gatekeepers. They could then effectively organize strike actions, which are much more difficult to carry out when data is seen as capital.

It is in the debate over automation that this change in thinking makes the largest difference. Anxiety that artificial intelligence will take away jobs, and that all economic resources will be controlled by a small few, is widespread. While the potential for technological unemployment is overblown, it is quite likely that automation will exacerbate inequality in a world where data is treated as capital. Much of the current discussion on solutions to this potential inequality focuses on redistribution or the creation of a Universal Basic Income. These ideas are based on the notion that the abundance that will be created by artificial intelligence can easily be given to those who have been left behind, and everyone will end up better off.

There is a disturbing paternalism to these proposals, as if a few geniuses built up the technologies that have created abundance, and nobody else. Those who are unproductive and contributed nothing to the world brought about by technological advance should then be subsidized to do nothing, while technologists continue making the world a better place. This may remedy one form of inequality, but reinforces another, and fundamentally deprives people of their dignity. Changing this attitude is in my view the strongest case for data ownership. Not the economic case, but the moral claim that a few should not take all the credit for what the many created.

A world with data as labor leads not to technological unemployment, but to a world of data labor. As there are fewer things that need to be done to produce abundance in the physical world, there will be more things needed to create fulfilment for all in the digital one. There seems to be no reason to deny that this is work, that it is valuable work, or that it is work worthy of compensation. Once this framework is adopted, the very notion of an automated dystopia becomes absurd. Something must be wrong with our view of the world if a technology that boosts productivity is viewed as leaving the majority worse off.

Posting something on Facebook and making your friends happy is no different from picking someone up in your car through Uber. You are contributing to the well-being of others and generating the data that further optimizes those pleasing experiences. These are not so much acts of exchange with a company as acts of exchange through a company. Changing the way we view data may then help us make more sense of the “platform economy” that countless economists and business leaders have been scrambling to explain.

Jaron Lanier, a pioneer of virtual reality and one of the authors of the proposal, noted that the current ownership structures of digital content result in a Zipf distribution, in which rewards are not as dispersed as the risk. This is unfair, but need not be so. The Big Data phenomenon is new, and we have yet to fully build a philosophy of data. Seeing data as labor is one of the most thought-provoking attempts to develop this philosophy, though it is still in its initial steps. Changing the discourse around data may help us make more sense of its political implications, but that change in discourse needs to happen first.