Free Future

Will We Mold Ourselves To Match Our Data?

By Jay Stanley, Senior Policy Analyst, ACLU Speech, Privacy & Technology Project at 3:42pm

I recently came across a very nice essay on “The Stupidity of Computers” by David Auerbach, which is really much more interesting than that truism of a headline might suggest.

Auerbach starts with the observation that computers “are the undisputed chess champions of the world, but they can’t understand a simple English conversation.” The point is a commonplace, almost clichéd one, but Auerbach builds on it step by step, arriving at a stunning punch line of a thesis that is thought-provoking and fresh.

Auerbach begins with an entertaining review of the limits of “semantic search.” In semantic search, computers attempt to parse the meaning of a query as a human would do: trying to break down the terms being used and the relationship between them, and proceeding on the basis of that “understanding.” He then looks at the drift toward nonsemantic number-crunching approaches, in which computers’ ability to process raw data is used to blindly correlate terms and results. This shift is encapsulated in the difference between early search sites (such as Yahoo’s handmade categorization of online content and the Ask Jeeves search, in which computers were programmed to attempt to understand natural language) and Google’s far more effective PageRank algorithm, which just blindly crunches numbers (the structure of links on the Internet) and yet creates the illusion of much greater knowingness.
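To make “blindly crunching the structure of links” concrete, here is a minimal sketch of a link-analysis ranking computed by power iteration, in the style of PageRank. It is an illustration under stated assumptions, not Google's actual implementation: the function name `rank_pages`, the damping value of 0.85, the iteration count, and the four-page toy web are all hypothetical choices made for this example. Note that the code never looks at what any page says.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Score pages using only who links to whom (simplified PageRank-style sketch).

    `links` maps each page to the list of pages it links to.
    The damping value and iteration count are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page keeps a small baseline score regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score everywhere.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: three pages link to "c", and "c" links back to "a".
toy_links = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
scores = rank_pages(toy_links)
best = max(scores, key=scores.get)
```

In this toy web, the page that the most other pages point to ends up with the top score, even though the program has no notion of what that page is about. That is the “illusion of knowingness” in miniature.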

Ultimately, Auerbach points out, computers cannot understand natural language because it is shot through with ambiguities, and to sift through those ambiguities, and also to grasp the basic underlying meaning of a statement at all, requires nothing short of an understanding of the world and how it works.

Auerbach then discusses “ontologies,” a term that in computer science refers to structured systems of information, or as Auerbach puts it, “a model of reality that is amenable to logical representation.” Ontologies are made up of elements such as categorizations, tags, hierarchies, taxonomies, and defined relationships. Unlike natural languages, ontologies are easy for computers to work with.
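As a rough illustration of why such structures are easy for computers to work with, the sketch below models a tiny ontology as a category hierarchy plus tags on people. Everything in it is hypothetical and chosen for this example (the `Ontology` class, its method names, the categories, and the person “alice”); real ontology systems are far richer. The point is only that once a person is reduced to tags, questions about them become simple lookups.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """A toy ontology: a hierarchy of categories plus tags on entities."""

    parents: dict = field(default_factory=dict)  # category -> broader category (or None)
    tags: dict = field(default_factory=dict)     # entity -> set of categories

    def add_category(self, name, parent=None):
        self.parents[name] = parent

    def tag(self, entity, category):
        # Only categories the ontology already defines can be applied;
        # anything that does not fit a predefined bucket is rejected.
        if category not in self.parents:
            raise KeyError(f"unknown category: {category}")
        self.tags.setdefault(entity, set()).add(category)

    def ancestors(self, category):
        """Return the chain of broader categories above `category`."""
        chain = []
        current = self.parents.get(category)
        while current is not None:
            chain.append(current)
            current = self.parents.get(current)
        return chain

    def is_a(self, entity, category):
        """True if the entity is tagged with `category` or any narrower category."""
        for tagged in self.tags.get(entity, ()):
            if tagged == category or category in self.ancestors(tagged):
                return True
        return False


# Hypothetical hierarchy: interest > music > jazz, and interest > sports.
onto = Ontology()
onto.add_category("interest")
onto.add_category("music", parent="interest")
onto.add_category("jazz", parent="music")
onto.add_category("sports", parent="interest")

onto.tag("alice", "jazz")
likes_music = onto.is_a("alice", "music")    # answered by walking the hierarchy
likes_sports = onto.is_a("alice", "sports")
```

The `tag` method refuses any category the ontology has not defined in advance. That design choice mirrors the essay's concern: whatever does not fit an existing bucket simply cannot be recorded.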

We see such information structures at work in many aspects of our online lives, Auerbach observes. One of the key things we do on Facebook is classify and tag ourselves: our interests, location, employer, school, relationship status, etc. Similar tagging is required to participate in online dating sites, and takes place on Amazon and many other sites. Even Twitter’s generally unstructured conversations have evolved the hashtag.

This brings Auerbach to his main point: “We will bring ourselves to computers.”

The small- and large-scale convenience and efficiency of storing more and more parts of our lives online will increase the hold that formal ontologies have on us. They will be constructed by governments, by corporations, and by us in unequal measure, and there will be both implicit and explicit battles over how these ontologies are managed. The fight over how test scores should be used to measure student and teacher performance is nothing compared to what we will see once every aspect of our lives from health to artistic effort to personal relationships is formalized and quantified.

We will increasingly see ourselves in terms of these ontologies and willingly try to conform to them. This will bring about a flattening of the self—a reversal of the expansion of the self that occurred over the last several hundred years. While in the 20th century people came to see themselves as empty existential vessels, without a commitment to any particular internal essence, they will now see themselves as contingently but definitively embodying types derived from the overriding ontologies. This is as close to a solution to the modernist problem of the self as we will get.

The fact that “we will end up accommodating the formalist methodologies of computer algorithms,” Auerbach suggests, will have a conservative effect:

The problem is one of ambiguity as much as nonneutrality. A reductive ontology of the world emerges, containing aspects both obvious and dubious. Search engines crawl Wikipedia and Amazon, Facebook tries to create their own set of inferred metadata, the categories propagate, and so more of the world is shoehorned into an ontology reflecting ad hoc biases and received ideas.

So from the cliché that computers are so smart and yet so dumb, Auerbach reaches the striking climax of his argument:

Because computers cannot come to us and meet us in our world, we must continue to adjust our world and bring ourselves to them. We will define and regiment our lives, including our social lives and our perceptions of our selves, in ways that are conducive to what a computer can “understand.” Their dumbness will become ours.

It’s a fascinating, insightful, and persuasive analysis. There are just two qualifications that I might make to the argument.

First, I would note that while computers are a relatively new technology, bureaucracies are not. Bureaucracies also require that people be placed in defined buckets, and they otherwise try to structure often fluid realities; they have been shaping human life for centuries. Historians, anthropologists, and sociologists, for example, have analyzed at length how the emergence of clocks and factories in the European industrial revolution imposed a new structure and “time discipline” on daily life that is absent to this day in largely rural, undeveloped cultures, where a meeting may be arranged for “when the sun is somewhere over there in the sky” rather than for, say, 3:15 P.M. That said, it still seems plausible that, through computers, that kind of rationalization (and control) may be permeating our social lives and our definition of self more than ever before.

Second, I also suspect, as I have argued previously, that computers will actually “come to us and meet us in our world” more than Auerbach allows—that they will quickly become more quirky and unpredictable even as they become smarter, more flexible, and more accurate in their predictions and classifications. Perhaps in Auerbach’s terms that boils down to a prediction that nonsemantic data mining approaches to knowledge will continue to make inroads and take over more and more of the functions that so far are relegated to the structured ontologies he discusses.

How would that affect us? While Auerbach’s Ascent of the Ontologies will flatten and coarsen our interactions and our definitions of self, the vision I have warned about is that the rise of unpredictable nonsemantic data mining will turn us into “quivering, neurotic beings living in a psychologically oppressive world in which we’re constantly aware that our every smallest move is being charted, measured, and evaluated against the like actions of millions of other people—and then used to judge us in unpredictable ways.” Auerbach’s vision is one of control permeating our beings, including the illusion that we ourselves possess control (as when we cheerfully categorize ourselves for the world on Facebook), while I warn about out-of-control algorithms, albeit ones that ultimately serve the purposes of large organizations and their impulse toward control.

In the end, the social effects of ontologies and of data mining are not incompatible and are perhaps even mutually reinforcing. Perhaps the only difference is that Auerbach focuses more on how we will place ourselves into ontological buckets, with all the implications that will have for the modern soul, while I am more focused on how computerized data mining algorithms will do that for us, with all the civil liberties implications that will bring.
