Computational Complexity in the Chinese Room
Can computers think?
Many technologists and computationally-minded philosophers believe that it is possible, perhaps not today but eventually. Some, however, think it is not, and John Searle is one of these. Searle has inveighed against the notion of a thinking digital computer over many years, and his “Chinese Room” argument is a classic thought experiment designed to illustrate the fallacy of this idea.
Searle asks us to imagine ourselves seated in a sealed room, with “input” coming in from one source outside the room—perhaps through a slot, teletype, or display screen—and the ability to send output through another. The input is nothing more than incomprehensible squiggles; in fact the squiggles are meant to be Chinese logograms, but since we don’t know the first thing about that language they are just meaningless gibberish to us. In any case, we receive the input squiggles, and open a massive, heavily-indexed reference tome, wherein we scrupulously track down each squiggle and find a matching squiggle of another sort. These we copy onto the output feed and send on their way, having performed some sort of translation that is altogether opaque to our understanding.
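To fix ideas, here is a minimal sketch, in Python, of the kind of purely formal lookup the rulebook performs. The symbols are invented placeholders rather than actual Chinese; the point is only that the procedure consults the shape of the input, never its meaning.

    # A toy version of Searle's rulebook: a pure symbol-to-symbol lookup.
    # The symbols are arbitrary placeholders, not real Chinese text; nothing
    # in this code touches meaning, only shape.

    RULEBOOK = {
        "squiggle-1 squoggle-4": "squiggle-9 squiggle-2",
        "squoggle-7": "squiggle-3 squiggle-3 squiggle-8",
    }

    def operate_room(input_symbols):
        """Match the incoming marks against the book and emit the marks paired with them."""
        return RULEBOOK.get(input_symbols, "")  # the operator never consults meaning

    print(operate_room("squoggle-7"))

Nothing in this table is about anything; that is precisely the intuition Searle wants us to carry forward.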
On the outside of the room, however, what has happened is this: the room is being treated as the subject of a Turing test. Interested parties are typing in questions—in Chinese, as it happens—and receiving answers—in Chinese. If our system is of good quality, over the course of this conversation they may be convinced that the room, or something inside it, is intelligent, and thus passes the test. But this is an error: we do not have conscious states that exhibit any sort of understanding of the questions we are receiving. It’s all just squiggles to us. It seems, therefore, that the Turing test is not a reliable way of ascertaining true thought, and moreover that any machine exhibiting such a “formal” architecture, no matter how complex, could never be called intelligent in the way that we mean. Certainly it might simulate intelligence impressively, but this is precisely the problem, since it means only that we have an automaton that is extremely good at fooling our test.
The critical point here is Searle’s contention that no matter what may happen, we still will not understand any of the Chinese. He takes this to mean, broadly, that formal architectures, such as our weighty look-up table, can never produce understanding, because real thought requires semantics—meaning—whereas the book gives us only syntax, or relation. Thus, if it can be shown that such a device could exhibit semantic understanding (something along the lines of knowing the “meaning” of each symbol or group of symbols), the argument would be countered handily. Searle thinks this is impossible, but I disagree. In fact, I think his own example must exhibit this very trait in order to perform as advertised, and is thus self-defeating; I will try to illustrate this in what follows.
First, it is important to consider exactly what sort of book this table of ours must be. To start with, it must be capable of flawless translation. This can be understood in the following manner: the book’s functionality could be duplicated, in a form Searle would find far more congenial, by using three steps instead of two, translating first from Chinese into English, soliciting our response, and then translating our answer from English back into Chinese. Already, this is an incredible feat! The best automatic translations available today are ham-handed at best, lucky to convert even simple sentences accurately and quite unable to convey the subtleties of meaning that we would need in our Chinese Room. But in Searle’s version the book must not only produce nigh-magical translations; it must do more, because here we have been eliminated from the loop, and the book has taken over our role. So the book is not only the best sort of translator but also, essentially, a “thinking device” that can intelligently hold a conversation with the same knowledge and fluidity that a human could. How remarkable!
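To keep the two arrangements straight, here is a schematic sketch in Python. Every function in it is a hypothetical stub standing in for a capability nobody knows how to build; only the structure matters.

    # A schematic of the two arrangements described above. All of these
    # functions are hypothetical stubs; none of them is implementable today.

    def translate_zh_to_en(text):
        """Flawless Chinese-to-English translation (assumed, not implemented)."""
        raise NotImplementedError

    def respond_in_english(question):
        """The occupant's genuine, understanding reply, given in English."""
        raise NotImplementedError

    def translate_en_to_zh(text):
        """Flawless English-to-Chinese translation (assumed, not implemented)."""
        raise NotImplementedError

    def three_step_room(question_zh):
        # The congenial version: the book translates, we answer, it translates back.
        return translate_en_to_zh(respond_in_english(translate_zh_to_en(question_zh)))

    def searles_book(question_zh):
        # Searle's actual setup: the middle, understanding step has been absorbed
        # into the book itself, which must now converse as well as translate.
        raise NotImplementedError

Whatever understanding the middle step supplies in the first arrangement must somehow reside in the book in the second.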
So our goal now is to show how such a system could exhibit semantic understanding rather than mere syntactic translation. To do so, let us consider the program in action. Suppose that the testing committee outside our room has fed in the following question (in Chinese, naturally): “What is your favorite flavor of ice cream and why?” The answer could not be some Chinese version of “vanilla, just because,” since this would not convince any skeptical interlocutor, at least not under the follow-up interrogation that would surely come. Rather, we would want to say something like “I love vanilla. Maybe it’s because there’s just something enjoyable about such a clean, simple flavor. But really, I’ll bet you it’s because when I was a boy, my father used to take me out every Sunday to the local ice-cream shop, and he’d buy me a vanilla cone, and we’d walk around downtown and talk. Those are some of my best memories.”
My goodness. In order to produce such an answer, we will not only need to know about many flavors of ice cream. We will also need to know what they taste like, and to have experienced “clean, simple” things in order to draw a metaphorical sort of comparison between them, and know that clean and simple things have a certain kind of appeal to most people, although the appeal is somewhat ineffable. We will need to know that as a purported adult male, we were once a young boy, who probably had a father, and that taking his son out for ice cream on a weekend is the sort of thing a father might do. We will need to know that ice cream is sold in shops and served in cones, and that ice cream shops are frequently open on Sundays, and that “downtown” is a place where families might walk. In fact, we will need to have memories, at least pretend ones, and the power of value judgment to establish that some of them are “good” memories, and enough in common with human beings that the sort of memory described might be judged one of the good ones.
It might be suggested that all of the above could be “faked,” such that rather than building in all of this complicated background the programmer might simply have anticipated the question and stored the answer as a standalone, meaningless string of text. But this is not the case, not in any plausible sense. While we might pull off that trick a few times, an entire conversation of the kind an aggressive committee of skeptics would conduct involves a vast number of questions like this, many of them interrelated and spinning off into their own lines of discussion, and every one of our answers must be mutually consistent or we will be discovered. Predicting every possible question would, as the rough arithmetic below suggests, be altogether impossible.
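A back-of-envelope calculation makes the point. The figures below are illustrative assumptions, nothing more, but they show how quickly a pre-scripted table blows up.

    # Rough arithmetic on pre-scripting a whole conversation. Both figures
    # are illustrative assumptions, not measurements of anything.

    questions_per_turn = 1_000   # distinct plausible questions at any given point
    turns = 20                   # length of a fairly short Turing-test conversation

    conversation_paths = questions_per_turn ** turns
    print(f"{conversation_paths:.1e}")   # 1.0e+60 distinct paths to pre-script

Even far more conservative assumptions leave a number of conversational paths that no hand-authored table of canned answers could cover.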
Imagining such a conversation, it is in fact hard to think of anything that could be left out of our database if we are to present ourselves intelligently. We must really know everything that a human would know about human life, and know it in the same way that humans know it—for instance, programming in the chemical structure of vanilla would probably not be sufficient to produce knowledge of the “clean, simple flavor” described, or to speak about it at any great length; we would instead have to know the subjective taste qualities that vanilla produces in humans (I steer clear of using the word “qualia”), which would require either an amazingly elaborate set of propositional facts or, more plausibly, would simply demand that our room include sensory apparatus. If our room could see and hear and touch and taste and smell, with all of these data (appropriately indexed and cross-referenced) piped into storage as memory, our job as programmers would be much easier. (These memories would, of course, be stored merely as contextual rulesets or in some other formal means.) This also requires that the system be able to route information from module to module for different uses, translating it from one form to another and re-weaving its network of implications and associations. (For instance, we must be able to carry the sensorily-input vanilla over to the memory-stored vanilla and then out again to the described-in-words vanilla, along with whatever other relevant tidbits it has become attached to.)
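As a very rough picture of the kind of cross-referenced, multi-module store this would require, consider the following toy sketch in Python; the module names and sample entries are invented purely for illustration.

    # A toy cross-referenced store: each concept is a node whose entries point
    # into different "modules" (sensory records, memories, words). The module
    # names and sample contents are invented illustrations.

    from collections import defaultdict

    store = defaultdict(dict)

    def associate(concept, module, record):
        store[concept][module] = record

    associate("vanilla", "sensory", "taste record #4417: sweet, mild, creamy")
    associate("vanilla", "memory",  "Sunday cones with father; walks downtown")
    associate("vanilla", "lexical", "'vanilla': flavor word, connotes plain and simple")

    def route(concept, target_module):
        """Carry a concept from one module's representation into another's."""
        return store[concept].get(target_module)

    # Moving vanilla from sensory input to stored memory and out again into words:
    print(route("vanilla", "memory"))
    print(route("vanilla", "lexical"))

However it is implemented, the essential feature is that each concept reaches into many modules at once, so that the sensory, mnemonic, and verbal versions of “vanilla” can be exchanged for one another as needed.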
Perhaps our room would even need the ability to perform actions in the world, because otherwise it seems hard to imagine how to teach it what it’s like to hit a baseball, buy groceries, win a trophy, or shoot a burglar. So let us wire the controls of our room to a walking automaton somewhere, and thus give it the ability to affect the world as well as to be affected by it. To function with any utility in the real world, we would also need a kind of “sandbox” of current inputs and short-term memories, a RAM-like “consciousness” that allows constant revision and manipulation, with access both to retrieved memories and to ongoing input and processing.
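One crude way to picture that sandbox is as a small, bounded workspace that is constantly refilled and revised. The sketch below is an invented illustration, with arbitrary names and an arbitrary capacity.

    # A crude sketch of the RAM-like "sandbox": a small, bounded workspace
    # holding current input, retrieved memories, and in-progress revisions.
    # The capacity and contents are arbitrary illustrations.

    from collections import deque

    class Workspace:
        def __init__(self, capacity=7):
            self.items = deque(maxlen=capacity)  # old items fall out as new ones arrive

        def attend(self, item):
            """Bring a fresh percept or a retrieved memory into the workspace."""
            self.items.append(item)

        def revise(self, old, new):
            """Rewrite an item in place as ongoing processing reinterprets it."""
            self.items = deque((new if i == old else i for i in self.items),
                               maxlen=self.items.maxlen)

    ws = Workspace()
    ws.attend("input: question about favorite ice cream")
    ws.attend("retrieved: Sunday cones with father")
    ws.revise("retrieved: Sunday cones with father",
              "draft answer: 'I love vanilla, because...'")
    print(list(ws.items))

The point is only that some bounded, constantly rewritten scratch space must sit between long-term storage and the world.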
Now what do we have? We have a system that receives sensory input and produces physical output. It stores a colossal index of information, not just dumbly but in an adaptable matrix of mutual relevance and relation; some nodes relate to other nodes, some relate to our input or output apparatus, and so on. We have enormously powerful processors to perform all these actions in real-time, or perhaps more reasonably, an enormous array of smaller parallel processors, since most of the tasks at hand can be described as large bundles of fairly simple computations. (The burgeoning field of connectionism describes one plausible way to produce this sort of architecture, using methods borrowed from the human brain.)
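To give the connectionist suggestion a concrete, if toy-sized, shape: the sketch below runs a large array of extremely simple units in parallel, each computing nothing more than a weighted sum and a threshold. The sizes and inputs are arbitrary; this illustrates the style of architecture, not the room itself.

    # A toy connectionist layer: many simple units working in parallel, each
    # doing only a weighted sum and a threshold. Sizes and inputs are arbitrary.

    import numpy as np

    rng = np.random.default_rng(0)

    n_inputs, n_units = 100, 10_000               # a large array of small, simple processors
    weights = rng.normal(size=(n_units, n_inputs))

    def layer(activations):
        """Every unit computes its weighted sum at once, then fires or stays silent."""
        return (weights @ activations > 0).astype(float)

    sensory_input = rng.normal(size=n_inputs)
    print(layer(sensory_input)[:10])              # the first few units' responses

Scaled up enormously and wired to the stores and workspace sketched above, this is the kind of parallel machinery the room would plausibly need.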
At this point it is next to impossible to think that a system of this power could be created using the pedestrian methods that Searle intimates—something like a paperbound book, a pencil, and some scratch paper—but, even if it would be like describing every atom in the universe in prose poetry, it is at least conceivable in principle. No matter how it is constructed, however, it now seems far from obvious that the thought experiment is still doing the work it was intended to do. Searle lays out semantic understanding as a necessary condition for thought, and suggests that formal systems can never achieve it. But what more semantics do we want? “Vanilla” is now not only a word; it is a concept with enough informational and causal associations to keep us busy for a week. The book knows what it means. Perhaps it is hard to imagine how this could constitute “understanding” in the so-called mind of our “book,” but then, it is Searle who posited such an unlikely set of tools. It is hard to imagine how molecules could constitute the New York Stock Exchange, but there you go—and if Searle wants to limit our technologies to ink and paper, it is he, rather than his critics, who must deal with the system’s non-intuitiveness.
Searle might, at this point, indicate that by invoking a Rube Goldberg array of machinery, I have moved beyond the initial thought experiment; perhaps he would agree that this device can think, but would suggest that no such machine could be built except by novel and miraculous technologies. That may be—but without the tools and complexity described above, the room would never pass muster in any legitimate Turing test. Either the program is possible, and the Chinese Room can think, or it is not possible, and the Chinese Room will fail the test. We must learn which is the case empirically—but either way, the room proves nothing.