*Originally published as "Excessive Realism in GUI Design: Helpful or Harmful?"
A debate has been raging for some time about the "best" style of user interface. At present a style, centered around the notion that users should know as little as possible about computers, and featuring graphical user interfaces, or GUIs, has carried the day. But is this the optimum situation for computer users, or has the marketing hype behind the GUI juggernaut obscured some of the real issues involved in this question? It is my contention that it has, and this article will employ a somewhat philosophical perspective, involving the use of semiotics, to highlight what those issues might be. At the end, we will descend from the empyrean heights of philosophy and look at the implications of these issues for today's workplace, and for the power structure of our organizations.
Semiotics, the study of signs, dates back to at least the ancient Greek physicians, most notably Hippocrates and Galen. They detected a common thread when confronted with phenomena as different as a patient who said, "My stomach hurts," and a distention of a patient's stomach. Both events seemed to point to something beyond themselves: in these instances, a malfunction in the patient's digestive system. They called events of this type semeion, or signs. In the fourth and fifth centuries A.D., St. Augustine formulated the first general theory of semiotics. In On Dialectics he wrote: "A sign is something which is itself sensed and which indicates to the mind something beyond the sign itself. To speak is to give a sign by means of an articulate utterance."
The Scholastic philosophers of the Middle Ages, working from the classical foundation, formulated an enduring definition of the sign process as aliquid stat pro aliquo (that which is there stands for that which is not).
In the nineteenth century, the work of American polymath Charles S. Peirce elaborated on the Scholastic theory of signs, positing the sign as "something which stands to somebody for something in some respect or capacity": the sign represents its object to an interpreter. He also developed a "trichotomy" of the ways in which a sign relates to its object, naming the three modes index, icon, and symbol. (Umberto Eco, amongst others, has argued for a more comprehensive n-chotomy system of classification, but for our purposes this three-part one will suffice.) It is with this trichotomy in mind that we will analyze the design of user interfaces for computing machines.
But before examining the user interface, we need to understand the division of our trinity. An index is related to its object by a direct physical connection. A bird's tracks in the snow, the smoke rising over a hill that indicates a fire in the next valley, a finger answering the question, "Which box?" by pointing at one, and the use of a (computer) mouse to point to a document are all examples of indexical signs.
An icon is related to its object by likeness. A silhouette of a deer on a yellow diamond by the roadside that signifies a deer crossing, the skull and bones on a bottle of poison indicating death, the photo of an entrée in a menu, and a paint brush on the tool bar of a paint program are all primarily iconic in nature.
A symbol relates to its object by convention only. The word /bear/ is only associated with the big furry animal by convention — there is no reason why it couldn't have been called /tree/, or /golf/. (I will follow Eco's convention of using slashes to indicate something intended as an expression and guillemets to indicate something intended as content. Thus, /bear/ is an expression referring to a <<bear>>.)
The primary example of symbolic communication is spoken human language, but there are others: the fish standing for Christ, the commands of a programming language, or mathematical symbols. (Peirce points out that mathematical equations, in contrast, work primarily by likeness of form, and are therefore iconic.)
These three modes of sign production are usually intermingled in a single sign, but one mode will dominate. A large statue of a bull stands in the middle of Broadway, in the midst of New York City's financial district. At first this might seem to operate as an icon: it looks like a bull! But because of its location (the indexical mode at work), the bull chiefly operates as a symbol for Wall Street, with which it is associated mainly by convention.
Computing machines were once the sole domain of those who could master the complex symbolic systems used to program them, such as COBOL, LISP, and FORTRAN. Even to run a program required mastery of something like Job Control Language. As interactive terminals became popular, menu and function key interfaces became the norm for business applications. Eco would call these interfaces a correlational code, a form of symbolical shorthand providing a set of correlations between sign and object, e.g., numbered menu items where striking the '1' key is a code for "run accounting subsystem". Since they are not generative, codes of this type allow users to invoke only a highly restricted set of the functions in the system in any particular context. The subset that is available in a particular context is displayed on screen during the time that context is active. The classical example is the minicomputer vertical market application of the 1970s:
SGT, Ltd.
1. Run Accounting Subsystem
2. Run Inventory Subsystem
3. Run Payroll Subsystem
L. Logoff System
This type of interface at least made it possible to employ non-technical people as data-entry clerks and the like. Still, the connection between the key the user must press and the action taken by the system is entirely a matter of local convention. Another difficulty is that the code is not a straightforward mapping: /1/ doesn't even mean "run the accounting subsystem" everywhere in a single business's system. In another context it might list accounts, or print all account balances.
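The context dependence just described can be made concrete with a small sketch. It is only an illustration: the "main" table repeats the menu shown above, while the "accounting" table, the context names, and the function name are assumptions of mine, not taken from any real system.

```python
# Hypothetical menu tables: each context has its own key-to-action code.
# "main" mirrors the sample menu; "accounting" is invented for illustration.
MENUS = {
    "main": {
        "1": "run accounting subsystem",
        "2": "run inventory subsystem",
        "3": "run payroll subsystem",
        "L": "log off system",
    },
    "accounting": {
        "1": "list accounts",
        "2": "print all account balances",
        "L": "return to main menu",
    },
}


def interpret(context: str, key: str) -> str:
    """Look up what a keystroke means in the given context."""
    try:
        return MENUS[context][key]
    except KeyError:
        # The code is not generative: anything outside the table means nothing.
        return "invalid selection"


print(interpret("main", "1"))
print(interpret("accounting", "1"))
print(interpret("main", "1 2"))
```

The same key, /1/, yields a different action in each context, and a combination such as "1 2" has no meaning at all, since the code offers no way to build new expressions out of old ones.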
Meanwhile, software developers at Bell Labs took a radically different approach. Their system, UNIX, was designed for people whose jobs centered around executing designs of their own by controlling computers through a terminal. Foremost among these groups were software engineers, but the system also proved useful for system administrators, research scientists, professional typesetters, engineers, traders in the financial markets, and others. These people were generally adept at symbolic systems and wanted to use the computer in flexible ways, creating new processes out of those supplied with the system, and they wanted to employ all previously articulated system functions regardless of context. They needed something more like human language, with its rich abilities to generate new sentences, and to speak about itself. The system design therefore emphasized combinatorial power and convenience, at the expense of ease of learning. The tools developed (the various UNIX shells and utilities) were more extensible and less context-dependent than the above-mentioned business systems, but because of this it was impractical to list the meanings of all options available at any particular moment: the flexibility of the system meant that there was usually a bewildering variety of them. Unfortunately, this prevented many people from ever attempting anything with these systems. UNIX only penetrated the general business market when there was a menu-driven business application running over "raw" UNIX. The user interface for MS-DOS, coming a decade after UNIX, was basically just a watered-down version of the UNIX shell.
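To illustrate the generative quality at issue, here is a minimal sketch written in Python rather than in a UNIX shell. The stage names are stand-ins of my own, chosen to echo familiar utilities; they are not those utilities, and the sample log is invented.

```python
from typing import Callable, Iterable, Iterator

# A stage takes a stream of lines and yields a new stream of lines.
Stage = Callable[[Iterable[str]], Iterator[str]]


def pipeline(source: Iterable[str], *stages: Stage) -> Iterable[str]:
    """Feed the output of each stage into the next, like a shell pipe."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return stream


def grep(word: str) -> Stage:
    """Keep only the lines containing the given word."""
    return lambda lines: (line for line in lines if word in line)


def cut(field: int, sep: str = " ") -> Stage:
    """Keep only one separator-delimited field of each line."""
    return lambda lines: (line.split(sep)[field] for line in lines)


def sort_unique(lines: Iterable[str]) -> Iterator[str]:
    """Drop duplicate lines and emit the rest in sorted order."""
    return iter(sorted(set(lines)))


# Invented sample data for the illustration.
log = ["alice login", "bob login", "alice logout", "carol login"]

# A new "sentence" built from existing words: who has logged in?
result = list(pipeline(log, grep("login"), cut(0), sort_unique))
print(result)
```

None of the stages knows about the others, yet any of them can be recombined into a process its designers never listed on a menu. That is the sense in which such a system resembles a language rather than a correlational code.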
User interfaces for computer programs took a large step forward when they began to incorporate more indexical and iconic elements in their design. User-interface elements were developed, chiefly at Xerox PARC, which enabled users to point at objects on a bit-mapped display (indexical signing), manipulate them by dragging them with a mouse (again, indexical), and which represented the underlying logical entities by things that "looked like them" on screen (iconism). Because indices and icons are understood before symbols (babies recognize the touch of their mother and her face long before they understand the concept of "mothers"), these novel elements of user interface design proved a boon to new computer users, casual users, and those entering an unfamiliar computing domain. And by adding to text such elements as color, font, and visual setting (e.g., a stop sign next to a warning message in a dialog box), they can enhance an experienced user's interactions with the computer as well.
However, as in any instance where people have found the "greatest thing since the napkin," there is a tendency to exaggerate its benefits. Over the past few years various opinions have been advanced denouncing adherents of command-line interfaces and symbolic methods of dealing with computers as a "programming elite," "computer priesthood," or, most simply, "nerds." According to this school of thought, the only reasons someone would cling to these antiquated methods of computer-human interaction are a reactionary aversion to change or a vested interest in keeping computer-use an arcane and esoteric subject. For a typical example, see the article by Nixdorf and Kiyooka cited in the bibliography below.
These arguments do not withstand a semiotic analysis of the issues of human-computer interaction. Within this domain, as in other forms of human semiosis (the process of sign interpretation), communication is most quickly established through the use of indexical and iconic signs. When different linguistically distinct cultures first interact, they communicate with each other by gesture, mime, pointing, and by actually dragging the thing they want to mention into view. But this is a brief stage of their interaction, and quickly they learn each other's languages. Without a symbolic mode of semiosis there is too much that is difficult, if not impossible, to discuss.
In fact, the symbolic mode of semiosis is the one that is most distinctively human. Thomas Sebeok contends that semiosis is engaged in by everything we consider living, and is the clearest determinant as to whether we are dealing with a living entity or not. A plant exhibiting a tropism is not growing towards the light through a direct response to the sun, but rather as a result of a chemical intermediary (a sign), which the plant interprets as indicating the presence of light (the object) in some particular direction. Even at the cellular level, our DNA and RNA act as semiotic systems, interpreting signs (the genetic code) as body parts and specific behaviours. But, according to Martin Krampen, in the plant world we find predominantly indexical signs, in the (non-human) animal kingdom, indexical and iconic signs, and only in the human world do we see the full gamut of indices, icons, and symbols. The dance of the bee is always a sign that there is honey there (an indexical sign), never a poem about honey, or reminiscences about wonderful honey from days gone by.
What do we lose by minimizing the amount of symbolic signing potentially present in human-computer interactions? It is the ability to step back from the system and productively reason about it as a system, to disengage from the task of the moment and examine the entire process by which tasks are being performed. Indexical and iconic signs do not facilitate this; symbolic signing does.
P.B. Andersen, in his study of the semiotics of computer systems, makes the distinction between motivated signs, such as a bit map of a formatted letter which "looks like" the object being represented, and arbitrary signs, e.g., WordStar's "dot language" for representing formatting. He points out that motivated (indexical) signs have the disadvantage of being less manipulable than arbitrary (symbolic) signs. A program like Microsoft Word (my word processor of choice at present) is touted for its advance in ease of use over older interfaces like that of WordStar circa 1985. It is indeed simpler to learn how to italicize a stretch of text in Word than in the original WordStar, which is one reason I use it. But when you need to read a file through some kind of filter, extracting the stretches of italicized text, what do you do? If the text was formatted with a "primitive" system, like WordStar's dot language, this task is relatively simple (in awk, C, Basic, a macro language, etc.), but how would you go about this for Word? The file structure behind Word's "translucent" graphical interface takes a good software engineer a couple of weeks to understand.
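How simple the filter can be when formatting is expressed symbolically is easy to sketch. The example below assumes a hypothetical plain-text markup in which a single toggle character switches italics on and off; the marker and the function name are my own inventions for illustration and make no claim about WordStar's actual file format.

```python
ITALIC = "~"  # hypothetical toggle marker, not any real product's format


def italic_spans(text: str, marker: str = ITALIC) -> list[str]:
    """Return the stretches of text enclosed between pairs of toggle markers."""
    pieces = text.split(marker)
    # After splitting on a toggle, the odd-numbered pieces lie inside a span.
    # An unmatched final marker leaves a dangling piece, which is dropped.
    if len(pieces) % 2 == 0:
        pieces = pieces[:-1]
    return pieces[1::2]


sample = "The ~dance~ of the bee is ~never~ a poem about honey."
print(italic_spans(sample))
```

A dozen lines suffice because the sign for italics is an arbitrary one that lies open in the text, available to any program that reads characters.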
Current GUIs, composed of motivated signs, are a type of mise-en-scène, where the designer(s) has staged a play, of the realistic school, for the user. Motivated signs (icons, geometric shapes, noises, and representations of what a printed copy of some piece of one of your "documents" would look like) all move around on the stage of cyberspace, making believe that they are familiar objects like file folders, garbage cans, paintbrushes, scissors, human voices, and the like. The point of the staging is to hide the signifier, so that "you'll never even realize that you're working on a computer." Your interaction with the computer-as-computer can be moved into the realm of instinct, much as the knowledge of how to drive a stick-shift car can be. But the effect is anaesthetic, for the realm of instinct is precisely those actions which we aren't aware of. Iconic signs tend to mask the signifier by masquerading as the signified, as when an icon of a paintbrush and one of a canvas play at being ordinary painting tools. Symbolic signs tend to awaken the user to the knowledge of another plane of meaning, open for analysis, because they do not directly point or relate to the signified. Someone might become frightened at a hidden CD playing the sound of a bear (indexicality), or at a hologram of a bear (iconism), thinking that a real bear was nearby, but no one mistakes the word /bear/ for an actual instance of one. The fact that a symbol's connection to its object is entirely arbitrary exposes the sign as sign, leading the interpreter to contemplate why the sign should not have been other, eventually giving rise to all of the "meta" activities such as literature, linguistics, puns, the play on words, logic, mathematics, rhetoric, grammar, metaphysics, and so on.
After a lengthy study of human-computer and human-human interaction in the Stockholm Postal Giro, Andersen, borrowing a concept, Verfremdung, from Brecht (which Brecht, in turn, took from the Russian Formalists), warns against excessive realism. A completely "realistic" interface, where you paint with the image of a brush and send e-mail to the image of a person, lacks the capability to talk about itself. It is simple to iconically represent a worker at the Postal Giro performing his normal task, e.g., sorting, but how do you represent him when he has stopped sorting and is analyzing, with a colleague, the flow of envelopes through the sorting room? How do you create an icon which represents an action you want performed on a subset of the other icons in a system? Does it show "you" talking to the icons? But this is no longer intuitive, nor truly iconic — you might as easily interpret this icon as meaning you want the "contents" of all of your "icons" spoken aloud, or that you want the system to eat them, or vomit them up. Dave Kansas of The Wall Street Journal, in an article entitled "The Icon Crisis: Tiny Pictures Cause Confusion," reports on the difficulty users have deciphering the meaning of icons that "depict" abstract (i.e., symbolic) concepts. It turns out that users select the undo function from a menu 80% of the time, as the undo "icon" does not adequately convey the idea of undoing. As Andersen says, "direct engagement and detached analysis and invention require different semiotic focuses. The former focuses on the signified, the latter sometimes also on the signifier . . . The realistic style fails to support analysis and invention and must be modified . . . ."
Modifying this style is just what has been occurring lately. The success of HTML and the Web is due in part to the ease with which the underlying symbolic language can be understood and manipulated. A recent survey of Web content creators showed that, despite the flood of WYSIWYG HTML tools, the majority of them edit HTML directly in a text editor! The ease of building a GUI in HTML contrasts sharply with the arcane obscurity of the Windows API. Andersen's paradigmatic example of a new style of interface is HyperCard, which allows a smooth transition from a GUI to a symbolic-language style of interaction. Apple has added a scripting language to System 7, and Microsoft a Basic interpreter to Office products. The GUI-based Interleaf Document Management System runs on LISP under the hood, and the user has access to the LISP interpreter. All of the UNIX GUIs have allowed a command-line shell window. Each of these GUIs allows user actions to rise to the level of discourse, and to move into a symbolic mode, in which it is easier to talk about the system-as-system.
These points are not mere intellectual musing, but relate vitally to the power structures of our workplaces and industries. For one thing, the complexity of programming many GUI interfaces protects entrenched companies — GUI programs, when built from scratch, require much greater capital expenditure before launching a new product than did character-based DOS or UNIX applications. More importantly, when users are only supposed to be aware of a "realistic" illusion of a desktop, and not of the computer itself, then they are fully at the mercy of the computer elite, completely reliant on experts when something goes wrong. The proponents of these "air-tight" interfaces often use cars as an analogy, claiming that "you don't have to be a mechanic to drive." True, but if your livelihood depends on a motor vehicle then you'd better know something about what's under the hood, or you'll find yourself at the mercy of fast-talking used-car dealers, unscrupulous mechanics, and breakdowns on deserted highways. A great painter doesn't forget canvas, brush, and paint, but fully integrates her knowledge of them with her abstract ideas to create art. Similarly, it is important for a computer artist to remember that it is a computer he is working on. To quote Andersen again:
Cast-iron realism prevents the user from getting ideas for modifying her tool and changing her working conditions because the technical workings of the system are so to speak sealed up. The Macintosh system I myself use is a good example of this. If a user wants to go beyond its friendly user interface, he enters a completely new world consisting of files, forks, resources, and similar strange creatures he has never encountered in the use situation. However, this is clearly not a technical necessity but must be seen as an — [in] my view misguided and unfounded — assumption about the roles users wish to occupy.
Indexical and iconic signs have proved to be powerful additions to the art of user interface design, and my point is not to disparage their use. But symbolic signs, by their very nature, will continue to be the most sophisticated method of interacting with computers, at least until our descendants, or another semiotic entity, evolves the next step in the saga of semiosis.
Andersen, P.B. 1990. A Theory of Computer Semiotics. Cambridge, New York, Port Chester, Melbourne & Sydney: Cambridge University Press.
Crichton, Michael. 1993. "Installer Hell," in the September issue of Byte. Peterborough, New Hampshire: McGraw-Hill, Inc.
Eco, Umberto. 1976. A Theory of Semiotics. Bloomington, Indiana: Indiana University Press.
Eco, Umberto. 1984. Semiotics and the Philosophy of Language. Bloomington, Indiana: Indiana University Press.
Hawkes, Terence. 1977. Structuralism and Semiotics. Berkeley and Los Angeles: University of California Press.
Kansas, Dave. 1993. "The Icon Crisis: Tiny Pictures Cause Confusion," in the November 17 Wall Street Journal. New York: Dow Jones & Company, Inc.
Martin, James. 1985. Fourth-Generation Languages. Volume 1. Englewood Cliffs, New Jersey: Prentice-Hall, Inc.
Nixdorf, Troy and Kiyooka, Gen. 1992. "Substance and Style: GUI Design and Culture," in the February issue of Computer Language.
Peirce, Charles S. 1877-1906. Philosophical Writings of Peirce, ed. Justus Buchler. New York: Dover Publications, Inc.
Sebeok, Thomas. 1991. American Signatures: Semiotic Inquiry and Method. Norman and London: University of Oklahoma Press.
Sebeok, Thomas. 1991. A Sign Is Just a Sign. Bloomington and Indianapolis, Indiana: Indiana University Press.
Augustine quoted from: Todorov, Tzvetan. 1982. Theories of the Symbol. Ithaca, New York: Cornell University Press.
This article originally appeared in the Nov. 1994 issue of Software Development, and is reproduced with the permission of Software Development. This version contains modifications to bring it up to date with recent industry developments, and to incorporate suggestions and corrections made by Larry Constantine and Thomas Sebeok.
This page was brought to you by ksh, vi, m4, sed & make,
courtesy of openbsd.
Last changed: Tue Nov 8 17:32:17 CET 2016