Understanding how humans interact with computer systems is surely an important issue. This hardly needs to be stated. Each designer of an interactive system must confront this issue in the way his system interfaces to the user, but he does this without any systematic knowledge or technique to help him. It is purely a seat-of-the-pants art, with all the attendant features that that implies: interesting innovations, isolated flashes of brilliance, lots of imitation of stylistic features, and mostly things that don't really work (the same mistakes seemingly being made over and over again). In short, it is the dominance of function by (superficial) form. But the form of the user interface must serve a function, and we need to enhance the designer's ability to predict and control how it will function. How should we go about this? I will begin by sketching a few approaches.
The following are some approaches to studying the interaction between humans and computers, most of which are being investigated by the AIP group at Xerox PARC.
AIP stands for the Applied Information Processing Psychology Project. This is a small group including myself, Stuart Card, and Allen Newell (who consults regularly with us from Carnegie-Mellon University). Our goal in AIP is to create an applied psychology that is directly useful to the system designer. Our initial efforts are being written up in a forthcoming book [1].
In AIP we have mostly concentrated on models that predict performance time by expert users, and we have dabbled a bit in learning models. Don Norman and Dave Rumelhart, at the University of California at San Diego, are exploring some approximate models of human short-term memory that might be useful to system designers.
All of these approaches - plus others - should be pursued. They are not independent; they all support each other. I have been leading up to the last approach, which I would like to discuss further, not because the other approaches are less worthy, but because the notion of a design representation fills in where the other approaches are deficient. The most important feature of the design representation approach is its completeness - it must cover all phases of design and all aspects of user interfaces. Thus, it can serve as a framework to organize user interface issues. The rest of this paper will be devoted to exploring one attempt at a design representation.
It is useful to begin by defining what we mean by the user interface. The user interface of a system is usually contrasted with its implementation. The user interface is that part of a system that the user comes in contact with physically, perceptually, or conceptually. By this definition, the user interface is not always what the designer intends, for many a system compels the user to be aware of some aspects of its implementation. For example, consider the user who clears and reinitializes the display so that the system will respond faster (because he knows that this will clean up some of the system's internal data structures). This gross characterization of the internal workings of the system thereby becomes part of the user interface. Another way to state this is that the user will build a mental model of a system as he learns it and uses it. Any aspect of the system that shows through to the user and enters the user's model is a part of the user interface.
The basic tenet of this paper is that to design the user interface is to design the user's model. The designer needs to understand the functional structure of the user interface, the pieces of which it is composed and how they work together. And he needs to understand what knowledge goes into the user's model. In this paper I propose an analysis of the structure of the user interface that at once is useful for the designer and sheds some light on the psychology of the user's model.
I believe that a useful way to look at the variety of interactive systems is as command language systems. A command language is characterized by a command-execute cycle: the user specifies a command to the system which immediately executes it, then the user specifies another command, and so on. This can be contrasted with a programming language, in which a set of commands is built up before execution. There is, of course, no hard line between these. As a command language delays execution, it becomes more like a programming language. (While programming languages are central to Computer Science, command languages are but a minor topic under the area of operating systems.) Command language systems can also be contrasted with natural language systems. Here the variation is in complexity of grammar and subtlety of interpretation. Command languages are very simple compared to natural languages, which lie at the extreme of complexity. But command language systems can be quite sophisticated in their use of syntactic and conversational devices that are characteristic of natural language.
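The command-execute cycle described above can be made concrete with a small present-day sketch. The command names (show, delete) and the system state here are purely hypothetical illustrations, not drawn from any system discussed in this paper; the point is only the cycle itself: each command is executed the moment it is specified.

```python
# A minimal sketch of a command-execute cycle: the user specifies one
# command, the system executes it immediately, and the cycle repeats.
# This contrasts with a programming language, where a whole set of
# commands is built up before any execution occurs.
# The commands "show" and "delete" are hypothetical examples.

def make_system():
    state = {"items": ["a", "b", "c"]}

    def execute(command, *args):
        # Each command takes effect as soon as it is specified.
        if command == "show":
            return list(state["items"])
        elif command == "delete":
            state["items"].remove(args[0])
            return list(state["items"])
        else:
            return "unknown command"

    return execute

execute = make_system()
print(execute("show"))          # the user issues a command...
print(execute("delete", "b"))   # ...the system executes it at once
```

Delaying the effect of `execute` until several commands had been queued would, in the terms above, move this toy system along the spectrum toward a programming language.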
In this paper we will focus on the notion of a command language system. We will articulate the structure of command language systems by developing a representation for describing them that is tailored to their structure. The representation is called CLG - the Command Language Grammar. It is called a grammar because it can be used to generate a wide variety of command language system descriptions. Just how varied (e.g., whether it also covers programming languages) is an open question, but surely enough systems are included to make this a useful paradigm. CLG is thus a representation that designers can use to create new systems.
A CLG representation of a system is a description of the system as the user sees it and understands it, that is, the user interface. The psychological hypothesis behind CLG is that CLG describes the user's mental model of the system. Since this model is exactly what the designer of the user interface should be working with, CLG is also structured to be useful during the system design process.
Conceptual Component:
    Task Level
    Semantic Level
Communication Component:
    Syntactic Level
    Interaction Level
Physical Component:
    (Spatial Layout Level)
    (Device Level)

Figure: Level Structure of CLG
The structure of CLG is outlined in the figure above. There are three major components to the user interface of a system. The Conceptual Component contains the abstract concepts around which the system is organized; the Communication Component contains the command language and the conversational dialog; and the Physical Component contains the physical devices that the user sees and comes in contact with. CLG is organized around these components. A CLG representation is made up of a sequence of description levels, each level being a refinement of previous levels. The Task Level describes the task domain addressed by the system, and the Semantic Level describes the concepts represented by the system. Together these two Levels define the Conceptual Component of the system. The Syntactic Level describes the context and command-argument structure, and the Interaction Level describes the dialog structure. These make up the Communication Component. The Spatial Layout Level describes the arrangement of the input/output devices and the display graphics; all the remaining physical features are described at the Device Level.
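The component-and-level organization just described can be summarized as a small data structure. The code below merely transcribes the figure above; the grouping and names come from the paper, but expressing them this way is not part of CLG's own notation.

```python
# The three components of the user interface, each refined by its
# CLG description Levels, transcribed from the level-structure figure.
CLG_STRUCTURE = {
    "Conceptual Component": ["Task Level", "Semantic Level"],
    "Communication Component": ["Syntactic Level", "Interaction Level"],
    "Physical Component": ["Spatial Layout Level", "Device Level"],
}

# The Levels form a sequence, each Level a refinement of the ones
# before it, which is what permits a top-down design reading.
CLG_LEVELS = [level for levels in CLG_STRUCTURE.values() for level in levels]
print(CLG_LEVELS)
```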
I have done little work as yet on the last two CLG Levels (the Physical Component), and they will not be discussed much here. However, the reader should keep them in mind, especially the Spatial Layout Level, as the place-holders for the issues of display design, which will appear to be missing from CLG. We will concentrate on exploring the first four CLG Levels: the Task, Semantic, Syntactic, and Interaction Levels.
CLG has precise conventions for specifying the information at all Levels. There is little space in this paper to spell out the details of CLG's descriptive notation; a more complete presentation can be found in [3]. Instead, I will simply list some of the important features of the CLG representation.
The most important feature of CLG is its stratification into Levels. The purpose of the Level structure of CLG is to separate the conceptual model of a system from its command language and to show the relationship between them. Thus, the conceptual model can be explicitly laid out and dealt with by the system designer (or by the cognitive psychologist studying the user's model). The CLG Levels are arranged so that they map onto each other in sequence. Each Level describes a class of systems by abstraction, and each Level makes only limited assumptions about the system being described. This can be interpreted as a top-down design sequence; first the conceptual model is created, then a command language that implements that conceptual model is designed (and then a display layout is designed to support the command language).
The Task Level defines the goals of the system by enumerating a set of benchmark tasks against which to evaluate it. This Level only assumes the gross functionality of the system and a unit task structure of user behavior (see [4]).
The Semantic Level simply enumerates the concepts embodied in the system by mapping from the Task Level; there is no deep analysis of these concepts. There is no concern at this Level for the communication of concepts; concepts are defined abstractly (e.g., with no commitment as to how they are named or how they will be graphically represented).
The Syntactic Level is concerned with how the Semantic concepts can be packaged so that they can efficiently be communicated. This Level asserts that all communication can be structured by a few syntactic building blocks. One of the major deficiencies in CLG at present is the lack of generative mapping rules between the Semantic and Syntactic Levels. Such rules would pin down the space of possibilities in going from conceptual models to command languages.
The Interaction Level is concerned with the dynamics of interaction. This Level asserts that all interactions can be composed out of a few functional constituents (e.g., prompts, designations, terminations, etc.). These constituents are generated by mapping from the Syntactic elements. This Level (the most highly developed thus far in CLG) describes interactions by using patterned rules, which clearly capture the consistency (or inconsistency) of a system's interaction conventions.
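Since CLG's actual rule notation appears only in [3], the following is just a guess at the flavor of such patterned rules: each command maps to an ordered pattern of functional constituents, so the consistency of a system's interaction conventions can be checked mechanically. All command names and constituent names here are hypothetical.

```python
# Hypothetical sketch: each command's interaction is generated by a
# patterned rule listing its functional constituents in order.
# A system's conventions are consistent to the extent that commands
# with the same argument structure share the same pattern.

RULES = {
    # command -> ordered functional constituents of its interaction
    "delete": ["prompt", "designate-object", "terminate"],
    "move":   ["prompt", "designate-object", "designate-place", "terminate"],
    "copy":   ["prompt", "designate-object", "designate-place", "terminate"],
}

def consistent(cmd_a, cmd_b):
    """True when two commands follow the same interaction pattern."""
    return RULES[cmd_a] == RULES[cmd_b]

print(consistent("move", "copy"))    # same pattern
print(consistent("move", "delete"))  # different patterns
```

A check of this kind is one way a representation could surface an inconsistency (a nit, in the terms of the closing section) before a user ever encounters it.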
The Spatial Layout Level is concerned with exactly how the system's display will look at each point during the interactions. Note that, even though we may be representing a highly graphic system in CLG, the issue of the graphics per se does not show up until many Levels down. That is to say, the graphics is embedded in a conceptual and a communication context, and its function is to support those contexts. In this sense, there is little structural difference between graphics and "non-graphics" systems.
CLG can describe user interfaces. But we have here the potential for much more than this bland statement suggests. I believe that it is useful to look at CLG from at least three different views: the linguistic, the psychological, and the design view.
The fact that I propose three views of CLG does not reflect any ambivalence. Rather, I see CLG in all of these ways at once. The linguistic view is concerned with the constraints imposed on the interaction by the system; the psychological view is concerned with the constraints imposed by the user; and the design view is concerned with bringing the two together. The payoff for CLG is its usefulness in design, but the design view of CLG rests on the other two views: on the linguistic view for its generative capacity and on the psychological view for its psychological validity. If all of these views can be resolved into the same representational system, then the system designer can truly be working with the user's model.
I have introduced the notion of a design representation as one approach (out of many) to studying human-computer interaction and the design of user interfaces. I have proposed CLG as a specific candidate for such a representation. I am not promoting CLG per se, but rather the general notion of a comprehensive framework for bringing together a variety of user interface issues. Having CLG around as an example of such a representation helps clarify what we really want and don't want in one. Several features of CLG, such as the structuring into levels and the limited stock of command language building blocks, would seem to be desirable in any proposal. CLG is also severely (but not necessarily irreparably) deficient in several respects. For example, it is a long way from being able to describe a system to helping the designer make design decisions. Any proposal must undergo considerable empirical testing to see if it actually helps designers.
In closing, let me give an example of something that I would like a proposed framework to do. I have a collection of hundreds of nits about various systems in common use around Xerox. A nit is a specific complaint by a user about some particular interaction feature of a system. (Nits are familiar to any computer user, for example: "I can never remember the name of the X command"; "the X command always assumes the current line, which is often not what I want"; "the X button means Y sometimes and Z at other times"; "the system doesn't allow me to do task X"; "the X command, which is needed frequently, is too awkward to call"; "while trying to call the X command, it is too easy to slip and call the Y command"; "I can't tell what the system is doing during the X command"; "I can't understand the error message from the X command"; and on and on.) Each nit suggests one or more design principles that would alleviate the situation (and often conflicting principles that support the situation). The principles are easy to generate (setting aside the issue of whether each one is empirically correct or not). The problem is that we are overwhelmed with principles. What is needed is a framework for making sense of them, for telling when they apply, and for resolving the conflicts between them. CLG was created in response to this problem, but I have not yet applied CLG to the collection of nits and principles. It remains to be seen whether CLG will help. In any event, I suggest this as a good test for any proposed framework.
[1] S. K. Card, T. P. Moran, & A. Newell. Applied Information-Processing Psychology: The Human-Computer Interface. Book in preparation.
[2] S. K. Card & T. P. Moran. A Simple Model for Predicting User Efficiency with Interactive Systems. Xerox Palo Alto Research Center, Report SSL-78-4, 1978 (submitted to Communications of the ACM).
[3] T. P. Moran. Introduction to the Command Language Grammar: A Representation for the User Interface of Interactive Computer Systems. Xerox Palo Alto Research Center, Report SSL-78-3, 1978.
[4] S. K. Card, T. P. Moran, & A. Newell. The Manuscript Editing Task: A Routine Cognitive Skill. Xerox Palo Alto Research Center, Report SSL-76-8, 1976, (submitted to Cognitive Psychology).