Understanding how humans interact with computer systems is surely an important issue. This hardly needs to be stated. Each designer of an interactive system must confront this issue in the way his system interfaces to the user, but he does this without any systematic knowledge or technique to help him. It is purely a seat-of-the-pants art, with all the attendant features that that implies: interesting innovations, isolated flashes of brilliance, lots of imitation of stylistic features, and mostly things that don't really work (the same mistakes seemingly being made over and over again). In short, it is the dominance of function by (superficial) form. But the form of the user interface must serve a function, and we need to enhance the designer's ability to predict and control how it will function. How should we go about this? I will begin by sketching a few approaches.
The following are some approaches to studying the interaction between humans and computers, most of which are being investigated by the AIP group at Xerox PARC.
AIP stands for the Applied Information Processing Psychology Project. This is a small group including myself, Stuart Card, and Allen Newell (who consults regularly with us from Carnegie-Mellon University). Our goal in AIP is to create an applied psychology that is directly useful to the system designer. Our initial efforts are being written up in a forthcoming book [1].
In AIP we have mostly concentrated on models that predict performance time by expert users, and we have dabbled a bit in learning models. Don Norman and Dave Rumelhart, at the University of California at San Diego, are exploring some approximate models of human short-term memory that might be useful to system designers.
All of these approaches - plus others - should be pursued. They are not independent; they all support each other. I have been leading up to the last approach, which I would like to discuss further, not because the other approaches are less worthy, but because the notion of a design representation fills in where the other approaches are deficient. The most important feature of the design representation approach is its completeness - it must cover all phases of design and all aspects of user interfaces. Thus, it can serve as a framework to organize user interface issues. The rest of this paper will be devoted to exploring one attempt at a design representation.
It is useful to begin by defining what we mean by the user interface. The user interface of a system is usually contrasted with its implementation. The user interface is that part of a system that the user comes in contact with physically, perceptually, or conceptually. By this definition, the user interface is not always what the designer intends, for many a system compels the user to be aware of some aspects of its implementation. For example, consider the user who clears and reinitializes the display so that the system will respond faster (because he knows that this will clean up some of the system's internal data structures). This gross characterization of the internal workings of the system thereby becomes part of the user interface. Another way to state this is that the user will build a mental model of a system as he learns it and uses it. Any aspect of the system that shows through to the user and enters the user's model is a part of the user interface.
The basic tenet of this paper is that to design the user interface is to design the user's model. The designer needs to understand the functional structure of the user interface, the pieces of which it is composed and how they work together. And he needs to understand what knowledge goes into the user's model. In this paper I propose an analysis of the structure of the user interface that at once is useful for the designer and sheds some light on the psychology of the user's model.
I believe that a useful way to look at the variety of interactive systems is as command language systems. A command language is characterized by a command-execute cycle: the user specifies a command to the system which immediately executes it, then the user specifies another command, and so on. This can be contrasted with a programming language, in which a set of commands is built up before execution. There is, of course, no hard line between these. As a command language delays execution, it becomes more like a programming language. (While programming languages are central to Computer Science, command languages are but a minor topic under the area of operating systems.) Command language systems can also be contrasted with natural language systems. Here the variation is in complexity of grammar and subtlety of interpretation. Command languages are very simple compared to natural languages, which lie at the extreme of complexity. But command language systems can be quite sophisticated in their use of syntactic and conversational devices that are characteristic of natural language.
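The command-execute cycle described above can be made concrete with a small present-day sketch. The command names (show, delete) and the system state here are purely hypothetical illustrations, not drawn from any system discussed in this paper; the point is only the cycle itself: each command is executed the moment it is specified.

```python
# A minimal sketch of a command-execute cycle: the user specifies one
# command, the system executes it immediately, and the cycle repeats.
# This contrasts with a programming language, where a whole set of
# commands is built up before any execution occurs.
# The commands "show" and "delete" are hypothetical examples.

def make_system():
    state = {"items": ["a", "b", "c"]}

    def execute(command, *args):
        # Each command takes effect as soon as it is specified.
        if command == "show":
            return list(state["items"])
        elif command == "delete":
            state["items"].remove(args[0])
            return list(state["items"])
        else:
            return "unknown command"

    return execute

execute = make_system()
print(execute("show"))          # the user issues a command...
print(execute("delete", "b"))   # ...the system executes it at once
```

Delaying the effect of `execute` until several commands had been queued would, in the terms above, move this toy system along the spectrum toward a programming language.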
In this paper we will focus on the notion of a command language system. We will articulate the structure of command language systems by developing a representation for describing them that is tailored to their structure. The representation is called CLG - the Command Language Grammar. It is called a grammar because it can be used to generate a wide variety of command language system descriptions. Just how varied (e.g., whether it also covers programming languages) is an open question, but surely enough systems are included to make this a useful paradigm. CLG is thus a representation that designers can use to create new systems.
A CLG representation of a system is a description of the system as the user sees it and understands it, that is, the user interface. The psychological hypothesis behind CLG is that CLG describes the user's mental model of the system. Since this model is exactly what the designer of the user interface should be working with, CLG is also structured to be useful during the system design process.
Conceptual Component:
    Task Level
    Semantic Level
Communication Component:
    Syntactic Level
    Interaction Level
Physical Component:
    (Spatial Layout Level)
    (Device Level)

Figure: Level Structure of CLG
The structure of CLG is outlined in the figure above. There are three major components to the user interface of a system. The Conceptual Component contains the abstract concepts around which the system is organized; the Communication Component contains the command language and the conversational dialog; and the Physical Component contains the physical devices that the user sees and comes in contact with. CLG is organized around these components. A CLG representation is made up of a sequence of description levels, each level being a refinement of previous levels. The Task Level describes the task domain addressed by the system, and the Semantic Level describes the concepts represented by the system. Together these two Levels define the Conceptual Component of the system. The Syntactic Level describes the context and command-argument structure, and the Interaction Level describes the dialog structure. These make up the Communication Component. The Spatial Layout Level describes the arrangement of the input/output devices and the display graphics; all the remaining physical features are described at the Device Level.
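The component-and-level organization just described can be summarized as a small data structure. The code below merely transcribes the figure above; the grouping and names come from the paper, but expressing them this way is not part of CLG's own notation.

```python
# The three components of the user interface, each refined by its
# CLG description Levels, transcribed from the level-structure figure.
CLG_STRUCTURE = {
    "Conceptual Component": ["Task Level", "Semantic Level"],
    "Communication Component": ["Syntactic Level", "Interaction Level"],
    "Physical Component": ["Spatial Layout Level", "Device Level"],
}

# The Levels form a sequence, each Level a refinement of the ones
# before it, which is what permits a top-down design reading.
CLG_LEVELS = [level for levels in CLG_STRUCTURE.values() for level in levels]
print(CLG_LEVELS)
```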
I have done little work as yet on the last two CLG Levels (the Physical Component), and they will not be discussed much here. However, the reader should keep them in mind, especially the Spatial Layout Level, as the place-holders for the issues of display design, which will appear to be missing from CLG. We will concentrate on exploring the first four CLG Levels: the Task, Semantic, Syntactic, and Interaction Levels.
CLG has precise conventions for specifying the information at all Levels. There is little space in this paper to spell out the details of CLG's descriptive notation; a more complete presentation can be found in [3]. Instead, I will simply list some of the important features of the CLG representation.
The most important feature of CLG is its stratification into Levels. The purpose of the Level structure of CLG is to separate the conceptual model of a system from its command language and to show the relationship between them. Thus, the conceptual model can be explicitly laid out and dealt with by the system designer (or by the cognitive psychologist studying the user's model). The CLG Levels are arranged so that they map onto each other in sequence. Each Level describes a class of systems by abstraction, and each Level makes only limited assumptions about the system being described. This can be interpreted as a top-down design sequence; first the conceptual model is created, then a command language that implements that conceptual model is designed (and then a display layout is designed to support the command language).
The Task Level defines the goals of the system by enumerating a set of benchmark tasks against which to evaluate it. This Level only assumes the gross functionality of the system and a unit task structure of user behavior (see [4]).
The Semantic Level simply enumerates the concepts embodied in the system by mapping from the Task Level; there is no deep analysis of these concepts. There is no concern at this Level for the communication of concepts; concepts are defined abstractly (e.g., with no commitment as to how they are named or how they will be graphically represented).
The Syntactic Level is concerned with how the Semantic concepts can be packaged so that they can efficiently be communicated. This Level asserts that all communication can be structured by a few syntactic building blocks. One of the major deficiencies in CLG at present is the lack of generative mapping rules between the Semantic and Syntactic Levels. Such rules would pin down the space of possibilities in going from conceptual models to command languages.
The Interaction Level is concerned with the dynamics of interaction. This Level asserts that all interactions can be composed out of a few functional constituents (e.g., prompts, designations, terminations, etc.). These constituents are generated by mapping from the Syntactic elements. This Level (the most highly developed thus far in CLG) describes interactions by using patterned rules, which clearly capture the consistency (or inconsistency) of a system's interaction conventions.
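Since CLG's actual rule notation appears only in [3], the following is just a guess at the flavor of such patterned rules: each command maps to an ordered pattern of functional constituents, so the consistency of a system's interaction conventions can be checked mechanically. All command names and constituent names here are hypothetical.

```python
# Hypothetical sketch: each command's interaction is generated by a
# patterned rule listing its functional constituents in order.
# A system's conventions are consistent to the extent that commands
# with the same argument structure share the same pattern.

RULES = {
    # command -> ordered functional constituents of its interaction
    "delete": ["prompt", "designate-object", "terminate"],
    "move":   ["prompt", "designate-object", "designate-place", "terminate"],
    "copy":   ["prompt", "designate-object", "designate-place", "terminate"],
}

def consistent(cmd_a, cmd_b):
    """True when two commands follow the same interaction pattern."""
    return RULES[cmd_a] == RULES[cmd_b]

print(consistent("move", "copy"))    # same pattern
print(consistent("move", "delete"))  # different patterns
```

A check of this kind is one way a representation could surface an inconsistency (a nit, in the terms of the closing section) before a user ever encounters it.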
The Spatial Layout Level is concerned with exactly how the system's display will look at each point during the interactions. Note that, even though we may be representing a highly graphic system in CLG, the issue of the graphics per se does not show up until many Levels down. That is to say, the graphics is embedded in a conceptual and a communication context, and its function is to support those contexts. In this sense, there is little structural difference between graphics and "non-graphics" systems.
CLG can describe user interfaces. But we have here the potential for much more than this bland statement suggests. I believe that it is useful to look at CLG from at least three different views: the linguistic, the psychological, and the design view.
The fact that I propose three views of CLG does not reflect any ambivalence. Rather, I see CLG in all of these ways at once. The linguistic view is concerned with the constraints imposed on the interaction by the system; the psychological view is concerned with the constraints imposed by the user; and the design view is concerned with bringing the two together. The payoff for CLG is its usefulness in design, but the design view of CLG rests on the other two views: on the linguistic view for its generative capacity and on the psychological view for its psychological validity. If all of these views can be resolved into the same representational system, then the system designer can truly be working with the user's model.
I have introduced the notion of a design representation as one approach (out of many) to studying human-computer interaction and the design of user interfaces. I have proposed CLG as a specific candidate for such a representation. I am not promoting CLG per se, but rather the general notion of a comprehensive framework for bringing together a variety of user interface issues. Having CLG around as an example of such a representation helps clarify what we really want and don't want in one. Several features of CLG, such as the structuring into levels and the limited stock of command language building blocks, would seem to be desirable in any proposal. CLG is also severely (but not necessarily irreparably) deficient in several respects. For example, it is a long way from being able to describe a system to helping the designer make design decisions. Any proposal must undergo considerable empirical testing to see if it actually helps designers.
In closing, let me give an example of something that I would like a proposed framework to do. I have a collection of hundreds of nits about various systems in common use around Xerox. A nit is a specific complaint by a user about some particular interaction feature of a system. (Nits are familiar to any computer user, for example: "I can never remember the name of the X command"; "the X command always assumes the current line, which is often not what I want"; "the X button means Y sometimes and Z at other times"; "the system doesn't allow me to do task X"; "the X command, which is needed frequently, is too awkward to call"; "while trying to call the X command, it is too easy to slip and call the Y command"; "I can't tell what the system is doing during the X command"; "I can't understand the error message from the X command"; and on and on.) Each nit suggests one or more design principles that would alleviate the situation (and often conflicting principles that support the situation). The principles are easy to generate (setting aside the issue of whether each one is empirically correct or not). The problem is that we are overwhelmed with principles. What is needed is a framework for making sense of them, for telling when they apply, and for resolving the conflicts between them. CLG was created in response to this problem, but I have not yet applied CLG to the collection of nits and principles. It remains to be seen whether CLG will help. In any event, I suggest this as a good test for any proposed framework.
[1] S. K. Card, T. P. Moran, & A. Newell. Applied Information-Processing Psychology: The Human-Computer Interface. Book in preparation.
[2] S. K. Card & T. P. Moran. A Simple Model for Predicting User Efficiency with Interactive Systems. Xerox Palo Alto Research Center, Report SSL-78-4, 1978 (submitted to Communications of the ACM).
[3] T. P. Moran. Introduction to the Command Language Grammar: A Representation for the User Interface of Interactive Computer Systems. Xerox Palo Alto Research Center, Report SSL-78-3, 1978.
[4] S. K. Card, T. P. Moran, & A. Newell. The Manuscript Editing Task: A Routine Cognitive Skill. Xerox Palo Alto Research Center, Report SSL-76-8, 1976, (submitted to Cognitive Psychology).