Work reported herein has been jointly funded by the Office of Naval Research (Contract Number N00014-75-C-0460) and the Defense Advanced Research Projects Agency (Contract Numbers MDA903-77-C-0037 and MDA903-78-C-0039).
Traditionally the measures of computer performance have been MIPS and bytes. These measures are becoming increasingly irrelevant as the human interface lags further and further behind the marvelous developments of micro-electronics. Therefore, we must look toward different characterizations of computing, perhaps akin to megahertz: those which describe the machine's bandwidth to, and its interface with, the outside world, namely a specific user. The hypothesis continues to the conclusion, unpalatable to the human factors psychologist, that the final measure is totally subjective and idiosyncratic. That is to say: does it feel good?
How well a car holds the road, the taste of wine, and the feel of a keyboard are examples of sensory inputs for which we lack a descriptive system, except for one of mere approximation. One can describe a bottle of Beaujolais as dry and tannic, but the quality of the drink is a subtle and personal matter, to some degree locked inside us forever. This paper predicts a similar future for computers. They will become increasingly personalized and our characterizations of them will be done through metaphor.
Whether or not you (the reader) agree with this picture of the future, you will find rudiments of it in present day computer graphics. There is a general consensus in the computer graphics community that the future of computer graphics is in raster scan displays in general, and television in particular. The consensus rests on a number of good reasons, most of which stem from technological achievements attained through the incentives of a huge consumer market. What used to be an incestuous high technology of calligraphics is becoming a ubiquitous medium of video. This paper is not about the technological advantages of digital video. Instead, it is about the additional suggestions offered to us when we think of television as the future of computer graphics. The following section illustrates by example.
One of the most belabored problems of raster scan displays is that of antialiasing, more colloquially known as the jaggies or staircasing. Frequently, the problem is viewed from the perspective of smoothing a ragged line which lies within a small and discrete coordinate system, like 640 by 480 pixels. The result is a mosaic of roughly three hundred thousand pieces, many of which might scintillate. So, to avoid that, we limit ourselves to lines two pixels wide (so that one falls in each interlace field). Right? Wrong.
It is the wrong way to think about the problem. If we think in terms of television, we discover that this line of thought is narrow and misleading. Instead, almost at the other extreme, we can think of our raster scan display as a continuous medium. Note that a thin line (think of a black string) can be presented to a television camera and displayed without jaggies. More importantly, if this string is a vertical line on a 3 foot by 4 foot background, it can be moved a few thousandths of an inch and the movement noticed on the screen. In short, the proper perspective is the view from sampling theory: treat the image as an arbitrarily high resolution original, and the television as representing that original with a finite number of samples.
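To make the sampling view concrete, here is a small sketch in today's terms (the code, its resolution and its parameters are illustrative assumptions, not part of the argument). Each sample of a grey-scale scanline takes the fraction of its width covered by an ideal, arbitrarily positioned thin vertical line - the string - so that a move of a hundredth of a pixel is reflected in the sample values rather than lost to a staircase:

    def render_vertical_line(width_px, line_x, line_w=0.2):
        """Area-sample an ideal vertical line of width line_w (in pixel
        units), centred at the real-valued position line_x, into one
        grey-scale scanline of width_px samples."""
        scanline = []
        for px in range(width_px):
            # overlap of the pixel interval [px, px + 1) with the line
            left = max(px, line_x - line_w / 2.0)
            right = min(px + 1, line_x + line_w / 2.0)
            coverage = max(0.0, right - left)
            scanline.append(coverage / line_w)   # normalised intensity, 0 to 1
        return scanline

    # A sub-pixel move of the string changes the samples visibly:
    print(render_vertical_line(8, 4.00))   # ..., 0.5, 0.5, ...
    print(render_vertical_line(8, 4.01))   # ..., 0.45, 0.55, ...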
This is a very prosaic example of how thinking from the basis of television can be suggestive of a more cogent attitude. More poetic examples are the issues before this group. Four are offered: sound, color, spatiality and movies. They are discussed keeping in mind that television is primarily a unidirectional system, not particularly suggestive of the many input techniques of hands, eyes and body movements. For this reason, this paper addresses only half the story.
Cinema discovered the talkies almost at its onset, whereas computer graphics is still a silent movie. We tend to embroil ourselves in the data structures of presentation to the extreme of making data mode- and medium-specific. By contrast, one can consider the same thing presented in many different ways. Or, one can toy with the conjoint output of sound and picture, akin to lip-synch.
In thinking about sound, almost in the theatrical sense, we realize that it has many properties beyond the mere access to our auditory channel. In the same manner that an actor can go off stage while continuing a dialogue, we can view sound space as a much larger space than graphical space. Said another way, sound is not truncated by a bezel.
Consider the very specific example of a sound cursor. With quadraphonic sound, a cursor can be placed in X, Y, even Z, and embellished with a doppler effect. How many times have you watched a cursor go off screen and not known whether it is looming just beyond the edge or off in hyperspace?
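A toy sketch of how such a cursor might be voiced (the speaker layout, the inverse-distance law and the doppler formula here are illustrative assumptions, nothing more):

    import math

    def sound_cursor_gains(x, y, speakers):
        """Toy quadraphonic panner: each speaker's gain falls off with its
        distance from the cursor's (possibly off-screen) position."""
        gains = [1.0 / (1.0 + math.hypot(x - sx, y - sy)) for sx, sy in speakers]
        total = sum(gains)
        return [g / total for g in gains]          # normalised to sum to 1

    def doppler_factor(radial_velocity, speed_of_sound=343.0):
        """Pitch ratio for a cursor moving toward (+) or away from (-) the
        listener, radial_velocity in metres per second."""
        return speed_of_sound / (speed_of_sound - radial_velocity)

    corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]   # a unit "room"
    print(sound_cursor_gains(1.3, 0.5, corners))   # off screen, to the right
    print(doppler_factor(10.0))                    # approaching: pitch rises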
There is also the cocktail party effect. This is the idea of multiple sounds, on and off screen, which can be considered as parallel conversations through which a user can roam. One can draw upon the well known phenomenon of overhearing what one wants to overhear (the proverbial example: one's own name).
Finally, sound can be viewed not only as a datatype, but as a navigational aid as well. Many graphical applications today offer the facility for panning and zooming, with very few navigational aids. One can imagine data calling "over here, to your right, a little further . . ." or simply the existence of telling sounds in the distance, like hearing a babbling brook.
These examples are given in the spirit of exemplars. The point is that a topic like computer graphics assumes very different embodiments when considered in a broader perspective. In large measure, this breadth comes from the roots of television, itself a sound-synched display medium which we encounter in a large number of activities, frequently at home. In contrast, the radar screen and the storage tube offer different suggestions, noticeably less rich.
Much fuss is made about when to offer color. Sometimes the tempest comes from an information theoretical perspective, led by issues of discrimination and clutter. More often, color displays are treated from the cost effectiveness point of view, penny wise and pound foolish.
The economics of color involve a distinct trade-off between device and material costs. In photography, for example, the device is extremely inexpensive, but the materials and processing have a great cost. On the other hand, xerographic machinery is rather expensive, but the material cost is almost nothing. The latter economy is new in color and the former pervades our thinking. The major exception is television!
Many people are astonished that videotape recorders do not care whether the signal is black and white or color. In fact, black and white can be seen as one kind of computational transformation. Black and white photographers perform this same transformation when looking through their viewfinders. The amateur is frequently disappointed by the contrast ratios which remain when the chrominance components are removed.
The reverse transformation is done by the television camera operator, who is given (today) a small black and white monitor through which to frame and focus. Students of video are frequently surprised by their first encounter with videotaping in color, inasmuch as the camera's viewfinder offers a black and white image, already transformed. All of this suggests that a healthy view of image-making would come from a full-color base, with transformations performed for whatever purpose - black and white being only one such transformation.
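In computational terms the transformation is nearly trivial. A sketch, using the standard NTSC luminance weights by way of illustration (the sample colors are invented):

    def to_black_and_white(rgb_pixels):
        """Drop the chrominance components, keeping only luminance,
        using the NTSC weighting of red, green and blue."""
        return [0.299 * r + 0.587 * g + 0.114 * b for r, g, b in rgb_pixels]

    # Two very different hues of similar luminance collapse to nearly the
    # same grey - the vanishing contrast that disappoints the amateur.
    print(to_black_and_white([(1.0, 0.2, 0.2), (0.2, 0.5, 0.7)]))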
Over time we will resort to thinking in color, to writing in color and generally to working in color. We are given colored pencils and crayons as children and then, in a very real sense, they are taken away from us as adult professionals . . . how do you copy the results? It is not just a plethora of Xerox 6500's that will change this state, but a move to softcopy. Management information systems will probably be at the vanguard, with the return of the red number and the disappearance of the parenthesis. Then, slowly, computer aided instruction will see a growth of animation techniques, and the line between computer graphics and television will become even thinner.
By spatiality we mean two things: the engagement of peripheral vision in large format displays and the employment of the sense of place in interaction. Both have radical implications for the way we think about displays. For example, previously we could assume that if a user were looking at a display, he or she was looking at all of it. Similarly, we have been quite content to think of all query languages as symbolic manipulations of one sort or another, with a lexicon of commands, frequently set theoretic.
There are three distinct developments in television technology: largeness, flatness and high resolution. They are underscored as separate technological steps because all too frequently they are lumped into one (this is especially true for largeness and flatness). Whereas each has particular advantages, to be discussed later, the primary interest of this subsection is largeness.
The Spatial Data Management System illustrated in the color photo uses a 6 foot by 8 foot rear-projected television as the main stage of events. (The other display is a navigational aid.) In today's terms, such a television image is unique, though it will become increasingly commonplace. At that time, it will have a number of implications which today's thinking ought to include. Consider the added problems that arise when resolution is sufficient for the user to approach the display and view it from so short a distance that the majority of the image lies outside the field of vision.
Eye and body tracking will play significant roles in this brand of man-machine interaction. This will be true from the point of view of gaining and knowing about the user's attention, as well as of astutely applying bandwidth at the user's focus. The assumptions are that no matter how large the display, the data will be larger and no matter how much bandwidth is available the demand will exceed it.
In turn, the spatiality of interaction is drawn less from television and more from the traditional media of books, pencils and paper, and from the organizational structures of desk tops, bulletin boards, and the like. Digital video offers a surrogate, like a viewfinder into a mythical land of data. The notion is culled from a number of observations that suggest that humans organize and retrieve data successfully in terms of its location, especially if they put it there themselves. A personal library is an excellent example. We tend to know where our books are, not by following an alphabetical order, but by remembering their position on the shelf. Even if that position is not known absolutely, it can be managed relatively (e.g., beside the large red book).
What is being called the metaphysics of television makes a 2½ dimensional approach to the problem the most feasible. At first this was viewed as a compromise from flight simulator graphics, likening the data management problem to the design of, and roaming through, a well known city. Upon closer inspection of the problem, the paradox is that the increase in navigational aids required to peruse 3-D space leads to the symbolic trap, akin to instrument flying, happily done in the fog.
Hitherto, computer graphics has stood alone, with very little momentum and breadth. But increasingly, three distinct, previously unrelated graphics communities are beginning to merge into one, the result of which will have enormous impact on computer graphics. Image processing, broadcast, and computer graphics have been separate disciplines with separate publications, separate conferences and a separate cast of characters. The confluence of these disciplines offers a potential impact that none could have generated alone. The idea of an interactive movie is a specific example, especially when viewed in the broadest sense, as an interactive graphic.
In the late 60's and early 70's a large amount of research was devoted to realism, albeit simulated, in computer graphics. The impressive results were at the expense of interactive techniques inasmuch as the computational overhead precluded them. More recent work has focused on the topic of interaction, with the tacit assumption of abstractions, usually simple line drawings, without photographic presence.
At this writing we are at a major turning point, to be caused by the optical videodisc as a pivotal element in our configurations of the future. What previously had to be compartmentalized as graphics or photography can be merged in secondary storage as well as within the gestalt of television. If we consider the specific example of mapping, it is alarming how separate the two modes of representation, maps and photographs, have remained. And movies, as a bona fide data type, are almost without example, considered by most to be outside the scope of the man-machine scenario.
The random access of large amounts of pictorial data holds so many unknown promises that speculation will be bland compared to eventuality. What is important to the future of interactive graphics is the recognition of real, not simulated, pictorial elements, described with some degree of semantics. In the case of currently available videodiscs, 54,000 individual frames (when not planned as a movie) are far too many for a user to interact with except through the intermediary of a computer which, in turn, knows what is on the disc.
In the case of movies, the entire history of film-making is subject to change, because the user or the computer becomes editor. A filmic cookbook offers a good example. A half hour of movie, broken into five-second snippets, provides 360 scenes. Each scene could be a discrete cooking event: pouring milk, cracking an egg, cutting an onion. Connected to a computer, these snippets could be assembled in many different orders for a large number of recipes, interspersed with animation.
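(The arithmetic is consistent with the disc itself: 360 snippets of 150 frames each, at 30 frames per second, account for exactly the 54,000 frames of a half-hour side.) A hypothetical sketch of such a filmic cookbook, with invented snippet indices:

    FRAMES_PER_SNIPPET = 150            # five seconds at 30 frames per second

    snippets = {                        # snippet index on the disc (invented)
        "crack an egg":  3,
        "pour milk":    17,
        "cut an onion": 42,
        "whisk batter": 88,
    }

    def play_recipe(steps):
        """Turn a list of cooking events into the frame ranges a videodisc
        player would be told to seek to, in order."""
        for step in steps:
            first = snippets[step] * FRAMES_PER_SNIPPET
            print(f"{step}: frames {first}-{first + FRAMES_PER_SNIPPET - 1}")

    play_recipe(["crack an egg", "pour milk", "whisk batter"])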
Similarly, in a maintenance and repair application, movies could be played forward and backward, with different degrees of elaboration, depending upon how much the user knew about such and such. Combined with realtime animation, one begins to see a future of interactive graphics filled with a much broader interpretation of image making. In large measure it is the demise of calligraphics.
Some terminals of the future will be all-knowing rooms without walls. Others will be flat, thin, flexible, touch sensitive displays. And others will be wrist watches and cuff links, with the right hand talking to the left by satellite. In all cases, these will be accompanied by input technologies for voice, hand and eye. In most cases, the future is driven by television, the most important next step of which is the solid state display.
A number of technologies are vying for the lead position. Which one assumes the lead is, to some degree, unimportant. What is important to the future of thinking about these terminals is the element of portability, the notion of size and the simple feature of flatness.
Whether you wear it or carry it, the portable terminal opens the option to crawl into bed or seek out a quiet spot with an intellectual engine. At present people really do not use computers to help them think; in fact, most computers preclude it, frequently for simple reasons: they make too much noise, they are in the wrong place, or they are not with you when you need them. It will be interesting to observe how much more demanding people will become about the interface when they find themselves using computers in situations that are not already stressful.
Size is more than just quantitative extent. It is more than displaying the Bible on a button or a weather map on a wall. There are qualitative aspects of screen size and theme size, about which we know very little. Just as Star Wars would have little presence on a five-inch portable TV, Upstairs, Downstairs would leave the silver screen empty. The idea of a display which knows its own size is beyond our current thinking, but so obviously important. Amongst other things, the idea of a coordinate system is to be replaced by a yet-uninvented semantic scale.
Finally, one of the simplest features, namely that of flatness, goes unacclaimed in our current systems. The need for parallax-free interaction with dynamic graphics has been answered with such clumsy solutions that users have not had the occasion to consider the obvious but more subtle detail of drawing on a flat surface. In the example on the next page, the impetus came from the desire to see through one's hand so as not to occlude any portion of the display. The major result, however, was that flatness alone offered the greatest increment of usability in the mapping application.
Without comment on the fantastic potentials for new input techniques, and without comment on the artificial intelligence that will personalize the interface, it would be too easy to conclude that this short paper is about innovations in semiconductor materials, better and bigger or smaller. New gadgets will have their impact. More important is the way we go about studying these innovations, using them and encouraging them. Most importantly, we must come out of the self-serving shell of conservative techniques, belabored by and justified with the quantitative methods that make bureaucrats comfortable.
We are beginning to see the lifting of a professional recalcitrance, characterized by a preoccupation with the dot and the line. The point is to encourage this change with variegated perspectives, in the service of big, not just noticeable, differences. Some of these differences will happen almost automatically: prices will drop and the user community will skyrocket. Others will come from attitudes of both/and (versus either/or). But the most creative thoughts will come from a kind of magnanimity which at once can draw upon a technical breadth and can ask without embarrassment the very simple question - does it feel good?