11. Partitioning of Function in Window Systems

James Gosling

11.1 INTRODUCTION

What is a window system? It is a manager for graphical display devices that divides the display surface into several windows, together with a graphics library that understands them. In a multiprocess environment each window may be supported by a separate process. This paper examines the architectural issues involved in constructing a window system in such an environment. Emphasis is placed on the communication and synchronization problems that arise.

Client processes, running in parallel, invoke the window system to manipulate windows and to perform graphic operations in them. A window system is faced with several client programs which all need access to windows. One of its hardest tasks is to coordinate client access to windows, while at the same time providing maximum performance and avoiding synchronization problems.

As an example of the synchronization problems that can occur, consider moving a window. The user sitting at the workstation issues a command that causes a window to move. This is happening in parallel with the client programs which are all madly drawing in their windows. A process whose window becomes partially obscured can find itself halfway through drawing a line with a window that is no longer the shape it thought it was. The drawing of the line and the moving of the window must be synchronized to avoid collisions of this type.
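The collision described above can be reproduced in a small, deterministic sketch. Everything here is a hypothetical model rather than any real window system: the screen is a dictionary of pixels, `Rect` and `draw_hline` are illustrative names, and the "move" is simulated by another window claiming half of the line's pixels between the two halves of the draw.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int

    def contains(self, px: int, py: int) -> bool:
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


def draw_hline(screen, clip, y, x0, x1, owner):
    """Paint the pixels of a horizontal line that fall inside `clip`."""
    for x in range(x0, x1):
        if clip.contains(x, y):
            screen[(x, y)] = owner


# The client's window, as the client last saw it (hypothetical layout).
stale_shape = Rect(0, 0, 10, 10)
screen = {}

# The client paints the first half of a line across its window.
draw_hline(screen, stale_shape, y=5, x0=0, x1=5, owner="client")

# Meanwhile the user moves another window over the right half,
# and that window paints the region it now owns.
other_window = Rect(5, 0, 10, 10)
for x in range(5, 10):
    screen[(x, 5)] = "other"

# The client finishes the line using the shape it computed earlier.
draw_hline(screen, stale_shape, y=5, x0=5, x1=10, owner="client")

# Pixels inside the other window that no longer hold its contents.
damaged = sorted(p for p, who in screen.items()
                 if other_window.contains(*p) and who != "other")
print(len(damaged), "pixels of the other window were overwritten")
```

Because the client never re-examined its window shape between the two halves of the line, every pixel it drew after the move landed in the other window. Preventing this is exactly the synchronization that the rest of the paper is concerned with.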

Because of its existence in a multiprocess environment the window system must be partitioned. The various bodies of code that make it up must be placed somewhere. Partitioning has a strong effect on synchronization and performance.

There are three basic ways in which the function of a window system can be partitioned:

it can be in the same address space as the client process, replicated for each client;
it can be outside the address space of the client process, in the kernel of the operating system;
it can be outside the address space of the client process, in a server process that is not a part of the kernel.

The distinctions are rather fuzzy since the partitioning usually ends up being a mixture of these three: some part is in the kernel, some part is in each client process, and some part is often in some other server process. The question is really one of emphasis. The choice amongst these is based on the usual considerations of functionality, performance, flexibility and ease of programming.

11.2 WHO PAINTS THE BITS?

One way of looking at the partitioning of a window system is by asking the question "Who paints the bits?"

11.2.1 The Kernel

Many operating systems force at least the graphics primitives to be in the kernel since client processes are not able to access devices directly. In operating systems where client processes can access devices directly, like 4.2 BSD Unix systems, the kernel can be completely separated from graphics operations.

Nonetheless, the kernel is often a convenient place to put device support. It also provides a centralized synchronization point. But it is susceptible to all the usual problems of placing anything into the kernel: the debugging tools are usually wretched at best, any bug that does escape detection threatens the integrity of the entire system, development cannot proceed in parallel with other uses of the system, and it usually commits a large body of code to wired-down memory.

What often happens in such systems is that only those functions which must be done by the kernel are done by the kernel. Graphical operations and synchronization are usually in the kernel, while window management is not. There are special kernel calls that only the window system uses that manipulate the kernel's data structures. These data structures contain such things as the window boundaries that the kernel must know in order to perform the graphic operations. This organization is illustrated in Figure 11.1.
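As a rough illustration of this split, the sketch below models the kernel side as a table of window boundaries plus a graphics call that clips against it. The class and method names (`KernelGraphics`, `set_boundary`, `fill_rect`) are invented for the example and do not correspond to any real kernel interface. For brevity it treats each window as a single rectangle and ignores windows that obscure one another.

```python
def intersect(a, b):
    """Intersect two (x0, y0, x1, y1) rectangles; None if they do not overlap."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    if x0 >= x1 or y0 >= y1:
        return None
    return (x0, y0, x1, y1)


class KernelGraphics:
    """Hypothetical stand-in for the kernel side: it owns the boundary table."""

    def __init__(self):
        self.boundaries = {}  # window id -> (x0, y0, x1, y1) in screen coordinates
        self.painted = []     # record of the rectangles actually painted

    def set_boundary(self, window_id, rect):
        # The special call that only the window system is meant to use.
        self.boundaries[window_id] = rect

    def fill_rect(self, window_id, rect):
        # The graphics call any client may use; output is clipped to the table.
        clipped = intersect(rect, self.boundaries[window_id])
        if clipped is not None:
            self.painted.append((window_id, clipped))
        return clipped


kernel = KernelGraphics()
kernel.set_boundary(1, (0, 0, 100, 50))             # done by the window manager
inside = kernel.fill_rect(1, (80, 40, 200, 200))    # partly outside the window
outside = kernel.fill_rect(1, (150, 0, 160, 10))    # entirely outside
print(inside, outside)
```

The window manager decides where windows go, but only the boundary table crosses into the kernel. The graphics call needs nothing else to keep a client's output inside its window.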

Figure 11.1

11.2.2 The Client

On systems where client processes can directly access the display hardware, one very seductive organization is to place all of the graphics code in each client process. Implementing this involves duplicating much of the code of the window system in each process. This gives clients the most direct possible access to the display. Not even a kernel call is required. The usual motivation for doing this is to provide the maximum possible performance. This scheme is illustrated in Figure 11.2.

Figure 11.2

Often the expected performance does not appear. There are several possible reasons:

Synchronization: the clients must synchronize amongst themselves for access to the display hardware. It is usually impossible for two processes to be accessing the device registers in parallel. For example, before drawing a line a client must make sure that no other client is drawing lines and only when the hardware is idle may it finally draw the line. This checking and locking can be very expensive, sometimes as expensive as a kernel call, and can overshadow the expense of drawing the line. This expense can be reduced by making the granularity of locking coarser: instead of locking on every line, lock before drawing a group of lines, then unlock afterwards. This has the disadvantages of being error-prone and of exposing locking concerns to higher levels of the system.

Some displays, however, do have special features that allow multiple processes to access them in parallel. One way of doing this is by having the hardware support multiple request queues and assigning one queue to each client.

Synchronization in the form of agreement about window layout also needs to be dealt with. When a client is drawing a line, other clients must be prevented from moving their windows and affecting other clients' clipping. This also implies that clients cannot precompute clipping information outside of the locked regions of code.
Paging: if the hardware provides little support for graphics operations, as many simple frame buffers do, then the graphics library that is replicated in each client can become large. If there are many different display devices and operations to be supported then the amount of replicated code can become enormous. In systems with virtual memory, this can cause substantial paging delays.
Attempts to keep code small: if a large amount of code is being replicated and its sheer bulk is causing problems, attempting to keep it small appears attractive. But this often involves exploiting fewer special cases and avoiding other optimizations that make operations faster but larger.
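The trade-off in lock granularity mentioned under synchronization can be made concrete with a small sketch. It uses `threading.Lock` as a stand-in for whatever cross-process lock a real system would use, and `CountingLock`, `draw_fine` and `draw_coarse` are hypothetical names introduced only for this illustration.

```python
import threading


class CountingLock:
    """A lock that counts how often it is acquired (for illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1
        return self

    def __exit__(self, *exc):
        self._lock.release()
        return False


def draw_line(framebuffer, line):
    framebuffer.append(line)  # stand-in for touching the device registers


def draw_fine(lock, framebuffer, lines):
    # Lock around every line: safe, but the locking cost is paid per line.
    for line in lines:
        with lock:
            draw_line(framebuffer, line)


def draw_coarse(lock, framebuffer, lines):
    # Lock around the whole group: one acquisition, but the caller now
    # has to know about locking and holds the display for longer.
    with lock:
        for line in lines:
            draw_line(framebuffer, line)


lines = [((0, i), (10, i)) for i in range(100)]
fine_lock, coarse_lock = CountingLock(), CountingLock()
fine_fb, coarse_fb = [], []
draw_fine(fine_lock, fine_fb, lines)
draw_coarse(coarse_lock, coarse_fb, lines)
print(fine_lock.acquisitions, coarse_lock.acquisitions)
```

Both versions paint the same lines. The coarse version pays the locking cost once instead of a hundred times, but the lock has moved out of the drawing primitive and into its caller, which is where the error-proneness comes from.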

Putting such a large body of code into each client process also introduces logistical problems. It becomes much harder to make changes since every client must be relinked to access the new routines. This has a strong impact on bug-fixing since it takes quite a bit of effort to propagate the fixes. It also hurts device independence since each new display device forces a complete relink. New display devices also increase code size which has interactions with paging behaviour and hence performance.

Such an approach also requires either that client programs are well-behaved or that the hardware is sophisticated enough to cope with those that are not. For example, if the hardware doesn't provide clipping or if clients are able to change their clipping boundaries, then a client which runs amok can destroy the image on the entire display, not just within its window.

11.2.3 A Separate Process

Parts of the window system may also be placed in a user level server process that is not a part of either the kernel or the client. Operations are performed by sending messages to the server. Often, at least the window management functions are done this way, and the graphics library may be done the same way too. This technique is dependent on the existence of an interprocess communication (IPC) mechanism. When IPC is integrated with networking, as it is in 4.2 BSD Unix systems, it allows the window manager and its clients to be distributed across many machines. This scheme is illustrated in Figure 11.3.

Figure 11.3

Under this scheme, all of the graphics and window management code is placed into one process. The window layout database, clipping regions and all other relevant information are centralized. It solves most of the problems of the other organizations:

The synchronization issue is solved by sidestepping: the window system has only one thread of control and complete access to all information. Synchronization occurs by the act of serializing the messages coming into the process.
As a user level process it is much easier to develop and maintain. It doesn't threaten the integrity of the whole system.
It provides a good basis for device independence. To achieve independence one must be very careful in specifying the semantics of the messages that get exchanged.
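The first point, synchronization by serializing messages, can be sketched with threads standing in for processes and a `queue.Queue` standing in for the IPC channel. The message tuples (`"line"`, `"move"`) and the function names are assumptions made for this example, not part of any real protocol.

```python
import queue
import threading


def server(requests, display):
    """Single thread of control: apply requests strictly in arrival order."""
    windows = {}  # window id -> origin; only the server ever touches this
    while True:
        msg = requests.get()
        if msg is None:  # hypothetical shutdown message
            break
        op, window_id, arg = msg
        if op == "move":
            windows[window_id] = arg
        elif op == "line":
            origin = windows.get(window_id, (0, 0))
            display.append((window_id, origin, arg))


def client(requests, window_id, count):
    for i in range(count):
        requests.put(("line", window_id, i))


requests = queue.Queue()
display = []
server_thread = threading.Thread(target=server, args=(requests, display))
server_thread.start()

clients = [threading.Thread(target=client, args=(requests, wid, 50))
           for wid in (1, 2, 3)]
for t in clients:
    t.start()
requests.put(("move", 1, (40, 40)))  # the user moves window 1 mid-stream
for t in clients:
    t.join()

requests.put(None)
server_thread.join()
print(len(display), "lines drawn")
```

No lock appears anywhere in the server. A move is just another message in the queue, so each line is drawn entirely before or entirely after it, and the layout table is never seen in a half-updated state.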

Passing messages can substantially increase the overhead of performing operations. There are a number of techniques for reducing the cost:

Make sure that the basic IPC mechanism is fast. Shared memory in non-networking environments works well.
Batch requests. If a request to the window system doesn't require a reply (like draw a line) then it can be batched with following requests into a single message. With a large enough message size and a protocol specification that requires few replies, the per-message setup cost can almost be made to vanish.
It is important to keep low the ratio of bits passed in messages to bits altered on the screen, since this reduces the cost per pixel of message passing. A very good way of doing this is to design a protocol which deals at a fairly high level of abstraction. Consider the opposite extreme: it is perfectly possible to design a protocol that allows only bitmaps to be sent to the window system from the client. Then, when a client wants to draw a text string, all the bits for all the characters must be sent. In this case, the window manager has a complete, simple model, but it will have poor performance. On the other hand, if the protocol includes notions like font and string, then text can be shipped down in a very compact form. The same goes for circles, arcs, filled polygons and many other operations: the more that the window system understands at an abstract level, the better the performance of message passing will be.
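The last two techniques lend themselves to a little arithmetic. The sketch below is purely illustrative: the 9-byte line request, the 1024-byte message limit, the 8 by 16 pixel glyphs at one bit per pixel and the 8-byte string header are all assumed figures, and `BatchingConnection` is a hypothetical client-side buffer rather than an interface from any real window system.

```python
import struct

# Hypothetical wire format for "draw a line": opcode plus two endpoints.
LINE = struct.Struct("<BHHHH")


class BatchingConnection:
    """Client-side buffer that packs reply-free requests into large messages."""

    def __init__(self, max_bytes=1024):
        self.max_bytes = max_bytes
        self.buffer = bytearray()
        self.messages_sent = 0
        self.bytes_sent = 0

    def draw_line(self, x0, y0, x1, y1):
        # No reply is needed, so the request is queued rather than sent.
        if len(self.buffer) + LINE.size > self.max_bytes:
            self.flush()
        self.buffer += LINE.pack(1, x0, y0, x1, y1)

    def flush(self):
        # One message, and one per-message setup cost, for the whole batch.
        if self.buffer:
            self.messages_sent += 1
            self.bytes_sent += len(self.buffer)
            self.buffer.clear()


def text_as_bitmap_bytes(text, glyph_w=8, glyph_h=16):
    # Bitmap-only protocol: every pixel of every character is shipped.
    return len(text) * glyph_w * glyph_h // 8


def text_as_string_bytes(text, header=8):
    # Protocol that knows about fonts and strings: a small header plus one
    # byte per character (assumed header: opcode, font id, position).
    return header + len(text.encode("ascii"))


conn = BatchingConnection(max_bytes=1024)
for i in range(500):
    conn.draw_line(0, i, 100, i)
conn.flush()
print(conn.messages_sent, "messages for 500 lines")

label = "Hello, window"
print(text_as_bitmap_bytes(label), "bytes as bitmap,",
      text_as_string_bytes(label), "bytes as string")
```

Under these assumptions 500 line requests travel in 5 messages instead of 500, and a 13-character label costs 208 bytes as a bitmap against 21 bytes as a string. The exact figures depend entirely on the assumed formats, but the direction of the effect is the point being made above.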

11.3 CONCLUSIONS

Window systems can be partitioned by placing the bulk of their code in the kernel, in each client process, or in some separate server process. It is important to observe that for all three partitioning schemes some information must be passed. These are processes that are trying to cooperate, not blindly ignore each other. This cooperation has a price, and that price is the passing around of information. Designers of window systems in which each client independently performs graphics operations are often seduced by promises of performance, but frequently fail to achieve it because of unexpected synchronization problems. The centralized server technique solves most of these problems at the cost of added message passing, but this cost can be substantially reduced.