USA Visit, August 1978 by Bob Hopgood

1. Introduction
2. IMLAC
3. Honeywell
4. Parallel Processing Conference
5. Computer Image
6. Salt Lake City
7. Livermore - Graphics
8. Xerox PARC
9. Livermore - Mainframes

1. Introduction

The purpose of the trip was mainly to visit some establishments involved in research which was relevant to the Distributed Computing Programme of SRC. However, as Len Ford was also visiting SIGGRAPH, it was decided that we would spend the first two days in Boston visiting PRIME and IMLAC. The visit to PRIME was well worthwhile and we were able to see the great expansion that has taken place over the last year and also were given, in confidence, a great deal of information concerning future plans. This is written up in a separate paper which has limited distribution.

2. IMLAC

Unlike PRIME, there was very little difference in the IMLAC site from last year. The extension being made to the assembly area had been completed but that is all. It is still a very small company with a turnover of a few million dollars, selling about 100 to 200 systems a year. We only spent half a day there and concentrated on the GINO/DINO issues associated with the 3205 display (the replacement for the PDS4).

David Luther had joined the company as head of marketing. He has a long experience with graphics in the US Forces environment and was a large user of IMLAC equipment. He seems to have been the driving force behind the efforts being made by IMLAC to rationalise their products. He was most disturbed by the vast number of special systems being marketed by the company. This meant that it was difficult to make systems ahead of orders. It was, therefore, difficult to purchase in bulk and there was a long time period between order and availability (at least 120 days). The aim has, therefore, been to rationalise the products. Most of the add-on peripherals will no longer be available (discs etc). The systems will look more like terminals and there will be less variety.

The company is interested in expansion, especially as the low cost of the PDS4 replacement (3205) will mean the need for higher volume production to achieve the same profits. There is a strong possibility that the company will be purchased by Hazeltine in the near future. Both companies are interested in the graphics area and their products fit quite well together (Hazeltine concentrate mainly on VDUs and raster displays). It is likely that this will have a significant impact on the way the company is run. It will give them the capital necessary to expand and the management expertise involved with high-volume production. The price of the 3205 is competitive but it is likely to be re-engineered to reduce the size of the processors in the display and generally improve the cosmetics of the product.

The main problem expected was that the company had decided to keep the 3205 program in PROM and to offer only one possibility - a DYNAGRAPHIC program. Consequently, GINO could only be run if DINO was provided as an alternative PROM or the GINO backend could be matched to the DYNAGRAPHICS system. The documentation we had obtained in the UK was rather poor concerning DYNAGRAPHICS and it was unclear whether such a match was possible.

We managed to obtain a full description of the DYNAGRAPHIC system (Len has two manuals) and most of the major problems look as though they can be resolved. For example, dragging was supported, segment visibility could be changed easily from the host, anchors did exist for segments etc. There were still a number of areas where the mismatch is significant (environment, renaming etc) so that it is not a sensible way to go. However, the DYNAGRAPHIC system looks quite good and it is clear that a large amount of software will be developed for it in the USA and this software will be in the engineering area and available on PRIME computers. Len should be able to get additional information at SIGGRAPH.

There were a number of changes to the 3205 from the details that we had been provided with in the UK. The major one was that the program was not being executed out of PROM. There will be a 3-position switch on the keyboard which will allow you to boot one of three programs from PROM to main memory where it will be executed. The first program is a machine EXERCISER to give you confidence that the system is working. The second is DYNAGRAPHICS and the third is available for, most likely, the Tektronix emulator.

After much discussion, it became clear that IMLAC would have no objections to a user mounting his own program in the third position - the only condition is that all three programs are less than 16K in total. On the question of engineer's tests, some of these will be booted down from the main machine and, in fact, DYNAGRAPHICS, on start up, looks like a teletype and the modification necessary to have a particular control key cause a down-line load appears to be quite simple. It was agreed that IMLAC would make the necessary changes to allow down-line loading and the decision would be left until later as to whether it is necessary to burn DINO into a PROM. We stressed that the initial terminal input for the down-line load must be echoed on the display.

Speeds available for communication are 110, 300, 1200, 2400, 4800, 9600 and two positions on the switch will be unmarked in the USA but will provide 75/1200 and 19200. There is currently no support for a tablet although they are looking at an 11 in square Summagraphics. Either a lightpen or joystick is available but not both. There are 16 buttons on the keyboard of which 8 light up. It would be possible to provide an orange phosphor although they would be reluctant. IMLAC have agreed to provide us with a quotation committing them to the down-line loading and offering both phosphors as alternatives.

The engineer will have a 2400 baud cassette that can be plugged in for testing. The table has been changed and is now 3ft square. This gives more room at the front. The wood edging has gone. The electronics cabinet for the processor is still large. The display looks a lot cleaner than the PDS4. The tube appears to be quite a bit flatter. They have finally put a lightpen holder on the side.

The meeting turned out to be a lot better than expected and there seems no reason for not purchasing at least one system for evaluation. Delivery could be as early as November but it depends quite crucially on the number of orders coming in.

One point of interest was that we saw a new magazine there called Computer Graphics, published by Cygnus Communications Inc. It had an article on how to use the AP-120B for graphics.

3. Honeywell

The plane I had been booked on from Boston to Minneapolis had been on strike for 180 days and would not be resuming until October. I finally got there via Kansas City after getting up at about 05.30 in the morning. The Embassy had booked me into a hotel about as far away as you can get from Honeywell. I had to take a limousine downtown and a cab to get there. The cab driver lost his way! As a result of all this, I arrived there about 20 minutes late. I had agreed to give an informal presentation of the SRC Distributed Computing Programme to Doug Jensen and one or two of his colleagues. When I arrived, there was a complete lecture room full of people from Univac and Honeywell expecting me to give a one hour plus presentation. Adding as many stories of SRC politics as I could remember, I managed to make it last until about 11.30.

Colin Whitby-Strevens and David May from Warwick had both been to Honeywell this summer so that they already had some idea of the SRC programme. The main reaction I had was that they were surprised at the amount of interest in Data Flow. I also found out that the Green Ironman proposal had been led from Honeywell Systems and Research Centre and so they were very familiar with that. Incidentally, the place is named S&RC, which was confusing at times.

The main distributed computing work at Honeywell is the HXDP multi-processor system designed for real-time control applications. Probably the most up-to-date reference is The Honeywell Experimental Distributed Processor - An Overview in IEEE Computer, January 1978. Two systems have been built, a 2-processor and a 3-processor. The 2-processor system has been moved to San Diego to be used in a Naval application while the 3-processor system is still at Honeywell. Each processor is currently about half a rack of equipment. The computers are standard off-the-shelf 16-bit machines but the Bus Interface Unit is their own design and is the major part of the electronics. The system has a number of probes which inject errors into the system to check reliability etc. Apparently, the system is too reliable and they get insufficient errors just waiting for them.

The 3-processor system is to be shipped to San Diego in October or November to make a 5-processor system there. One member of the staff will go with it to carry on experimentation. The intention is then to build a new system. The design will be completed by the end of 1979. Instead of a single bus there will be 4 for reliability and bandwidth. The buses will be fibre optics running at 10 Mbits/sec. Although the applications are unlikely to require that bandwidth, the additional bandwidth can be used for systems communication. They felt that it was wrong to design a system in which limitations in bandwidth put constraints on process position.

The Bus Interface Unit (BIU) will be more complex and is likely to have the ability to broadcast messages although they are still unclear as to whether this should be a primitive function at the lowest level. They do see a requirement for a message passing primitive in which the receiver requests the sender to send a message. They firmly believe that buffering is an essential feature of a distributed system down at the primitive level and were not happy with the Hoare approach. The BIU is likely to have about 500 ICs and be about 4 times the complexity of the existing one. They would hope to have FTP supported by the BIU and it would be a great deal more soft than the current BIU. They would hope to have a 5-processor system running by 1980. The Executive Kernel for the operating system will reside in the BIU rather like the PPUs on the CDC 6600.

One other machine under production is a 5-processor simplified version of the HXDP which is being put together for another Naval application. This system will have about 50-80 processes. Although they would eventually aim for systems with one process per processor, they still expect in the next few years to have many processes per processor and their original design which only allowed 8 processes per processor was a mistake. They still hope that no two processes which conflict will be allocated to the same processor.

S&RC get a great deal of their funding from outside Honeywell which allows them to be independent of any demands from Honeywell. US Government funding falls into a number of classes:

6.1 Pure Research - money given with no strings
6.2 Applied Research - the area of research is identified
6.3 Advanced Development - working towards an end product, a result is expected
6.5 Pre-production

S&RC gets 75% of their funding from the government. Most of it is 6.2 money with some 6.1. (Places like CMU manage to get large 6.1 grants - several million dollars per year.)

Doug Jensen is currently in the process of taking up a 50% appointment at CMU and will be commuting with a large proportion of the 50% in one term and odd weeks through the rest of the year. He has a great deal more sympathy for CM* than HYDRA or CMMP. He feels the distribution of operating systems functions throughout the system is important.

A brief history of the distributed computing applications over the last 12 years at S&RC is:

PEPE System: designed at Bell Labs but built at Honeywell
All Applications Digital Microprocessor: a large scale government-funded 6.2 project with a great number of companies involved. It was supposed to be a CMM.P machine (but several years earlier). It was supposed to run at about 6 MIPs with an APL directly executable order code. The LSI technology was custom made for the project. The whole system was controlled by Ron Etna of the Naval Ocean Systems Command. The processors were designed by Raytheon. It had shared program and data with private cache memories for each processor. The machine was a true 32-bit machine with 32-bit wide buses. Honeywell was awarded the contract for the Interconnection hardware and the Executive. The first was a good product but for various reasons they were not happy with the second (the work was done by a newcomer to operating systems design who reinvented algorithms which had been published and were known not to work). The I/O Processor was done by IBM. The assessor to the project was Dave Parnas who was instrumental in stopping a project which had got completely out of control. Etna was fired. $2M was spent in manpower alone over the 5-8 years of the project.
Modular Computer System: this was a prototype for HXDP. It was just a paper design.
Distributed Processor/Memory: this was a paper study contract for a real time control system which included both the processor design and the interconnection structure. There were local EXECs with a global EXEC that floated around the processors. Texas Instruments was awarded the contract to simulate the system. They did this in SIMSCRIPT and changed the design to some extent. By this time the Air Force was beginning to doubt the wisdom of producing its own processors rather than buying them and the project was dropped.
HXDP: funded 50% by Honeywell and 50% by the Navy. The real-time area was considered of interest because however little you did in terms of distribution it would be useful and, however much you did, there was always the opportunity to include more.
Characterisation of Selected Data Multiplexing and Distributed Processing Systems: the Navy awarded the contract because they were getting many different systems put to them for purchase. They already had standard bus components (for example, SDMS, Shipboard Data Multiplexing System) and they were unclear as to why you could not just connect processors together with these. They wanted a check list to be used to evaluate and highlight the important features of a system.
Fibre Optics Based Distributed Processing System: with the high bandwidth available, they were required in the contract to indicate how the bandwidth not required by the application could be used. The study considered examples where the bandwidth was 20 times that needed by the application.
Decentralised EXEC: this was a sub-part of the HXDP project and has been progressing very slowly. It was internally funded. So far the kernel has been partly coded and that is all. All EXEC functions are handled by messages. It sounded very much like the GEC 4000 Series EXEC. I promised to let them have details of the GEC 4000 hardware.
Distributed Processor Systems Design Methodological Solution Space Characterisation: again this was a report to elaborate the important points to be considered. For example, what is needed in the area of communications protocols, message addressing (when does binding take place), software interfaces and the kernel's view of the hardware etc. All the issues are raised with possible answers. Again it provides a check list for system design. This project is not yet finished.
Distributed Processing Rationale: unclear what this was about but I have a Report.
HXDP Experiments: this is a separate contract and is designed to evaluate the success of HXDP. It mainly checks low-level functions like fault-tolerance etc rather than how it performs in an application environment.
Characterisation of Bus and Loop Control Mechanisms: a contract to study how allocation of bus resources should be handled. Suppose we have a number of processes wishing to use a bus and we need to allocate, say, 29% of the resources to process A, 11% to process B etc. Each process has a vector of 0s and 1s associated with it. A pointer moves down the vector and only one process has a 1 in each position indicating that it has use of the bus. The vectors are made up so that process A's 29% is scattered through the vector. There are a number of strategies for producing the vector depending on the characteristics of bus accesses. Also, they are exploring the possibility of dynamically changing the vectors (this work is being done in association with IRIA).
Integrity for Bus-Structured Distributed Processing Systems
Decentralised Resource Management: both projects are in an early stage of development. They want to develop strategies for error detection and to define what decentralisation means.
Modular Missile Borne Computer: this is the next US weapon defence system. The intention is to have a set of distributed processors on every guided missile and these will be coupled to a synchronous satellite network providing them with long range radar information. The missiles can be thought of as nodes in a complex communications network where the topology is changing (ie which missiles still exist and are in contact range). It is possible that each missile will have 35 16-bit processors with a processing power of 50MIPS. The missiles talk to each other concerning strategy, tactics etc.
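
The bus-allocation vectors described above under Characterisation of Bus and Loop Control Mechanisms can be illustrated with a minimal Python sketch. The report does not say how Honeywell generated the vectors, so the credit-based spreading rule used here is purely an assumption, as are the names build_slot_vectors and bus_owner and the third process C (added only so that the shares total 100%).

```python
def build_slot_vectors(shares):
    """Build one 0/1 vector per process from integer shares.

    shares maps a process name to an integer weight (e.g. a percentage).
    The vectors have length sum(shares.values()) and exactly one process
    holds a 1 in each position. The spreading rule is an assumption: every
    slot each process gains credit equal to its share, the process with
    the most credit wins the slot and pays back the total.
    """
    total = sum(shares.values())
    credit = {name: 0 for name in shares}
    vectors = {name: [] for name in shares}
    for _ in range(total):
        for name, weight in shares.items():
            credit[name] += weight
        owner = max(credit, key=credit.get)
        credit[owner] -= total
        for name in shares:
            vectors[name].append(1 if name == owner else 0)
    return vectors


def bus_owner(vectors, pointer):
    """Return the process whose vector holds a 1 at the pointer position."""
    length = len(next(iter(vectors.values())))
    position = pointer % length  # the pointer wraps round the vector
    for name, vector in vectors.items():
        if vector[position]:
            return name
    raise ValueError("no process owns this slot")


# Process A gets 29% and B 11% as in the text; C is hypothetical.
vectors = build_slot_vectors({"A": 29, "B": 11, "C": 60})
print("".join(bus_owner(vectors, slot) for slot in range(20)))
```

With this rule each process receives exactly its share of the slots over one pass of the vector, and its slots are scattered rather than bunched. A different strategy would be needed if bus accesses were bursty, which is presumably why several strategies were being studied.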

Doug Jensen was interested in either me or somebody else from the Distributed Computing Programme acting on the Programme Committee for the First International Conference on the Design of Distributed Processing Systems to be held from 3-5 October 1979 in Huntsville, Alabama with tutorials before the Conference on 2 October. The conference is sponsored by the US Army Ballistics Missile Defence Advanced Technology Centre at Huntsville, IEEE and IRIA. There would be a special issue of IEEE Transactions on Computing which would contain the best papers.

He wondered if SRC would act as a UK sponsor. It would entail doing the PR work for the conference - local advertising etc. I agreed to pass information about the Conference around the interested groups in the UK. Main dates are:

Call for Papers: Oct/Nov 1978
Papers due: 15 April 1979
Maximum number of delegates: 1000
Chairman: Charles R Vick (BMDATC)
Vice Chairmen: Raymond Yeh, Gerard Le Lann, Hideo Aiso
Programme Chair: Doug Jensen

There might be some funding for speakers. Emphasis is on System Wide Control and Decentralised Operating Systems.

Doug Jensen will be in the UK in the middle of October for an Infotech talk and again in December. I agreed to get the DCS people together so that he would need to give only one or two talks.

4. Parallel Processing Conference

The conference was held at an expensive Hilton hotel miles from anywhere. It was designed mainly as a ski resort so resembled Blackpool out of season.

The conference had a heavy influence towards hardware architecture and design rather than software. I will add the abstracts of the papers to this document. The main point that came over was that you could not define half the architectures as loosely coupled or tightly coupled nor could you differentiate between SIMD and MIMD systems. Some of the architectures allowed you to work in both modes and dynamically changed from one to the other.

It was very evident that nobody had any real feel for what was a good general purpose architecture. However, there were a number of papers aimed at quite specific architectures for particular problems including finite element analysis, data base manipulation, compilation etc.

The main machine that seemed on the horizon was PHOENIX which is being defined as a multiprocessor system for the government by some agency. The major fault with the conference was a lack of Proceedings which will not be published for another four months.

Major papers that interested me (some more details in my notes):

Smith: US Army machine which was a mixture of Data Flow and Von Neumann with modifications to FORTRAN to define the Single-Assignment Variables.
Kim: an interesting presentation on Fault-Tolerant Computers with Recovery Blocks etc a la Randell.
Rothstein: a character who talked about bus automata to solve all the known problems in the world. He thinks neurons are made out of bus automata.
Timsit: a French company ADERSA is marketing a multiprocessor machine called PROPAL II for about £75K. Speed is 2 to 7 times CDC 7600 on problems.
Vanaken: a tree of processors to act as a pipeline for arithmetic computation.
Jordan: an FE (finite element) machine.

5. Computer Image, Saturday August 26

I spent Saturday visiting Computer Image in Denver. I was looked after by Hal Abbott who is Vice President in charge of Production and is an ex-Los Angeles film person.

We spent quite a bit of time going through many reels of animation showing the breadth of work they now do. They have come a long way in the last four years. The last time I was in Denver they were having a hard job to find work whereas they now have a fully booked schedule for CAESAR until mid-October already. They seem to be doing a great deal of business with Australia and South America. Most of it is still TV Commercials but there are a number of other interests. For example:

In 1974, a Pan Am jet crashed in Pago Pago killing nearly everybody. The original court hearing gave the reason as pilot error. Computer Image, as part of the Defence, produced a film showing the down winds in tropical storms. Apparently the plane hit the storm just as it was about to land. Computer Image think that this is the first time computer animation has been used in a courtroom.
They are currently negotiating with ABC News to produce animated weather maps with some very realistic cloud formations. Each day Computer Image will get the forecast about midnight. They send the animation by satellite to ABC to arrive by 04.00 and it goes out at 06.00 every morning in the early news programme. The cost of the satellite time is currently prohibitive. ABC will, however, have their own satellite next year.
They make government commercials as fill-ins for the Navy. These are shown on-board ship. I saw one depicting Custer's Last Stand with a young officer telling Custer that he really should persuade all the men to write a will. They take about two days for 6 minutes of animation with lip sync.
A number of teaching films showing army recruits how to dismantle particular machines (like early Fleischer).
They are hoping to produce an animated cartoon rather like the static political ones in the daily papers. Each day, a topical storyboard will be worked out from the day's news and they have to get it finished during the day! I saw a beautiful one of President Carter singing in the rain with his umbrella always in the wrong place and he eventually drowns.

There has not been a great deal of change at Computer Image; they still only have one CAESAR machine and one SCANIMATE. The major changes to CAESAR have been in the remote interface. They have developed a console which can be used at a distance. It has all the functions on the main machine and artwork can be entered either centrally or at the remote console. They have upgraded their video equipment. Film output is produced using Image Transforms Inc who did Star Wars.

They have sold off the switching side of the company and this has released a number of R&D personnel to work on a CAESAR replacement which will also include a SCANIMATE. The machine used will be a Texas Instruments computer. They have a number of frame buffers which will be added to the new system and it will be possible to transmit artwork from site to site.

For filler material, like the low-cost Navy films, they charge as little as $700/min whereas their commercial rate is $6200/day. They produce between 30 secs and 1 min of film per day (roughly $6200 to $12400 per minute) so their prices are in line with that of the industry.

They are just producing a new reel of film containing a number of clips and have promised to send me a copy.

6. Salt Lake City, Monday 28 August, 1978

The main reason for the visit to the University of Utah was to see Al Davies who is funded by Burroughs to develop a Data Flow Machine. My visit coincided with visits from Dan Friedman of Indiana and David Dahm from Burroughs. Consequently, I had less time with Al Davies than I would have liked. Instead I managed to find out a great deal about operators such as fons in LISP! Friedman is a LISP freak and a bit hard going at times. He promised to send me about 300 papers.

The motivation for the Data Flow Machine is eventually to produce a stand-alone product. The main programming features they were looking for were:

Distributed Control
Problem Dependent in Structure
Verifiable
Parallel
Nice for the programmer, not the machine
Data driven

These have all been included in a programming form called DDNs for Data Driven Nets. These are similar to the Standard Data Flow Graph. In particular, recursion is allowed.

The major motivations for the hardware were:

Support parallelism
Such support should be both horizontal, that is splitting the computation to allow parts to be done in parallel and pipelined where several computations can flow through the pipe one after another
Distributed control - no module can enforce synchronous operation in other modules
Extensibility - and this implies no tuning which is hardware specific and implies small modules
Asynchronous environment
Recursive
Cost Performance Right - implies that it is constrained by the standard LSI constraints in terms of pin connections etc. Also must aim for single chip and high volumes of these

The basic processing unit, a PSE, consists of these main parts:

AP: Atomic Processor with input queue (IQ) and output queue (OQ)
ASU: storage
MAPPER: speeds up access
8-way switch

The AP processes a part of the data flow program. It receives data on the input queue and returns results on the output queue (OQ). It may send work to be done to at most 8 low-level processors.

Storage for the AP is contained in the ASU which is a complex structured storage device. This was mainly due to earlier tests showing that 30% of all work done by a system is in storage accessing and management.

The Processor accesses items in the ASU by functions such as DELETE, ASSIGN, READ etc where the normal address is replaced by a name or a tree-structure position. For example, 1-3-3, ie 3rd node of 3rd node of 1st node. The ASU does its own management of free space and tells you when it is full.
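
The style of access described above, where a tree-structure position such as 1-3-3 replaces the normal address, can be sketched in Python. This is only an illustration under stated assumptions: the class name ToyASU, the StorageFull exception, the method signatures and the node-count capacity are all hypothetical, and nothing here reflects how the real ASU hardware is organised.

```python
class StorageFull(Exception):
    """Raised when the toy ASU has no free nodes left."""


class ToyASU:
    """Toy tree-structured store addressed by paths such as '1-3-3'."""

    def __init__(self, capacity):
        self.capacity = capacity  # maximum number of nodes (assumed unit)
        self.used = 0             # nodes currently allocated
        self.root = {"value": None, "children": {}}

    @staticmethod
    def _parse(path):
        # '1-3-3' means 3rd node of 3rd node of 1st node.
        return [int(part) for part in path.split("-")]

    def _count(self, node):
        return 1 + sum(self._count(c) for c in node["children"].values())

    def assign(self, path, value):
        """ASSIGN: store a value, creating intermediate nodes as needed."""
        node = self.root
        for index in self._parse(path):
            if index not in node["children"]:
                if self.used >= self.capacity:
                    raise StorageFull("ASU is full")
                node["children"][index] = {"value": None, "children": {}}
                self.used += 1
            node = node["children"][index]
        node["value"] = value

    def read(self, path):
        """READ: return the value stored at a tree position."""
        node = self.root
        for index in self._parse(path):
            node = node["children"][index]
        return node["value"]

    def delete(self, path):
        """DELETE: remove a node and its subtree, freeing the space."""
        *parents, last = self._parse(path)
        node = self.root
        for index in parents:
            node = node["children"][index]
        removed = node["children"].pop(last)
        self.used -= self._count(removed)


asu = ToyASU(capacity=4)
asu.assign("1-3-3", "x")
print(asu.read("1-3-3"), asu.used)
```

The store manages its own free space and reports when it is full, which mirrors the behaviour described for the ASU; the processor never sees a conventional address.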

The MAPPER is an add-on extra which speeds up accesses to the ASU by keeping the top part of the file tree, down to a defined level, in order to make tree accessing that much quicker. At the moment, you set the level (currently 3) in the hardware. It is likely that there will eventually be a programmable option. In general, the MAPPER will dynamically update itself. Sometimes, it may be necessary to ask the ASU for help. It speeds up storage accesses by a factor of 8 to 12. The current MAPPER is reasonably simple. A more complex version would be considerably more expensive.

Using current LSI technology, it is possible to put the AP, 4K ASU, and MAPPER each on a 40-pin 8 micron package. With 2 micron technology, all three can go on a single package.

Some PSEs will require more storage and it is possible to increase the size up to 128K.

The 8-way switch is currently a problem as it requires 114 pins and even a 4-way switch would require 62 pins. The decision has been to go for a cascade of 2-way switches (38 pins). In fact it runs at the same speed and, in terms of responses, it would probably have to be done that way in any case as 2:1 arbitration is easy to do.
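
How a cascade of 2-way switches can stand in for a single 8-way switch can be sketched as follows. The sketch assumes the cascade is a balanced binary tree and that each switch is steered by one bit of the destination number; the report states neither, and the function names are hypothetical.

```python
def switches_needed(fan_out):
    """Number of 2-way switches in a balanced cascade with fan_out outputs."""
    if fan_out < 2 or fan_out & (fan_out - 1):
        raise ValueError("fan_out must be a power of two")
    return fan_out - 1


def cascade_route(destination, fan_out=8):
    """Return the 0/1 choice made at each level of the cascade.

    The most significant bit of the destination steers the first switch,
    the next bit steers the second, and so on (an assumed convention).
    """
    levels = switches_needed(fan_out).bit_length()  # 3 levels for 8 outputs
    if not 0 <= destination < fan_out:
        raise ValueError("destination out of range")
    return [(destination >> (levels - 1 - level)) & 1
            for level in range(levels)]


print(switches_needed(8), cascade_route(5))
```

Each switch only ever arbitrates between two requests, which is the 2:1 arbitration the text describes as easy to do.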

A DDN program is run by sending it to the top PSE, which decides whether it is capable of executing the whole program itself (in which case it will be run serially) or whether sub-parts should be passed to lower level PSEs. Execution parallelism is achieved by having parallelism in the program and pipelining. To make decisions on partitioning, the program carries with it information concerning the total storage requirements and some information about program structure. Even so, it is possible for a low-level PSE to execute a piece of the program which, due to variable length data items, it cannot handle. In this case, it asks its higher level PSE to hold parts of its data so that it can carry on. The top PSEs tend to have more storage, therefore, than the lower ones.

There is a loading task that has to be completed before an arbitrary DDN can be executed. It has to be mapped into a hygienic form.

First, cycles must be identified and an attempt made to allocate each of these to one PSE. Secondly, transitive paths are removed. For example, a program that goes from A to B to C with another path from A to C would be transformed into a program with two paths from A to B and then two paths from B to C.
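
The removal of transitive paths can be sketched in Python using the A, B, C example from the text. The net is represented as a count of paths per (from, to) pair; the function names and the search strategy are assumptions made for illustration, and the sketch assumes cycles have already been dealt with so that the net is acyclic.

```python
from collections import Counter


def find_detour(edges, src, dst):
    """Find a path src -> ... -> dst with at least one intermediate node."""
    stack = [(succ, [src, succ])
             for (node, succ) in edges if node == src and succ != dst]
    while stack:
        current, path = stack.pop()
        for (node, succ) in edges:
            if node != current:
                continue
            if succ == dst:
                return path + [dst]
            stack.append((succ, path + [succ]))
    return None


def remove_transitive_paths(edge_list):
    """Reroute every short-cut path along a detour through other nodes."""
    edges = Counter(edge_list)  # (from, to) -> number of parallel paths
    changed = True
    while changed:
        changed = False
        for (src, dst) in list(edges):
            detour = find_detour(edges, src, dst)
            if detour is None:
                continue
            count = edges.pop((src, dst))
            for hop in zip(detour, detour[1:]):
                edges[hop] += count  # the short cut now rides along the detour
            changed = True
            break
    return edges


result = remove_transitive_paths([("A", "B"), ("B", "C"), ("A", "C")])
print(dict(result))
```

For the example net the direct path from A to C disappears and the counts on A to B and B to C both become two, matching the transformation described above.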

This net is then transformed into an SP-GRAPH (Series Parallel) in one of two ways. Either you aim for minimum work or minimum time (the latter may force you to do the same computation twice to increase the asynchronous behaviour of the program). The production of the initial SP-GRAPH is done at compile time and its formatting for minimum time or work is done at run time. It is possible to transform one of these into the other and vice versa.

Consider the following DDN:

DDN

The least work algorithm forces synchronous calls, if they do not exist, to appear at the interfaces 1 to 4.

DDN: Forced Synchronous Interface

The least time algorithm is to start at the end and work backwards finding the dependencies at each level. There is an assumption that the times for each function are equal:

DDN: Least Time
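
One reading of the least time algorithm, starting at the end and working backwards with every function taking the same time, is sketched below. The net used is hypothetical and is not the DDN of the figures; the function name and the representation (each node mapped to the nodes that consume its output) are also assumptions.

```python
def levels_from_end(successors):
    """Group nodes by their distance from the end of the net.

    successors maps each node to the nodes that consume its output.
    Level 0 holds the final nodes; a node sits one level further back
    than the furthest-back node that depends on it.
    """
    level = {}

    def depth(node):
        if node not in level:
            consumers = successors[node]
            level[node] = 0 if not consumers else 1 + max(
                depth(consumer) for consumer in consumers)
        return level[node]

    for node in successors:
        depth(node)

    stages = {}
    for node, distance in level.items():
        stages.setdefault(distance, []).append(node)
    return [sorted(stages[distance]) for distance in sorted(stages)]


# Hypothetical net: start feeds p and r, p feeds q, q and r feed merge.
net = {
    "start": ["p", "r"],
    "p": ["q"],
    "q": ["merge"],
    "r": ["merge"],
    "merge": ["end"],
    "end": [],
}
stages = levels_from_end(net)
print(stages)
```

Nodes that land in the same stage have no dependency on one another and so are candidates for execution in parallel.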

The allocation of the SP-GRAPH to specific processors then follows. Each FORK should now have an equivalent JOIN and, if not, this can be achieved by inserting Dummy nodes:

DDN: Adding Dummy Nodes

The FORK and JOIN nodes are allocated to the same PSE. Thus the top PSE might execute the A node on the way down, sending off the rest of the computation to other PSEs and, when they are complete, it will execute the F node. Similarly, the next lower-level PSE might execute Z and E sending off D and B for sub-PSEs.

B and C would most likely be executed by the same PSE. It is possible that, eventually, each PSE will have two processors, one doing the forward processing and the other the backward. In which case, B could be allocated to one and C to the other.

Currently, a single PSE is in existence made out of CMOS hardware. It is connected to a DEC 20. Bob Barton, who started the project, insisted that all projects should not be influenced by earlier hardware and so should build everything from scratch. Would you believe they had to build a disc controller and a terminal!

Programs are down-line loaded from the DEC 20 and the top PSE starts some computation and attempts to pass sub-parts to the lower level PSEs for execution. These requests are fielded by the DEC 20 and, when the top PSE is held up, the DEC 20 reallocates it to the other tasks. Consequently, the whole of the program is executed on the single PSE under the scheduling of a DEC 20.

Programs are written in a low-level machine code form:

((20)(40,50,60))

or an assembly form:

ADD INPUTS (+2) OUTPUTS (XYZ)

or a graphical form (still under development on a Tektronix 4014).

They are reasonably happy that the PSE is functioning correctly. They have quite a bit of hardware monitoring the PSE at the moment. The information passed at the moment is in a character mode and they are redoing the hardware to pass file structures around. Once they are happy with this, the intention is to make a 200 processor system by designing their own chips. The file-oriented data system should be about 100 times faster than the current one which is really slow. They have used CMOS hardware because of its resistance to noise which tends to plague prototype systems of this type.

I have three detailed papers describing the system.

I also talked to Bob Keller who is also interested in Data Flow but for a LISP machine. He sees the system as a tree structure with processors at the leaves and the nodes of the trees providing communication paths. For N processors, this gives, at most, 2 log N nodes to go through. The processors will have a single large shared memory.
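As a rough check on the 2 log N figure, the sketch below counts the tree nodes on the route between two leaf processors. It assumes a complete binary tree with leaves numbered 0 to N-1; the fan-out, the numbering and the function names are illustrative assumptions, not details taken from Keller's design.

```python
import math


def nodes_between(a: int, b: int) -> int:
    """Count the internal tree nodes on the route from leaf a to leaf b.

    Leaves are numbered 0..N-1 in a complete binary tree (an assumption).
    """
    if a == b:
        return 0
    up = 0
    # Climb from both leaves one level at a time until they meet
    # at their lowest common ancestor.
    while a != b:
        a //= 2
        b //= 2
        up += 1
    # 'up' nodes on the way up (including the common ancestor)
    # and 'up - 1' nodes on the way back down.
    return 2 * up - 1


def worst_case(n: int) -> int:
    """Longest route from leaf 0; by symmetry this is the overall worst case."""
    return max(nodes_between(0, b) for b in range(n))


if __name__ == "__main__":
    for n in (8, 16, 64):
        print(n, worst_case(n), 2 * math.log2(n))
```

Under these assumptions the worst case is 2 log2 N - 1 nodes, which sits just inside the quoted bound.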

The system is demand driven rather than data driven. Because of the LISP background, where functions such as CONS are not executed until needed, this is a more sensible approach than data driven which, in LISP, would cause a great deal of unnecessary and possibly undefined computation. The application area of interest is symbol manipulation. The work is still at a very early stage and all they have is a paper machine with no working simulator.
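The difference can be illustrated with a small sketch of a demand-driven CONS, in which the tail of a list is held as a delayed computation and only evaluated when something asks for it. The class and function names are hypothetical and this is not a description of Keller's machine, only of the evaluation rule.

```python
class Lazy:
    """A delayed computation, evaluated at most once and only on demand."""

    def __init__(self, fn):
        self._fn = fn
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            self._value = self._fn()
            self._done = True
            self._fn = None  # release the closure once it has run
        return self._value


def cons(head, tail_fn):
    """Build a list cell whose tail is not computed until it is needed."""
    return (head, Lazy(tail_fn))


def integers_from(n):
    # An unbounded list: safe only because the tail is built on demand.
    return cons(n, lambda: integers_from(n + 1))


def take(cell, k):
    """Return the first k heads, forcing no more tails than necessary."""
    out = []
    while k > 0:
        head, tail = cell
        out.append(head)
        k -= 1
        if k > 0:
            cell = tail.force()
    return out


if __name__ == "__main__":
    print(take(integers_from(1), 3))
    # The undefined tail (a division by zero) is never demanded, so never run.
    print(take(cons(1, lambda: 1 / 0), 1))
```

A data-driven evaluator would try to compute every tail as soon as its inputs existed, which never terminates for the first list and fails on the second.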

Several people from the two groups are likely to be in Europe for one conference or another during the next year and I invited them to come and talk to the Distributed Computing community of SRC.

7. Livermore - Graphics

I spent my first day at Livermore talking to the Graphics Group and looking at their equipment, in particular the DICOMED. The machine is quite impressive. It is well engineered with a filter system that appears to be made out of steel for stability. The machine differs from the FR80 in that the filter system sits between the tube and the camera. It has both primary and secondary filters on a wheel. Livermore do not generate primaries on their FR80 by putting in two filters as they assured me this blurred the image. Instead, they draw the vector with one secondary filter in and then draw it again with the other secondary in place. We should try this.

The DICOMED allows you to map each requested intensity level on to an internally-defined value. The relevant look-up table can be loaded by the user. This allows the user to compensate for one colour compared with another and can also ensure that a set of intensities with equal intervals between them has equal differences in intensity. The DICOMED is driven by a VARIAN and has a Hewlett-Packard electrostatic display as a monitor. Livermore tend to use the FR80 displayer format as input on the DICOMED. It is faster than the FR80, especially when the picture contains lots of short vectors. The gain can be as high as a factor of 8, although 2 is more likely.
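A user-loaded table of this kind might look like the sketch below. The table size of 256 entries and the power-law response with an exponent of 2.2 are assumptions chosen for illustration; the DICOMED's real table size and film response are not given here.

```python
def build_table(levels=256, gamma=2.2):
    """Map requested levels to device values for an assumed power-law response.

    Both 'levels' and 'gamma' are illustrative assumptions.
    """
    top = levels - 1
    return [round(top * (i / top) ** (1.0 / gamma)) for i in range(levels)]


def apply_table(table, requested):
    """Translate requested intensity levels through the look-up table."""
    return [table[value] for value in requested]


if __name__ == "__main__":
    table = build_table()
    # Equal steps in the requested levels become unequal device values,
    # chosen so that the resulting output steps come out equal.
    print(apply_table(table, [0, 64, 128, 192, 255]))
```

A separate table per filter would give the colour-against-colour compensation described above.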

The DICOMED uses an ACME camera which looks better than the FR80 ones. It has a dual movement and two registration pins. They had a 4in square colour plate film camera which produces far better output than the 35mm.

Livermore now have three FR80s, one colour system and two black and white systems. They have thrown out all their Kennedy and Pertec tape decks and replaced them by Ampex TMA8s. These certainly look more robust. Apparently, the interfacing is almost equivalent. They have had trouble with their hardcopy camera which lets in light. However, as they run their FR80s with the doors removed, I am not surprised. They have worn out several movements on their microfiche camera.

The white phosphor FR80 can be run with 1-hit for colour output. The tube has lasted a year. They run it at a filament voltage of about 7 volts. Their hardcopy paper is considerably better in quality than ours. However, they need 8 to 12 hits to get any output using it. They use a Kodak 009 black and white film which is considerably better than DACO A. I am trying to get a roll of film from them to try.

As well as FR80s, they also have two Honeywell 18,000 lpm Xerox-type printers which can generate graphical output. They also have a colour Xerox and several Versatek printers. They produce engineering drawings using the Versatek.

The main interactive terminal used at Livermore is a standard VDU for text I/O together with a video output system. They have a video disc which can store 1024 lines for 96 separate channels. On an unloaded system, a user can get 10 frames out per second. Conversion to raster output is done by the 7600. The video disc does have a hardware character generator but no vector generator. The terminals are connected to the system via coax. They are aiming for 800 monitors of this type. They have produced a cursor input device which locally inserts the cursor marker into the raster display as it is being drawn.

The standard site system is still four 7600s plus two STAR-100s. Bulk storage is provided by an IBM Photo Store and a CDC Mass Storage Unit. It amazes me that either works! The IBM Photo Store keeps the pieces of film in little boxes with lids on. The relevant box is sucked by a pneumatic process to the read head. The lid is taken off and the film removed for reading. The CDC store consists of rolls of 2in wide magnetic tapes in small cylinders.

They have an AYDIN frame buffer for war games. I think they would have preferred to wait and get one of the new systems. I promised Sara Bly a copy of the Raster Display Survey.

Nelson Max has joined the group and continues producing films of space-filling curves. They are now 3-dimensional. He also has a solid ball highlighted molecule film which is out of this world for quality. Len should have seen it at SIGGRAPH. I am trying to get a copy before I leave.

The major trouble with Livermore is the incredible security. They had to have the whole place cleaned up before I was allowed in. All three FR80s were turned off. No output was visible anywhere. If I succeed in getting the roll of Kodak film (blank) or the Nelson Max film out, it will be a miracle.

Livermore have the Knowlton Molecule program nearly available for distribution. They also have a slide generation program which looks good. The new Archuletta package is unlikely to be available before the end of the year. At least it is going to run on the Varian so there should be fewer problems in moving it.

8. Xerox PARC

Xerox's Palo Alto Research Centre (PARC) is situated just outside Palo Alto in the foothills and is in a very nice setting. I spent most of the time talking to Martin Newell, Schroeder and another person who was in charge of graphics developments.

The most famous of the systems developed at PARC is the Alto personal computer that everybody, including secretaries and typists, has in their office. These are connected together via Ethernet. This uses a coax cable less than 1 kilometre in length, which can transmit at a speed of 3.2 Mbits/sec. Various equipment is attached to the coax using a connection called a stinger which can be attached to the line anywhere. Attached to the stinger is a Transceiver and connected to that is either an Alto or a file server, Xerox printer etc. The Transceiver listens to the line before transmitting. This tends to make collisions on the line rare. The most connections they have on an Ethernet at the moment is about 115 and in a 1-minute interval there is rarely more than 10% of the bandwidth used. Overall it tends to be about 1%.

The maximum number of connections possible on an Ethernet is 128 (just due to the length of the name field). Currently Xerox have about 10 ethernets connected together by gateways. The gateway is also used to connect phone lines. The main servers on each ethernet are Xerox printers, a file server and a name look-up server. The latter is needed as all destinations are given by name and translated into a specific device on a specific ethernet by the look-up.
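The look-up step can be pictured with the sketch below, which translates a name into a pair of ethernet number and device number and decides whether a gateway is needed. The class, the example names and the numbering are all hypothetical; only the limit of 128 devices per ethernet comes from the visit.

```python
MAX_DEVICES = 128  # per-ethernet limit quoted during the visit


class NameServer:
    """Toy name look-up: name -> (ethernet number, device number)."""

    def __init__(self):
        self._table = {}

    def register(self, name, net, device):
        if not 0 <= device < MAX_DEVICES:
            raise ValueError("device number does not fit the name field")
        self._table[name] = (net, device)

    def resolve(self, name):
        return self._table[name]


def route(server, source_net, name):
    """Say whether a named destination is local or reached via a gateway."""
    net, device = server.resolve(name)
    via = "direct" if net == source_net else "gateway"
    return via, net, device


if __name__ == "__main__":
    server = NameServer()
    server.register("printer-a", net=1, device=17)   # hypothetical names
    server.register("fileserver-b", net=3, device=2)
    print(route(server, 1, "printer-a"))
    print(route(server, 1, "fileserver-b"))
```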

The standard Alto consists of a 400 by 600 (looks A4 size) video terminal together with a cartridge disc and 128 Kbytes of memory. The disc holds about 3000 pages of 512 bytes. Users can have additional discs although, in general, this is not needed as the information that overflows the disc is usually kept on the fileserver. The local Alto computer keeps a bit map of the display in memory so that all character fonts are equally available. The display is very crisp and clear and very good to use in a text mode. The user has a mouse and 5-key piano-style set of buttons.
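The figures quoted above can be turned into storage sizes with a few lines of arithmetic. The one assumption is a display of 1 bit per pixel, which was not stated explicitly during the visit.

```python
PAGES = 3000               # pages per cartridge disc, as quoted
PAGE_BYTES = 512           # bytes per page, as quoted
WIDTH, HEIGHT = 400, 600   # display size, as quoted
MEMORY_BYTES = 128 * 1024  # 128 Kbytes of memory, as quoted
BITS_PER_PIXEL = 1         # assumption: a black and white bit map

disc_bytes = PAGES * PAGE_BYTES
bitmap_bytes = WIDTH * HEIGHT * BITS_PER_PIXEL // 8
bitmap_share = bitmap_bytes / MEMORY_BYTES

if __name__ == "__main__":
    print(f"disc capacity : {disc_bytes} bytes")
    print(f"display bitmap: {bitmap_bytes} bytes")
    print(f"share of memory taken by the bitmap: {bitmap_share:.1%}")
```

On these figures the disc holds about 1.5 Mbytes and the bit map takes a little under a quarter of the memory.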

PARC's main role in life is to try out ideas and, if they look successful, they may get taken over by some other part of Xerox as a product. The R&D does not seem to have any real immediate products in view.

Major utilities that have been developed for the Alto include an excellent Editor, Page Assembler, Message System, File Transfer etc. The bad points are that they all seem to have been developed independently so that they all have different screen layouts. You seem to have to use the file transfer system to get anything from one system to another and that tends to be slow as the disc chugs away moving the information around. The system, apart from the display, is not good in the sense that response is poor for anything that requires remote file access.

Most of the software has been developed using BCPL although they have recently gone over to a Pascal-based language called MESA (I have a manual).

The Message System, Laurel, is quite impressive and obviously used. The system keeps a list of all your messages from other people. These seem to range from System Bugs, General enquiries about whether you want to attend a certain conference, trip reports, letters from other people etc. The screen is partitioned under user control with different windows containing different items - current message you are looking at, list of latest messages etc.

William Newman and a few others have developed an automated office system which tends to mimic current methods of filing etc. It is being tried out at one of Xerox's Offices and the designers have moved out of PARC to look after the system implementation. This includes typing in the total set of all the files in the Office over a weekend.

The group interested in office automation at PARC are exploring the problems involved with the psychology of interaction. They have also been examining specific offices in detail. What they find is that office procedures change quite rapidly and producing a system that models the office at a specific time is useless as it will be different by the time the implementation is complete. They see the need for having a system that can be changed by the users and this brings up the whole problem of putting this in the right environment for typists etc.

The visit was a little disappointing in that they are quite happy with their ethernet system and so little work is being done on distributed computing. There is some work on distributed databases and I have a report. Unfortunately, the person involved was away at a Conference.

There is a conference on Teleinformatics run by IFIP in Paris in June 1979. This will explore the effect of computing in the office and via TV etc.

The other person I spent time with was involved in the work being done on colour video systems. They have a great deal of equipment produced locally and most of it dates back to 1969. They have a video system which has shift registers rather than a frame buffer. Information is sent to the display in the form "the next N bits are all red", etc.
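Sending "the next N bits are all red" amounts to run-length coding of each scan line. A minimal sketch follows; the representation of a run as a (colour, count) pair is an assumption, not the format used by the PARC hardware.

```python
from itertools import groupby


def encode(pixels):
    """Collapse a scan line into (colour, run length) pairs."""
    return [(colour, len(list(run))) for colour, run in groupby(pixels)]


def decode(runs):
    """Expand (colour, run length) pairs back into individual pixels."""
    return [colour for colour, count in runs for _ in range(count)]


if __name__ == "__main__":
    line = ["red"] * 3 + ["blue"] * 2 + ["red"]
    runs = encode(line)
    print(runs)
    print(decode(runs) == line)
```

The saving depends on the picture: large flat areas compress well, while a line that changes colour at every point needs more data than a plain frame buffer.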

They have a good system called SUPERPAINT which allows pictures to be painted on the terminal which is 480 by 640 by 8 connected to a Nova computer. There is a separate Tektronix colour display which contains the menu selection. This allows you to select a brush and paint for drawing. It is possible to superimpose artwork via a TV camera. The system is being used to do all the TV work associated with the Venus probe. They already have a lot of visuals worked out and the TV commentator on the day will be able to change the display etc. The whole system is going to be moved to the TV Studio nearer the time.

They are currently working on a new system which does not have a standard frame buffer. Instead the memory can be partitioned into parts and associated with specific areas of the screen. The logic between the memory and the display will ensure that the images get placed on the right part of the screen. It will be possible for one area of memory to be output in several places. The system sounds quite interesting and should allow immediate movement of information in the X and Y directions with little trouble as all that needs to be updated is the X and Y coordinates that define the picture origin.
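The idea can be sketched in software as a set of memory blocks, each carrying the screen coordinates of its origin. The names and the simple overwrite rule for overlapping blocks are assumptions; in the PARC design the placement is done by logic between the memory and the display, not by a program.

```python
class Region:
    """A block of picture memory plus the screen position of its origin."""

    def __init__(self, pixels, x, y):
        self.pixels = pixels  # list of rows, shared rather than copied
        self.x = x
        self.y = y


def compose(regions, width, height, background=0):
    """Place every region on the screen; later regions overwrite earlier ones."""
    screen = [[background] * width for _ in range(height)]
    for region in regions:
        for dy, row in enumerate(region.pixels):
            for dx, value in enumerate(row):
                sx, sy = region.x + dx, region.y + dy
                if 0 <= sx < width and 0 <= sy < height:
                    screen[sy][sx] = value
    return screen


# One area of memory output in two places on the screen.
block = [[1, 1], [1, 1]]
first = Region(block, 0, 0)
second = Region(block, 4, 1)
before = compose([first, second], 8, 4)

# Moving the picture only changes the origin; the pixel memory is untouched.
first.x, first.y = 2, 2
after = compose([first, second], 8, 4)
```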

9. Livermore - Mainframes

The second day at Livermore was spent talking to the people in the main computer centre and the people involved with the S1 project. The day started with the usual hour delay while they cleared the machine room, put up curtains so that you could not see anything and found a policeman to escort you.

I was given a tour around the 7600s and STAR-100s. The guide said, at one stage, "and behind these curtains is a DEC 10". It was not obvious why that was secret as it was only acting as a front end processor for the 7600s.

Sam Mendicino, head of systems, showed me their two mass storage devices. The first was their old IBM Photo Store which I mentioned earlier. It has a higher capacity than the CDC one, one tape being equal to one box of film, but the access is slower. It is suffering from mechanical problems and appears to be kept working by the efforts of the IBM engineers. IBM are about to withdraw maintenance for the device and Livermore will then have to replace it. They have just finished an Operational Requirement for the successor and I have a copy of this. They want a replacement for archival storage. In theory, the IBM system can retrieve 0.5 Mbits/sec.

The CDC device, CDC 38500 (the new IBM device is the 3850!), consists of rolls of 2in wide magnetic tape in tubes. Access can take as long as 15 secs but then the whole of the tape is available. It is, therefore, easy to get off a great deal of information once it has been accessed. Livermore are interested in Plessey's holographic store and also a polymer device patented by Leaventhal of Columbia. I got the impression that it knocks atoms out of polymers to store data.

A large part of the morning was spent in me describing the Distributed Computing Programme of SRC. Livermore talked a little about their views on the subject. They really see the problem as one of distributing computer power. They have purchased the Thornton/Jones intelligent bus which allows you to put tapes and discs etc on the bus and access them from a whole range of computers, both 7600s and midis.

As they had not even heard of PRIME, they were talking about VAX machines for the midi role. Basically they want to do software development on the midi systems and batch runs on the 7600s and STAR-100s. They see the central mainframes and midis connected to far midis by an OCTOPORT. The far midis have a slow 1MHz connection to the mainframes while the near midis share discs and so transporting jobs is instantaneous. The old Octopus system is being replaced by NLTS (New Livermore Timesharing System). They hope to have a central Disc farm of 300 Mbyte drives for all the systems. I think PRIME will get a visit from them after I mentioned their activities.

The S1 Project is funded by the Navy and is run by Lowell Wood, an ex-physicist, who keeps their old Stretch in working order in his garage. The aim of S1 is to provide the Navy with a flexible system capable of handling most of their on-shore and ship-board computing facilities in the 1980s. It can be thought of as defining three parts:

A Design System
A Single Processor System
A Multi-Processor System

The machine has a basic architecture with 128 registers, large virtual memory, 36-bit word, 9-bit bytes etc. The 36-bit word was chosen to give the necessary address space.

Currently, the Navy uses systems called UYK-7s which run at about 0.5 MIP. The new machines in the S1 project are the Mk 1, Mk 2A (containing increased floating point capability) and the Mk 3. All are to be produced using ECL logic. The performance expectations of the three systems are:

Mk 1: 20 times a UYK-7 which is between 0.25 and 0.5 of a 7600. About 0.3 of a 7600 running Pascal programs. Aim would be 0.5 Mbytes main memory, 3.3 Kwatts power consumption and cost about $80K.
Mk 2A: 50 times a UYK-7 which is 1.5 times a 7600 or 6 times a 7600 if hand-coded. Similar amount of memory but with a power consumption 4.7 Kwatts at a price of $165K.
Mk 3: 200 times a UYK-7. Aim would be 2 Mbytes of memory with a power consumption down to 1.5 Kwatts and price down to $150K.
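The three targets are easier to compare when reduced to common units. The sketch below takes the UYK-7 as exactly 0.5 MIP, which is only the approximate figure quoted above, so the derived numbers are rough estimates and not project figures.

```python
UYK7_MIPS = 0.5  # "about 0.5 MIP" as quoted; treated as exact here

# name: (speed relative to a UYK-7, price in dollars, power in kilowatts)
MODELS = {
    "Mk 1": (20, 80_000, 3.3),
    "Mk 2A": (50, 165_000, 4.7),
    "Mk 3": (200, 150_000, 1.5),
}


def summarise(models=MODELS, base_mips=UYK7_MIPS):
    """Derive MIPS, dollars per MIP and MIPS per kilowatt for each model."""
    rows = {}
    for name, (factor, price, kilowatts) in models.items():
        mips = factor * base_mips
        rows[name] = {
            "mips": mips,
            "dollars_per_mip": price / mips,
            "mips_per_kw": mips / kilowatts,
        }
    return rows


if __name__ == "__main__":
    for name, row in summarise().items():
        print(name, row)
```

On these assumptions the Mk 1 comes out at about 10 MIPS and $8,000 per MIP, the Mk 2A at 25 MIPS and $6,600 per MIP, and the Mk 3 at 100 MIPS and $1,500 per MIP.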

The whole concept is to allow new models with the same architecture (possibly extending the order code) to be produced in about a year rather than the current estimate of about 5 years. Thus, there will be an ability to use new products as soon as they become cost effective.

The current aim is to produce Mk 2A processors in 1980 and Mk 3 later.

The design system is a top-level structured approach allowing the computer to be defined initially in terms of its register structure then breaking that down to the next level of complexity and so on. Hopefully, the lower levels will use standard products. For example, the Mk 1 has 280 modules of which 130 are standard and could be used in a wide range of applications. 90% of the boards will be directly usable in Mk 2A.

The hardware throughout will be built out of ECL logic. Most of the backwiring will be wirewrap and done automatically. The design package consists of:

SUDS: Stanford University Design System, used just for composing the graphics even though it is a design system in its own right.
M: Macro Expander
R+ECO: this sorts out I/O, Pins, Wiring etc. The ECO part allows you to change the design. It then tells you which pins to disconnect etc
TRL: Transmission Line Analysis

The Mk 1 was designed after using SUDS for just 30 hours on a DEC 10KL, 28 hours of M and 84 hours of R+ECO. The whole design was complete in about 24 man months and it took 3 man months to build the machine.

The multiprocessor architectures envisaged are:

A 16 by 16 processor/memory with a Crossbar Switch in 1979
A 16 by 16 processor/memory using Mk 2A processors in 1980
A 16 by 16 processor/memory using Mk 3 somewhat later

The 16 processors are connected to 16 memory blocks via the crossbar switch.

Each processor has 32 Kwords of data cache and 16 Kwords of instruction cache so that clashes on the store should be infrequent. The IBOX for the machine is microcoded and has sufficient space for the standard machine instruction set, the instruction set of a machine to be emulated and some spare space for diagnostic programs.

Each processor has a diagnostic processor associated with it as does the crossbar switch. The floating point processor in the Mk 3 will also have a vector floating point capability and special hardware for FFTs. I/O is through a 2 Kword I/O data store which allows buffering of I/O at a speed of 540 Mbits/sec. The I/O subsystem looks rather like a PDP11. The cache hit rate on a Mk 1 is about 99%.

They feel that a crossbar switch is the best method for connecting processors of this type and power, and that it is inexpensive. They estimate the cost of the crossbar as 3% of a 16 by 16 Mk 2 system, which is about 80% of the cost of one processor. Also only 25% of the cost goes up with N². The rest grows linearly with N.
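The scaling claim can be explored with a small model. It assumes that the 25%/75% split applies at the 16 by 16 size, that the rest of the system grows linearly with N, and that a single switch could be built larger than 16 by 16; none of these points was confirmed during the visit, so the output is indicative only.

```python
def crossbar_cost(n, base_n=16, quadratic_part=0.25):
    """Crossbar cost relative to the cost of a base_n by base_n switch.

    Assumes quadratic_part of the base cost scales with N squared
    and the remainder scales with N.
    """
    scale = n / base_n
    return quadratic_part * scale ** 2 + (1 - quadratic_part) * scale


def crossbar_share(n, base_share=0.03, base_n=16):
    """Fraction of total system cost spent on the crossbar at size n.

    Assumes everything except the crossbar grows linearly with n.
    """
    crossbar = base_share * crossbar_cost(n, base_n)
    rest = (1 - base_share) * (n / base_n)
    return crossbar / (crossbar + rest)


if __name__ == "__main__":
    for n in (16, 32, 64):
        print(n, crossbar_cost(n), f"{crossbar_share(n):.1%}")
```

Under these assumptions doubling the system to 32 by 32 makes the switch 2.5 times as expensive, but it remains a small percentage of the total.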

They see larger systems as having one port acting as a gateway between crossbar switches.

A large Mk2 system (16 by 16) will fit in a 30ft by 50ft room (5ft cabinets in an X shape). A ground system might have a power of 250 MIPs while a sea-board or air-borne system would be about 100 MIPs.

The system is designed so that there is no limit on the distance apart that processors can be placed. The cost of interconnecting cable, even spread out over a battleship, is only about 2% of the cost of the whole system.

A large 16 by 16 Mk2 system would cost about $3.5M. It would contain about 4 interchangeable cross-bar switches so that memory, processors and switches can all go down and the system will only degrade.

The systems currently have compilers for FORTRAN, Pascal etc. They will implement the DoD IRONMAN language when it is finally defined irrespective of the parallel processing language constructs available. The project seems totally hardware driven and less thought has gone into producing software which can be proven correct and unable to deadlock. It was difficult even to get them to understand the questions being raised. The hardware is certainly quite exciting. Most of the software has been developed very quickly by the Group at Stanford. However, a little more thought could be put into how to program the systems.

I managed to bring back two films and a DICOMED Manual.