The Design Philosophy of the G-EXEC System

Elizabeth M. Gill and Keith G. Jeffery

March 1976

Computers and Geosciences, Vol 2

Keith Jeffery: Computer Unit, Institute of Geological Sciences

Elizabeth Gill: Atlas Computing Division, Rutherford Laboratory

Abstract

The G-EXEC generalized data-handling system is a computer-based facility with which a user may handle his data. It is generalized so that it handles almost any type of data, and is written in ANSI FORTRAN IV so that it may be implemented on almost any computer. The system provides for data storage, retrieval, analysis, and display together with utility software to handle data-base maintenance. The integration of processes, the diversity of data-handling capability, and the user-friendly commands are the key design features.

INTRODUCTION

The authors started work on the G-EXEC computer-based, data-handling system in November 1972. The immediate reason for developing such a system was the simultaneous need for data-handling systems for different geoscience disciplines and the inability of two people to develop four different, specialized data-handling systems in the time available. During the previous year some existing generalized data-management systems were studied but they failed to meet at least two of the design criteria the authors considered essential in a generalized data-handling system. It was on the basis of those design criteria that G-EXEC was developed.

DESIGN AIMS

The first criterion is generalization. G-EXEC has data-independence; that is, the software is not dependent on the data, and a data description describes the data to the system (and also to the user). The use of a data description to achieve generality is a well-known technique. It is used by all the main generalized data-management systems and, although somewhat disguised under the term data-definition language, forms part of the CODASYL recommendations (CODASYL, 1971).

In geoscience, an early use of the data-description concept (since 1964) is embodied in the ROKDOC package (Loudon, 1974). The G-EXEC data description is more informative than most, additionally containing information on the upper and lower limits of values for each field and an absent-data code for each field to distinguish absent (not determined) and zero (not detected) values.
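The role of the limits and the absent-data code can be illustrated with a minimal sketch. It is written in Python rather than the FORTRAN IV of the original system, and it does not reproduce the actual G-EXEC data-description format. The names `FieldDescription` and `classify`, the field `ZN_PPM`, and the code -999.0 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDescription:
    """One entry of a hypothetical data description (not the G-EXEC format)."""
    name: str
    lower: float        # lowest acceptable value
    upper: float        # highest acceptable value
    absent_code: float  # sentinel meaning "not determined"


def classify(value: float, field: FieldDescription) -> str:
    """Classify a stored value against its field description."""
    # Check the absent-data code first, since it lies outside the limits.
    if value == field.absent_code:
        return "absent"
    if value < field.lower or value > field.upper:
        return "out of range"
    if value == 0:
        return "not detected"
    return "valid"


# Hypothetical field: zinc concentration in parts per million.
zinc = FieldDescription(name="ZN_PPM", lower=0.0, upper=10000.0, absent_code=-999.0)

for v in (-999.0, 0.0, 125.0, 20000.0):
    print(v, classify(v, zinc))
```

Keeping the absent-data code in the description, rather than in the program, means the same software can treat a different sentinel correctly for every field.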

G-EXEC is also machine-independent. The software is in ANSI FORTRAN IV and therefore is supported on almost any computer. This allows the organization using G-EXEC to change machines if necessary without incurring a huge overhead in program conversion. Most data-management systems are written in a machine-specific manner for obvious commercial reasons.

This paper is published with permission of the Director, Institute of Geological Sciences, and of the Director, Atlas Computing Division, Rutherford Laboratory.

The system was conceived as an integrated whole: it was designed top down and, once defined, built bottom up. This structured approach is reflected in the standard software design, the standard variable names, and the modularity, which allows each software unit to perform a unit-task; the complexity of the task depends upon the level of the software unit in the hierarchy. The one data structure common to the whole system was recognized to be functionally the processing form and structurally the array or matrix. Consequently, all the software acts upon one or more functional processing form files; this form is the G-STAR standard file. This, of course, does not preclude functional storage or input form files being structurally different from the G-STAR form.

Such a system must be easy to use. The user instructs the system to perform work for him by means of simple commands. These activate processing software units, each designed to do a unit of work meaningful to the user. Examples of such units of work are to sort a file, to do simple statistics on a file, or to plot a trend-surface map of a given order using three given fields of the file as X, Y, and Z values. The user commands are interpreted by a software package called the controller, which is the interface between the G-EXEC system (including its software, data, and users) and the computer installation. The controller generates FORTRAN and Job Control Language for the host computer from the user commands. The controller for any particular installation is written to take advantage of any display or plotting facilities available and to control any output routing facility.
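The idea of one command per unit of work can be sketched as a small dispatcher. This is an illustration only, in Python: the real controller generated FORTRAN and Job Control Language rather than calling functions directly. The command words `SORT` and `STATS`, the assumed `<VERB> <FIELD>` syntax, and all function names below are hypothetical.

```python
from statistics import mean


def sort_file(records, field):
    """Unit of work: return the records ordered by one field."""
    return sorted(records, key=lambda r: r[field])


def simple_stats(records, field):
    """Unit of work: minimum, maximum and mean of one field."""
    values = [r[field] for r in records]
    return {"min": min(values), "max": max(values), "mean": mean(values)}


# Hypothetical command table: command word -> processing unit.
COMMANDS = {"SORT": sort_file, "STATS": simple_stats}


def run_command(line, records):
    """Interpret a command of the assumed form '<VERB> <FIELD>'."""
    verb, field = line.split()
    try:
        unit = COMMANDS[verb.upper()]
    except KeyError:
        raise ValueError(f"unknown command: {verb}") from None
    return unit(records, field)


records = [{"X": 3.0, "Z": 10.0}, {"X": 1.0, "Z": 30.0}, {"X": 2.0, "Z": 20.0}]
print(run_command("SORT X", records))
print(run_command("STATS Z", records))
```

Adding a new unit of work in this sketch means adding one function and one table entry, which mirrors the modular extension the paper describes.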

The users are not the only personnel considered in ease of system use. Software staff time is expensive, and so the integrated approach to design allows easy maintenance, modification, or extension of the system. The structured software modules with standard variable names are the key features.

The system requires a system manager at each installation. He controls the allocation of storage space for files and maintains the files which interface the controller to the installation. The system facilitates his work by providing him with a journal for every job which passes through the system, accounts of resource usage by each job (together with an invoice generator program), and system performance monitoring information. The system manager also has control of system and file security.

The software is efficient. Direct disk access methods are used whenever possible in the data-handling modules. The use of fixed length fields in processing data vectors ensures immediate access to the particular computer word(s) holding the data referenced.
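Why fixed-length fields give immediate access can be shown with a short sketch: the position of any field follows from simple arithmetic, so no scanning is needed. The record layout assumed here (three 4-byte floats per record) and the function name `read_field` are hypothetical, and an in-memory stream stands in for a direct-access disk file.

```python
import io
import struct

# Assumed layout: each record holds three 4-byte floats (X, Y, Z).
RECORD_FORMAT = "<3f"
RECORD_LENGTH = struct.calcsize(RECORD_FORMAT)  # 12 bytes
FIELD_LENGTH = struct.calcsize("<f")            # 4 bytes


def read_field(stream, record_index, field_index):
    """Read one field by computing its byte offset directly."""
    offset = record_index * RECORD_LENGTH + field_index * FIELD_LENGTH
    stream.seek(offset)
    (value,) = struct.unpack("<f", stream.read(FIELD_LENGTH))
    return value


# An in-memory stream stands in for a direct-access disk file.
data = io.BytesIO()
for row in [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)]:
    data.write(struct.pack(RECORD_FORMAT, *row))

# Second field (Y) of the third record.
print(read_field(data, record_index=2, field_index=1))
```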

G-EXEC is flexible with regard to data structure. This desirable flexibility is achieved by storing the data structure of a data base externally to the data files. Thus, the structure can be processed independently of the data, and the structure can be modified without changing the data values. Furthermore, the data structure can be modified without changing the structure of the data values as recorded on the storage medium (e.g., disk pack).
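A minimal sketch of the principle, assuming a very simple structure description that maps field names to column positions. The field names and the `view` function are hypothetical and do not represent the G-EXEC file formats.

```python
# Data values, stored as plain rows with no embedded field names.
rows = [
    (101, 54.2, 0.8),
    (102, 61.7, 1.1),
]

# The structure lives outside the data: field name -> column position.
# All field names here are hypothetical.
structure_v1 = {"SAMPLE": 0, "SIO2": 1, "TIO2": 2}

# A revised structure renames two fields and hides the third.
structure_v2 = {"SAMPLE_NO": 0, "SILICA": 1}


def view(rows, structure):
    """Present the unchanged rows through a given structure."""
    return [{name: row[col] for name, col in structure.items()} for row in rows]


print(view(rows, structure_v1))
print(view(rows, structure_v2))
```

Both views read the same stored rows; only the external description differs.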

Conventional data-base management systems require the user to access the data base by writing a program in a language such as FORTRAN or COBOL. Some provide a user-orientated query language, but none seem to provide the simple, user-friendly command set and the range of processing available to a G-EXEC user. Most existing data-base management systems have a logical storage structure corresponding to a set of hierarchically ordered records in a file. G-EXEC stores sets of the same type (and containing data values) in files, and so models the relational data-base concept. Furthermore, the use of external indexing of multiple files provides external structural mapping of the database; a facility of importance where several different structural (or conceptual) views of a database are required concurrently.
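External indexing across several files can be sketched as follows. Each file holds records of one type, and an index kept outside the files maps a key value to the records that carry it, wherever they are stored. The file names, field names, and the functions `build_index` and `lookup` are hypothetical.

```python
from collections import defaultdict

# Two hypothetical files, each a list of records of one type.
files = {
    "BOREHOLES": [{"SITE": "A1", "DEPTH": 120.0}, {"SITE": "B7", "DEPTH": 85.5}],
    "ANALYSES": [{"SITE": "B7", "ZN": 310.0}, {"SITE": "A1", "ZN": 95.0},
                 {"SITE": "A1", "ZN": 102.0}],
}


def build_index(files, key):
    """Map each key value to (file name, record number) pairs."""
    index = defaultdict(list)
    for file_name, records in files.items():
        for number, record in enumerate(records):
            index[record[key]].append((file_name, number))
    return dict(index)


def lookup(files, index, value):
    """Fetch every record, from any file, that shares one key value."""
    return [files[name][number] for name, number in index.get(value, [])]


site_index = build_index(files, "SITE")
print(lookup(files, site_index, "A1"))
```

Several such indexes, built on different keys, can coexist over the same files, which is one way to picture the concurrent structural views mentioned above.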

IMPLEMENTATION

At the Institute of Geological Sciences, London, G-EXEC has been implemented as a system for geologists and others to use in their everyday work. The system is installed on the IBM 360/195 of the Science Research Council's Rutherford Laboratory and accessed over a remote job entry link from a PDP 11/10 terminal in London.

The software totals some 40,000 lines of source code and is split into packages for convenience of documentation. The packages are:

At present the IGS implementation has some 350 files ranging through field-geology, bore-hole records, geochemical field data, geochemical analysis data, bulk mineral-resource data, mineral production and trade statistics, paleontology, petrology, geotechnical data, hydrogeological data, structural geology data, and administrative files. The files range in size up to 10,000 records and in record length up to 500 bytes. There are some 40 active users and the system handles up to 700 G-EXEC jobs per month. Each G-EXEC job is two jobs to the computer-operating system. The system also is implemented at IGS Edinburgh on the in-house PDP 11/45 and also on the ERCC (Edinburgh Regional Computing Centre) IBM 370/158. Smaller versions of the system are in the:

ASSOCIATED CONCEPTS

G-EXEC is envisaged as a systematic approach to the handling of geoscience and other data. In the national framework, the concept of national geoscience data banks held centrally by IGS, with satellite processing sites (such as universities) both for input to the central data banks and for retrieval and subsequent local processing of derived files from the banks, is gaining acceptance. The exchange of data between centres is of great importance, and to that end a communication methodology for use between generalized data-base management systems has been devised by Prof. Peter Sutterlin, who has worked in the UK for one year with the G-EXEC team.

We have discovered that some scientists from other centres are keen to develop software with the G-EXEC philosophy and a real chance of inter-centre cooperation in software development seems possible. It is hoped that cooperative work on data and software may lead, in time, to joint research and increased knowledge.

FUTURE PLANS

It is intended that there shall be greater variety of input techniques into the system to try to lower the energy barrier that currently exists with conventional input methods. Similarly, the acceptance of computer-output images is limited and more work is required to make the output more meaningful to the user.

As the volume of data stored increases it will be necessary to do more work on the utilization of a variety of logical storage structures to mirror the models of the world studied and on the efficient mapping of those logical structures onto computer hardware storage devices.

Finally, the concept of generalized integrated simulation of geoscience processes based on the systems approach must remain a firm objective if we are to gain understanding of the earth.

REFERENCES

CODASYL, 1971, Programming languages: ACM Data Base Task Group (DBTG), New York, 269 p.

Loudon, T. V., 1974, Analysis of geological data using ROKDOC, a FORTRAN IV package for the IBM 360/65 computer: IGS Rept. 74/1, 131 p.

Note

Later papers on G-EXEC written by ACL staff or the NERC Unit situated at Chilton are available at G-EXEC.COM. The most relevant are:
