FORTRAN Compilers and Loaders
R E Thomas
ACD: Engineering Paper No 42
This paper gives details of the differences between the various implementations of FORTRAN, and the facilities offered by the loaders on the machines proposed for the Interactive Engineering Facility. The ICL machine considered is the 1906A, although recent events suggest a 2980 should be added. Unfortunately, insufficient evidence is available to give much of a comparison.
The FORTRAN section includes the results of two benchmarks. One set of three programs gives an indication of the type of code produced by the compilers. The other program gives an indication of how errors are handled.
2. FORTRAN LANGUAGE FACILITIES
This chapter deals with the extensions that are available in the various machine dialects. It serves to emphasise the diversity of these additions - in some cases, the same syntax has been used to mean different things.
It is possible to consider some extensions in two ways. They increase the features available to the FORTRAN programmer, giving him more freedom of action. However, if there is any chance that the program will be transferred to another machine, the inconvenient use of these extensions causes considerable problems. This is particularly true when some standard restrictions are relaxed. In this case, it is extremely useful to have a compiler option which will indicate deviations from the standard and also to have a manual which provides this information.
Comments are made under various type headings, giving details of compiler variations. There is also a section on the type of manuals provided. The language versions considered are:
- ICL 1906A
- Univac 1100
- Burroughs B6700
- CDC Cyber 170
2.2 FORTRAN manuals
All the manuals are designed as reference manuals, with separate chapters on various sections of the language. The DEC and CDC manuals explicitly describe the online use of the language. Both give clear indications of where their versions differ from standard ANSI FORTRAN (although DEC is not completely accurate on this) and ICL gives an appendix listing the differences; Univac and Burroughs give no such indication.
DEC and Univac give sections on how to program, with hints on pitfall avoidance.
CDC provide a considerable number of program samples.
2.3 Character sets
Univac allow lower case.
DEC allow a wider range of extra characters.
CDC allow ≠ for '.
DEC and Burroughs allow more than one statement on a line, separated by ;. CDC also allow this, but use $ as separator.
DEC have a special online free format syntax, allowing tab to skip to column 6 or 7. If used, only digits are allowed to indicate continuation.
Comment lines may be indicated by $ or * (CDC) or $, *, / or ! (DEC). Burroughs and DEC allow in-line comments (% and ! respectively).
Both CDC and DEC have special debug lines. DEC has normal FORTRAN statements with D in column 1, which can optionally be compiled.
ICL allows 32 character names. Burroughs allows 31 dimensional arrays while UNIVAC and ICL allow 7. DEC has no limit and CDC 3.
CDC have debug control statements indicated by C$.
CDC specify listing directives by C/. Burroughs use $ for compiler control words.
The following table shows the type and accuracy of the constants available (number of digits or characters). Hex constants are indicated by Zn. Octal constants are indicated by On (Burroughs, Univac), nB (CDC) and "n (DEC). ICL only allow Hollerith constants in data initialisation, or parameters.
All compilers allow Complex and Logical constants. CDC and Burroughs allow .T. as an abbreviation for .TRUE.
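The octal notations listed above differ only in how the digits are marked, which a short sketch can make concrete. The Python below is an illustration and not part of the survey: the function name `parse_octal` and the regular expressions are assumptions, and only the three notations named in the text (On, nB and "n) are modelled.

```python
import re

# Octal-constant notations as described in the text. The regular
# expressions are an illustrative assumption, not vendor grammar.
OCTAL_PATTERNS = {
    "Burroughs": re.compile(r"^O([0-7]+)$"),   # On
    "Univac": re.compile(r"^O([0-7]+)$"),      # On
    "CDC": re.compile(r"^([0-7]+)B$"),         # nB
    "DEC": re.compile(r'^"([0-7]+)$'),         # "n
}


def parse_octal(dialect, text):
    """Return the integer value of an octal constant written in `dialect`."""
    match = OCTAL_PATTERNS[dialect].match(text)
    if match is None:
        raise ValueError(f"{text!r} is not an octal constant in {dialect}")
    return int(match.group(1), 8)


# The same value written in three notations.
print(parse_octal("Univac", "O777"))
print(parse_octal("CDC", "777B"))
print(parse_octal("DEC", '"777'))
```

A constant written for one machine is rejected by the pattern for another, which is the portability problem the paper describes.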
The following table gives details of number of characters in a variable name, type of array subscript and maximum number of dimensions.
|Feature||CDC||Burroughs||Univac||ICL||DEC|
|Names||7||Truncated to 6||6||32||Truncated to 6|
|Type of subscript||Arithmetic expression||Arithmetic expression, including negative||Arithmetic expression||Integer expression||Arithmetic expression|
|Maximum Dimension||3||31||7||7||No limit|
2.7 Types of Expression
CDC allow masking expressions, which involve using logical operators .AND. etc on non-logical variables.
CDC also allow the full set of relational expressions with COMPLEX data types. DEC allow relations only for .EQ. and .NE.
DEC allow ^ for exponentiation, and have defined .XOR. and .EQV. Sign is used to indicate true or false, and arithmetic expressions can be included in logical operations. < > and = are allowed.
Burroughs have defined .IS. for identity. There is also a VALUE construct, used in I/O type control.
Univac use character expressions as a special type. Strings may be concatenated with &. Logical operations on non-logical variables are performed by typeless functions.
ICL, Univac and CDC allow multiple assignment statements. The first two use commas, while CDC repeatedly uses equals.
Univac allow character assignments, and CDC allows masking assignments.
This section is separated into the individual control statements, since there is a considerable amount of diversity.
2.9.1 Computed GOTO
Control variable out of range causes a fatal error in CDC, while all the others allow control to pass to the next statement.
The comma may be omitted in CDC, Burroughs and Univac, but should be included in ICL and omitted in DEC.
CDC and DEC allow expressions in place of the variable.
Univac allow blank labels to indicate the next statement.
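The out-of-range behaviour described above can be modelled in a few lines. This is a minimal sketch in Python rather than any vendor's FORTRAN: the names `computed_goto` and `FatalRunTimeError` are hypothetical, and the only rule encoded is the one stated in the text (CDC fails, the others fall through to the next statement).

```python
class FatalRunTimeError(Exception):
    """Stands in for the fatal error the text attributes to CDC."""


def computed_goto(labels, index, dialect):
    """Model GOTO (l1, l2, ...), I.

    Returns the label to jump to, or None when control passes to the
    next statement.
    """
    if 1 <= index <= len(labels):
        return labels[index - 1]
    if dialect == "CDC":
        raise FatalRunTimeError(f"control variable {index} out of range")
    return None  # all other dialects: fall through


# GOTO (10, 20, 30), I
print(computed_goto([10, 20, 30], 2, "ICL"))
print(computed_goto([10, 20, 30], 4, "ICL"))
```

A program that relies on the fall-through case therefore runs on four of the five machines and stops on the fifth.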
2.9.2 Assigned GOTO
CDC, Burroughs and Univac allow the comma to be omitted. All except CDC allow the label list to be omitted. Univac allow blank labels.
2.9.3 Arithmetic IF
CDC allows a two-branch arithmetic IF. Univac allows blank labels.
2.9.4 Logical IF
CDC allows a two-branch logical IF.
2.9.5 DO loops
Univac allow an optional comma before the equals.
DEC allow any expression as a parameter (including negative), which will be truncated to integer.
ICL and Univac allow integer (positive) expressions as parameters.
CDC allow octal parameters.
Burroughs allow real parameters and a real control variable. The parameters can be altered from within the loop.
where x is:
- DEC, Burroughs: Any expression, which will be converted to an integer position in a list of labels on the subroutine call.
- Univac: As above, but the expression must be integer.
- ICL: As above, but only an integer constant or variable is allowed.
- CDC: A dummy variable name, corresponding to one specified in the RETURNS list in the Subroutine Statement.
All allow an IMPLICIT declaration to define data type.
ICL, Burroughs and Univac allow data initialisation in a type specification. In addition, Univac allow this in a dimension statement.
DEC allow minimum and maximum values of array dimensions to be given.
CDC allow numbered COMMON blocks.
Univac have a type CHARACTER. An optional l is allowed in EXTERNAL.
DEC and Univac (and ICL) allow source to be included from elsewhere. Univac (and ICL) allow various editing to occur as well.
CDC and Univac have statements to specify where data is to be placed in store (LEVEL and BANK respectively).
Univac allow names to be given to constants (PARAMETER).
2.11 Subroutines and Functions
CDC and ICL require a statement to head a main program (PROGRAM and MASTER).
All allow label passing to subroutines. In each case except for CDC, this is indicated by a * in the dummy argument list. ICL and Burroughs use & label in the subroutine CALL, DEC uses $ label, and Univac allows either.
CDC use a separate RETURNS list on subroutine statement and CALL.
Multiple entry points are allowed. No argument list is allowed with CDC (assumed to be the same as that on the main routine).
Burroughs allow recursion.
Univac require a DEFINE for statement functions.
There is a considerable variety in type and scope of the intrinsic functions available.
This will be divided into various subsections.
All have a namelist feature. CDC and DEC use $ name at the head of the list, and $ at the tail. ICL and Burroughs allow & name at the head and &END at the end. Univac allows either.
2.12.2 List directed
CDC, DEC and Univac indicate this by * in place of the format statement number. CDC Minnesota FORTRAN omits the statement number, and allows layout variation by access to a special COMMON block.
Burroughs use / for input, and have various options for output.
ICL use FORMAT types with zero field width.
2.12.3 I/O from store
CDC, DEC and Univac use ENCODE, DECODE.
ICL use DEFBUF to associate an array.
Burroughs allow an array name in place of unit number.
2.12.4 File handling
There is a considerable diversity on how files are handled. Burroughs have a set of OPEN, CLOSE routines. DEC allows file names to be provided interactively at run time.
A specific record in a random file is identified by 'R (ICL, Univac), #R (DEC) or =R (Burroughs).
All except CDC allow action on end of file or data error. CDC Minnesota FORTRAN does allow action on end of file or data error by the unusual form .END. =, .ERR. =.
2.12.5 I/O Statements
All allow PUNCH and PRINT. DEC have ACCEPT, TYPE for terminal I/O, and also allow REREAD.
Burroughs allow expressions in I/O lists. It is possible to obtain status information about any transfer, and to get information on the connected device type.
CDC Minnesota FORTRAN has an unusual in-line form of PRINT.
2.12.6 Format Statements
CDC allow the format to control the conversion required. ICL insist that the I/O list and the format correspond. Burroughs allow differences to be trapped.
The following unusual facilities are provided:
- Cw (Burroughs) Right justified text
- E (CDC) Specify exponent length
- Iw.z (CDC) Minimum number of digits specified
- J (Burroughs) Left justified integer
- J (Univac) Zero filled integer
- K (Burroughs) Modifier for other types, inserts comma after every three numbers
- O (all except ICL) Octal
- R (CDC, DEC, Univac) Right justified text
- R (CDC Minnesota) Right justified, zero filled text
- S (Univac) Sign control
- U (Burroughs) Universal output type (any)
- V (CDC, Burroughs) Take character from I/O list as format type
- Z (CDC, Burroughs) Hexadecimal
- $ (DEC) Suppress carriage return on terminal
- $ (Burroughs) Output $ at this point
CDC and Burroughs allow variable field widths, where the value is taken from the I/O list (= and * respectively).
2.13 Strict ANSI
CDC provides a compiler option to test for variations from the ANSI standard. This is of great assistance when transferring programs between different machines.
No direct details of optimisation are available. However, it would appear that ICL is able to recognise common sub-expressions across statement boundaries, can take code out of DO loops and make use of its accumulators. It will generate simple DO parameter loops only in certain special cases, however. It is possible to request that functions with side effects be left in their proper place.
No direct details are available.
The optimiser attempts the following:
- Elimination of redundant computations by using registers or temporary workspace.
- Elimination of non-essential stores of variables.
- Movement of code out of loops.
- Replacement of the loop control variable, and expressions involving it, by temporaries.
- Maximum use of registers
It is noted that certain assumptions made by the optimiser may cause worse, or even erroneous code; for example:
- The initial portion of a loop is assumed to be executed less frequently than the loop.
- ASSIGNED GOTO's with incorrect lists will give incorrect results.
- The optimiser handles smaller programs than the standard.
Burroughs allow optimisation levels -1, 0 and 1. 0 is the default. Optimisation levels can be changed from one routine to another. The general features are:
- Slight local statement optimisation. Floating point constants with zero fractional part stored as integer.
- This stores floating point constants in normalised form, and insists on strict left to right operand evaluation.
- Full optimisation (details not available). Impossible if a subprogram has a label as formal parameter.
- This allows use of special hardware when compiling certain types of loop.
There are three optimisation levels:
- OPT = 0
- Local statement optimisation. Constant expression evaluation.
- OPT = 1
- Simple loop optimisation by using registers. Redundant expressions removed across statements.
- OPT = 2
- Values of simple variables not stored if not referenced. Code removed from loops. Code dependent on loop variable is simplified. Registers used in loops whenever possible.
In addition, there is an unsafe addition to option 2, involving:
- No check on indexed array references removed from loops.
- Assumptions on register preservation when calling math library function.
2.15 Brief Comparison Table
2.15.1 Ranked Features
|Size of Reals||4||3||2||1|
|Size of Octals||3||4||2||1|
|No of Dimensions||1||3||2||4|
2.15.2 Features Present
|Free Format Source||Yes|
|No need of PROGRAM statement||Yes||Yes||Yes|
|Variable field width||Yes||Yes|
|2 branch IF||Yes|
2.15.3 Table Reproduced from FORTRAN PROGRAMMING, 1970
|Feature||DEC||Univac||Burroughs||CDC|
|No of Continuations||20||infinite||infinite||20|
|Max Statement No||99999||32767||99999||99999|
|Decimal Precision||-38 to 38||-38 to 38||-47 to 68||-293 to 322|
|Integer max||11 digs||11 digs||12 digs||18 digs|
|Literal transfer||Yes ' '||Yes ' '||Yes " "||Yes * *|
|Max A field||5||6||6||10|
|Subscript, integer exp||Yes||Yes||Yes||Yes|
|Double Precision||16 digs||18 digs||23 digs||29 digs|
- DEC: PDP-10
- Univac: 1108
- Burroughs: B6500, B7600, B8500
- CDC: 6600, 7600 EXTENDED FORTRAN
This table is included for interest only. Any differences with the previous data are due to the early date of this survey.
3. SYNTAX ERROR REPORTING
3.1 The Program
The program has been constructed to try out a number of syntax errors, extensions and unusual features. It is of interest in that it highlights the differences between the various compilers in their attitude to the unusual.
- Line 1
- Comment before routine start should be accepted.
- Line 2
- 0 in column 6 implies no continuation. Spaces are ignored.
- Line 3
- COMMON block names can be the same as a dummy argument (YP) or a function (COM).
- Line 4
- COMMON blocks may be extended by using the name twice.
- Line 6
- The label should be accepted. Q(NQ) is illegal.
- Line 7
- COMPLEX has more than 6 characters and also has 7 dimensions.
- Line 9
- ZPLUS4 may be singly dimensioned in an EQUIVALENCE.
- Line 10
- Strictly illegal in ANSI to include an array in a statement function.
- Line 12
- DATA statements may occur anywhere.
- Line 15
- The label should be accepted.
- Line 17
- The subscript on ZPLUS4 is acceptable since the whole subscript is within range.
- Line 25
- Legal assignment statement.
- Line 26
- DATA statement may occur anywhere.
- Line 32
- Strictly, functions may not have hollerith arguments.
- Line 33
- Legal to have an arithmetic IF following a Logical IF. Labels 2 and 3 mean jump into DO loops.
- Line 34
- XF is the routine name, so this would imply recursion.
- Line 35, 36
- Label 10 defined on line 6.
- Line 38
- Unlabelled FORMAT
- Line 43, 44
- K and L are used in label situations (ASSIGN) and in arithmetic. Labels may not be arithmetically assigned.
- Line 45
- DATA statement may occur anywhere. However, the text string is probably too long for a single word.
- Line 46
- K should not be used like this.
- Line 50
- Not strictly illegal, but it does define a possible infinite loop.
- Line 52
- Jump back into DO loop.
- Line 54
- Adjacent operators.
- Line 58
- Long constant, normally truncated.
- Line 60
- Label is ignored.
- Line 61
- Strictly, NV is not defined since V has been assigned and is equivalenced to NV.
- Line 62
- Cannot read into a statement function.
- Line 65
- Expressions in IO list.
- Line 66
- Duplicate label.
- Line 68
- No reserved words in FORTRAN. Subscript expression strictly illegal, but usually accepted.
- Line 69
- Should be RETURN or jump before END.
- Line 70
- Strictly, END must be in cols 7, 8 and 9, but usually accepted.
- Function variable XF is never set.
0001 C THIS IS A COMMENT
0002 0INTEGER FUNC TION XF(YP)
0003 COMMON / YP / Q, NQ // Z(4) /COM/ COMX
0004 COMMON / YP / MORE
0005 C**
0006 1 0 DIMENSION S(12),FORMAT(8),Q(NQ)
0007 DIMENSION COMPLEX(1,1,1,1,1,1,1), ZPLUS4(1,8)
0008 C
0009 EQUIVALENCE (ZPLUS4(1),Z(1),NV,V)
0010 B(I)=S(I)
0011 +*1.
0012 DATA PB / 1H /
0013 COM(X) = (1.0)
0014 C(J) = FLOAT(J) + 9.0
0015 60 TAN(X) = SIN(X) / COS(X)
0016 C##
0017 ZPLUS 4(2,1) = 99999.
0018 X = 0.0
0019 WRITE(6,100 )
0020 1000 FORMAT(//2X,A1,(3(/2 H ))))
0021 TOT = 0
0022 RV = PB
0023 NV = 0
0024 C
0025 DO 1 I = 2.4
0026 DATA FORMAT / 8*0.0 /
0027 1 TOT = TOT + FORMAT(I)
0028 I = 1
0029 IF(YF.NE.(YF*YF)/YF) GOTO 9
0030 C
0031 IF(YF * * 2. * * 1. GT .-1. 0E - 4) GO TO 07
0032 N = NFUNC('9') - NFUNC(1H9)
0033 IF((3. LT ..4)) IF ( 3.-.4) 1,2,3
0034 Y = SIGN(XF(0.0),X)
0035 DO 10 L = 1,90
0036 10 IF(L.EQ.90) CONTINUE
0037 C
0038 FORMAT(3X)
0039 C
0040 DO 2 I = 1,10
0041 ASSIGN 4 TO L
0042 ASSIGN 3 TO K
0043 K = J
0044 IF (I.GT.5) K = L
0045 DATA A / 10H1234567890 /
0046 GOTO K
0047 3 CONTINUE
0048 CONTINUE
0049 DO 2 K = 1,9
0050 6 IF((VP)) 2,2,6
0051 2 CONTINUE
0052 GOTO 6
0053 C==
0054 4 I = -6/(-4)+3+MIN0(2,3)*-1
0055 GOTO (1,1,1,1),I
0056 GOTO 0009
0057 C
0058 000 9 JUNK = JUNK + NINTF(27,1234567890987654321)
0059 7 V=
0060 7+0
0061 WRITE(6,10) NV
0062 READ(5,200) B(I)
0063 READ5,200) T,FORMAT,((N,M), J=1,1,1)
0064 200 FORMAT(I2/(A4/))
0065 WRITE(6,2)(VP),((10),I+1,(9)
0066 2 FORMAT(F10,6)
0067 ENDFILE 6
0068 FORMAT (1*1*1) =0.0
0069 PAUSE 777
0070 0E N D
3.2 Review of errors
The table below shows the number of errors detected by the various compilers.
|CDC Minnesota MNF||12||48||5|
The following is a detailed review of these errors
- Line 1
- Comment before routine start should be accepted.
- Line 2
- CDC programs had a PROGRAM statement inserted in error at the front.
- Minnesota FORTRAN picked up NXF as a form of DIMENSION - it is not clear how this was achieved.
- Line 3
- ICL gave an error on duplication of common block name and variable.
- Minnesota FORTRAN commented on empty COMMON block name.
- Line 6
- All compilers noted Q(NQ) error.
- ICL ignored label.
- DEC F40 accepted label, but Q(NQ) error caused whole DIMENSION statement to be ignored.
- DEC FOR noted Q(NQ) error at the end. It gave an error on the label.
- Univac noted duplicate label 10 at this point.
- Burroughs ignored label.
- CDC ignored label.
- Line 7
- DEC FOR commented on truncated variable.
- Univac gave error for long name.
- Burroughs commented on truncated variable.
- CDC objected to more than 3 subscripts.
- Line 9
- CDC objected to single dimension of ZPLUS4.
- Line 10
- ICL comment that statement function is not referenced.
- Univac note that array S is never assigned a value.
- Line 12
- Minnesota does not allow DATA statements to appear anywhere.
- Line 13
- ICL comment that statement function is not referenced.
- Minnesota comment on strange form of function.
- Line 14
- ICL comment that statement function is not referenced.
- Line 15
- ICL ignore label and comment that statement function is not referenced.
- DEC FOR gives error on label.
- Burroughs ignore label.
- Minnesota ignores label, BUT still comments on label not being used!
- Line 17
- ICL comment on ZPLUS4 dimension
- Minnesota also comments on dimension.
- CDC FOR gives error.
- Line 22
- Minnesota gives warning that PB not defined.
- Line 23
- Minnesota comments on same left hand side as previous statement.
- Line 25
- Minnesota comments that this looks like a mispunched DO.
- Line 26
- DEC F40 gives error because FORMAT is not recognised as array (line 6 error).
- Minnesota only allows DATA at the beginning.
- Line 27
- Minnesota warns that items not yet defined.
- Line 29
- Minnesota warns that YF not yet defined. Also warns on floating point equality.
- Line 31
- ICL comment on order of evaluation of double exponentiation.
- Minnesota comments on constants, but gives error on double exponent without brackets.
- Line 32
- Minnesota warns that expression seems to be identically zero.
- Line 33
- Univac warns of jump into DO loop, and gives error on duplicate label 2.
- Burroughs gives error on duplicate label 2.
- Minnesota collapses statement to GOTO 3. Not clear if this is whole statement or just part of it, since logical IF is identically false.
- Line 34
- ICL gives error on XF reference.
- DEC gives error on XF reference.
- DEC FOR gives error also on wrong type in SIGN, assuming it to be a library routine (!).
- Burroughs allows recursion.
- Line 36
- DEC F40 gives error on duplicate label 10
- DEC FOR does not allow CONTINUE after logical IF, and ignores the label 10.
- Univac gives error on duplicate label 10.
- Minnesota gives warning on use of CONTINUE.
- Line 38
- DEC F40 does not comment.
- Univac and Minnesota give warnings only.
- CDC gives strange error.
- Line 40
- Burroughs objects to use of label 2 here (since it is taken as FORMAT number).
- Line 41,42
- Minnesota comments on labels appearing in ASSIGN but not in GOTO list.
- Line 43
- Minnesota warns that J is not defined.
- Line 44
- Minnesota warns on ASSIGN use of L.
- Line 45
- ICL gives error on too long constant.
- DEC F40 also gives error.
- Univac gives warning.
- Burroughs gives warning.
- Minnesota does not allow DATA statement.
- Line 46
- CDC insists on ASSIGNed GOTO list being present.
- Minnesota comments on lack of label.
- Line 47
- Minnesota comments on previous jump to label 3, and the fact that CONTINUE is not used as DO terminator (!).
- Line 48
- Minnesota comments on lack of label.
- Line 49
- ICL XFEH gives error on transfer of control into loop.
- Line 50
- ICL comment on own label reference.
- DEC F40 gives error on own label reference.
- Minnesota gives warning on YF not defined and on own label reference.
- CDC gives warning on own label reference.
- Line 51
- Burroughs gives error on previously defined label 2, although second label 2 is further on.
- Univac gives error on duplicate label 2.
- Line 52
- Minnesota gives warning on jump into DO loop.
- Line 54
- DEC F40, DEC FOR, UNIVAC do not give an error for * - combination.
- Line 55
- Minnesota comments on equivalence to GOTO 1.
- Line 56
- Minnesota and CDC ignore this line.
- Line 58
- Univac warns of too long number.
- Minnesota warns on number length and JUNK not being defined.
- CDC warns on number length.
- Line 60
- Minnesota warns on possible mispunching. Univac also comments on label field values.
- Line 62
- CDC does not object to this use of statement function in IO list.
- Line 63
- DEC F40 objects to use of FORMAT.
- Univac warns about implicit DO limits, and so do Minnesota and CDC.
- Line 65
- ICL objects to I+1 in IO list
- DEC F40 objects to statement function.
- DEC FOR objects to I+1 and label 2.
- Univac gives error on statement function.
- Burroughs allows the expression without comment.
- CDC and Minnesota object to Label 2.
- Line 66
- ICL does not notice duplicate label 2.
- Burroughs comments on first occurrence only, not at this point.
- Line 68
- DEC F40 does not recognise FORMAT as array.
- Minnesota has FORMAT as reserved word.
- Line 70
- ICL XFEH gave an error because the function has not been set.
- Burroughs warned that PAUSE should not be followed by END.
- Minnesota assumed STOP at END.
- CDC gave strange errors on loops beginning at this card (?).
3.3 Error Message Presentation
Errors are listed at the end of the program in a somewhat random order. No symbol maps are available.
The debugging compiler XFIH found fewer errors than the production compiler XFEH.
Error lines from the FOR compiler are buried in the text, and are difficult to disentangle. True errors are flagged by ?FTN, while comments are flagged by %FTN and the distinction is sometimes hidden. The error on line 6 was not recorded until the end of the routine.
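Since true errors and comments are distinguished only by their ?FTN and %FTN prefixes, a listing can be disentangled mechanically. The following Python is a minimal sketch under that assumption: the function names are hypothetical, the sample message texts are invented for illustration, and the real layout of a FOR listing may differ.

```python
def classify(line):
    """Classify one listing line by the prefix described in the text."""
    text = line.lstrip()
    if text.startswith("?FTN"):
        return "error"      # true error
    if text.startswith("%FTN"):
        return "comment"    # warning or comment
    return "source"


def split_listing(lines):
    """Group listing lines into errors, comments and source text."""
    groups = {"error": [], "comment": [], "source": []}
    for line in lines:
        groups[classify(line)].append(line)
    return groups


# Hypothetical listing fragment; the message wording is invented.
sample = [
    "      DIMENSION COMPLEX(1,1,1,1,1,1,1), ZPLUS4(1,8)",
    "%FTN  NAME TRUNCATED",
    "      Y = SIGN(XF(0.0),X)",
    "?FTN  ILLEGAL REFERENCE",
]
groups = split_listing(sample)
print(len(groups["error"]), len(groups["comment"]), len(groups["source"]))
```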
F40 marks its errors in the text very clearly, with pointers above and below. Statements are not numbered. It is possible to obtain a full list of symbols and constants.
Errors are flagged in the code, and are reasonably clear. They tend to occur before the line in error, but this is not always the case (duplicate label error occurs after the line).
Errors appear inline, and are quite clear. Statements are not numbered. However, the pointer system (indicating where in the line the error was detected) sometimes points to strange places. Some attempt is made to include the section of code in error in the message itself, but this can be misleading. End of line is signalled by ; which does not occur explicitly.
A number of loader errors are also flagged, such as no main program, unavailable functions.
The Minnesota Fortran MNF is used at Imperial in preference to CDC standard, as the debugging compiler. Messages are given inline. Fatal errors are marked on the right-hand-side of the listing. A considerable number of warnings and comments are generated. Some of these are very helpful (such as undefined variable use and possible mispunch), but others are misleading (particularly those associated with functions that might be in the system library).
The standard CDC compiler provides errors at the end. These are not particularly clear, and the fact that the source is listed with line numbers at every 5 statements does not help. Two unusual errors were generated at the end, but their meaning is unclear.
4. FORTRAN CODE PRODUCTION
Three of the programs (IF, GAMMA and BESSEL), used in the Bryant and Baylis FORTRAN evaluation study, 1968, were run on the compilers, at normal and optional levels of optimisation. The word lengths of the various machines are:
- ICL: 24 bit
- DEC: 36 bit
- Univac: 36 bit
- Burroughs: 48 bit
- CDC: 60 bit
Burroughs use a byte instruction system, geared to stack manipulation, and, although the number of orders generated is greater than the others, the number of bits used is comparable. CDC use a mixture of 15 bit (register) and 30 bit orders.
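The word lengths above are what turn a word count into a bit count when code sizes are compared across machines. As a rough check, the Python sketch below multiplies word totals by word length; the word and bit figures are copied from the first comparison table in section 4.4, while the function name `code_size_bits` is an assumption made for illustration.

```python
# Word lengths in bits, as listed in the text.
WORD_LENGTH_BITS = {
    "ICL": 24,
    "DEC": 36,
    "Univac": 36,
    "Burroughs": 48,
    "CDC": 60,
}


def code_size_bits(machine, words):
    """Convert a code size in words to bits for the given machine."""
    return words * WORD_LENGTH_BITS[machine]


# (machine, compiler, words, bits) taken from the comparison table.
TABLE = [
    ("Burroughs", "Opt", 136, 6528),
    ("Burroughs", "Unopt", 193, 9264),
    ("DEC", "FOR", 246, 8856),
    ("DEC", "F40", 250, 9000),
    ("Univac", "FTN", 267, 9612),
    ("Univac", "FOR", 291, 10476),
    ("ICL", "XFEW", 389, 9336),
]

for machine, compiler, words, bits in TABLE:
    computed = code_size_bits(machine, words)
    status = "agrees" if computed == bits else "differs"
    print(f"{machine} {compiler}: {words} words -> {computed} bits ({status})")
```

The ranking by bits therefore differs from the ranking by words: ICL needs the most words but, with 24 bit words, falls in the middle of the table when bits are counted.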
4.2.1 1906A
ICL produce two FORTRAN compilers, one for optimisation and one for compiling and testing. XFEH was used here (the optimiser under GEORGE 4). There is no compiler option to list the code, so it is necessary to load the routine and dump the core to find out what has been compiled. There is therefore no assistance given in matching FORTRAN statements to the code produced.
Two compilers were considered, F40 and FORTRAN V (FOR). The latter had an optimising option.
F40 produced the code listing at the end of the source, giving corresponding FORTRAN source statement number. Generated labels bore some resemblance to the FORTRAN equivalents.
FOR produced code between the FORTRAN statement listings if unoptimised, or at the end if optimised. However, it was usually the case that the next FORTRAN statement was listed before the whole of the code for the previous statement had been produced.
Here again there were two compilers. FOR produced inline listings, while FTN listed the code after the source, together with corresponding statement number. Labels were related to the FORTRAN equivalents.
The Burroughs compiler produces listings in line when unoptimised, and after the source if optimised. The optimised listing does attempt to relate to statements, but the reordering is so drastic that it is very difficult to discover which orders have been produced by which statement.
The CDC compiler with OPT=0 produces code at the end of the listing, interspersed with FORTRAN statement numbers. FORTRAN labels are also retained. The code with OPT=2 is much more sorted, since registers are used when possible, and FORTRAN statement numbers appear occasionally in the listing, giving some help in finding how the code has been generated.
The mixed order lengths (15 and 30 bit) mean that occasionally space is wasted at the end of a word.
4.3 Code Production
4.3.1 Routine IF
This artificial routine is designed to determine what code is generated for IF statements. In fact, as a routine, it achieves nothing but the Burroughs optimiser was the only compiler to recognise the fact!
- This produced reasonably good code for this routine. Because of the lack of a full set of jump instructions, there was a tendency to favour a less than branch. Some notice was taken of the existence of successors, but the gain in general was only 1 order in 4.
- DEC possess a full range of jump instructions. In general, the FOR compiler did slightly better than F40 by using SKIPS instead of index loads and jumps. F40 code favoured zero branch, whereas FOR favoured a successor.
- The code produced by both optimised and non-optimised FTN was the same. The compiler favours zero branch, and takes only slight advantage of successors. The FOR compiler, however, does take much more advantage of successors. It is interesting to note that the preferred order of jumps is reversed in the two compilers.
- The Burroughs computer is a stack machine, and so produces rather more orders. However, it tends to use less bits than the other compilers since the orders are, in general, byte orders.
- The unoptimised version is able to take virtually no advantage of successors. The code increases considerably if the three branches are different, owing to the need to reload the stack for further tests.
- The optimiser correctly recognised that the routine was a null routine, and threw away almost all of the code. It was the only one to do so.
- The unoptimised compiler does not look beyond a statement boundary. It is able to take some notice of successors, and prefers equality - non-equality tests. It recognised that the DO loop would not require any repeat, and did not generate any loop code.
- The optimiser recognised that L was not used, and did not set it. By using registers it was able to save orders when successors were being used, since the labels were not needed otherwise. However, it could only produce the same code for the second set of IF's, having to reload the registers at each statement.
4.3.2 Routine GAMMA
This is a rather moderately written program to calculate Gamma functions. It does allow machines with a full order set to take some advantage, particularly on constant loading and store incrementing.
- This code suffered from the need to use a second word to access each array, and the problem of floating point comparison (typically double the number of orders). Setting up code was taken out of the double DO loop, but this only made the loop code of similar length to the other compilers.
- All compilers produced similar code. The F40 took no advantage of previously loaded registers across statements, nor was any DO optimisation carried out. FOR did take advantage of previously loaded registers. The optimiser removed two orders from within the DO, but overall produced more code.
- Both compilers required a sizeable amount of code at the head and tail of the routine. The order code is such that the code within the DO loop cannot be optimised. However, FTN was able to cut down on the amount of DO control required. FTN also produced better code for the logical IF statements. FOR gave a diagnostic warning about attempts at REAL equality comparisons.
- No optimisation was attempted, even across the DO loop. The long constants were stored in total, even though the precision was too great, and a rounding operation included in the code. The optimiser halved the code; in particular, the code in the DO loop was reduced from 27 to 14 orders.
- The unoptimised version did not take any account of items in registers across statement boundaries. The code produced was slightly longer than that generated by the other systems. The optimiser was able to keep variables in registers, and, noting that there were loops associated with labels 1 and 2, took code out of the loop, just as if it were a DO loop. Code was reordered to avoid adjacent reference to the same register. Registers were used in the nested DO loop at the end as well.
4.3.3 Program BESSEL
This routine computes Bessel functions. It uses single subscripting extensively, with some constant subscripts and private arrays.
- ICL: Orders were taken out of DO loops where possible, but no use was made of the MOVE instruction. Routines were required to handle cube powers, AINT and ABS. Again, although considerable use was made of common accumulators, the code was considerably lengthened by the need for two words for each array access.
- DEC: The FOR optimiser produced the same number of orders as the unoptimised version, but the orders were slightly different. F40 used one fewer order to compile a statement in a number of places, dropping a MOVN instruction; the significance of this is not clear. F40 required a routine for ABS, but FOR has slightly better IF statements. Both require a routine for AINT.
- Univac: No extra routines were required by either compiler. FTN was able to avoid a loop when copying one array to another, recognised and removed a redundant store operation on GEE, and managed to cut down the other major DO loop considerably. Both compilers tended to do worse than other manufacturers on the IF statements. Also, on some statements where functions were called, FOR produced shorter code. Extra orders were also needed by FTN at statement 112.
- Burroughs: Again, the optimiser produced considerable reductions in code, this time by one third. In the ordinary compiler, negative constants were stored positive and negated by an order. In general, fewer bits were required per statement, although it performed slightly worse on IF statements.
4.4 Comparison Tables
These tables rank the compilers, smallest first, by the total numbers of words, instructions and bits generated. Each row gives three independent rankings, in that order.
|Compiler||Words||Compiler||Instructions||Compiler||Bits|
|Burroughs Opt||136||DEC FOR||246||Burroughs Opt||6528|
|Burroughs Unopt||193||DEC F40||250||DEC FOR||8856|
|DEC FOR||246||Univac FTN||267||DEC F40||9000|
|DEC F40||250||Univac FOR||291||Burroughs Unopt||9264|
|Univac FTN||267||ICL XFEW||389||ICL XFEW||9336|
|Univac FOR||291||Burroughs Opt||575||Univac FTN||9612|
|ICL XFEW||389||Burroughs Unopt||616||Univac FOR||10476|
|Compiler||Words||Compiler||Instructions||Compiler||Bits|
|Burroughs Opt||49||DEC FOR Unopt||88||Burroughs Opt||2352|
|CDC Opt||75||DEC FOR Opt||90||ICL XFEW||3168|
|DEC FOR Unopt||88||DEC F40||96||DEC FOR Unopt||3168|
|CDC Unopt||90||Univac FTN||99||DEC FOR Opt||3240|
|DEC FOR Opt||90||CDC Opt||104||DEC F40||3456|
|DEC F40||96||Univac FOR||110||Univac FTN||3564|
|Univac FTN||99||CDC Unopt||128||Univac FOR||3960|
|Burroughs Unopt||100||ICL XFEW||132||CDC Opt||4500|
|Univac FOR||110||Burroughs Opt||239||Burroughs Unopt||4800|
|ICL XFEW||132||Burroughs Unopt||239||CDC Unopt||5400|
|Compiler||Words||Compiler||Instructions||Compiler||Bits|
|Burroughs Opt||5||Burroughs Opt||24||Burroughs Opt||240|
|CDC Opt||46||DEC FOR||77||ICL XFEW||2040|
|Burroughs Unopt||52||Univac FTN Opt||83||Burroughs Unopt||2496|
|CDC Unopt||53||ICL XFEW||85||CDC Opt||2760|
|DEC FOR||77||DEC F40||86||DEC FOR||2772|
|Univac FTN Opt||83||CDC Opt||87||Univac FTN Opt||2988|
|ICL XFEW||85||Univac FOR||91||DEC F40||3096|
|DEC F40||86||Univac FTN Unopt||98||CDC Unopt||3180|
|Univac FOR||91||CDC Unopt||104||Univac FOR||3276|
|Univac FTN Unopt||98||Burroughs Unopt||170||Univac FTN Unopt||3528|
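The bit counts in these tables appear to be the word counts multiplied by each machine's word length. The sketch below, in Python, recomputes the bit ranking of the first table from its word counts. The word lengths (24, 36, 36, 48 and 60 bits) are not stated in the text; they are the usual figures for these machine families and are an assumption here, as is the helper name `bits`.

```python
# Word lengths in bits. Assumed values: the tables do not state them.
WORD_BITS = {"ICL": 24, "DEC": 36, "Univac": 36, "Burroughs": 48, "CDC": 60}


def bits(compiler, words):
    """Return the size in bits of `words` words of code from `compiler`.

    The manufacturer is taken to be the first word of the compiler label,
    e.g. "Burroughs Opt" -> "Burroughs".
    """
    maker = compiler.split()[0]
    return words * WORD_BITS[maker]


# (compiler, words) pairs copied from the first table above.
first_table = [
    ("Burroughs Opt", 136),
    ("Burroughs Unopt", 193),
    ("DEC FOR", 246),
    ("DEC F40", 250),
    ("Univac FTN", 267),
    ("Univac FOR", 291),
    ("ICL XFEW", 389),
]

# Re-rank by bits rather than by words.
ranked = sorted(first_table, key=lambda row: bits(*row))
for compiler, words in ranked:
    print(f"{compiler:16s} {words:4d} words {bits(compiler, words):6d} bits")
```

Under these assumptions the computed order and values agree with the bits column of the first table, which shows how a long word length moves a compiler down the bits ranking even when its word count is low.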
In general, the compilers all seem to produce reasonable code. ICL are hindered by the poor addressing, and by the shortage of jump orders. DEC do not make use of their full set of orders. The optimiser got very little extra out of these routines. Univac optimisation is good, but again they are hampered by the shortage of jump orders. Their order code did, however, allow the inline production of code for some of the intrinsic functions. Burroughs was the only compiler to produce completely unoptimised code as standard, but even so tended to use about the same number of bits as the other compilers. The optimiser made great savings by reordering the stack and preloading variables to appear at the top at the right time. CDC produced an almost unoptimised version as standard, and also seemed to make good savings by recognising loop structures whenever they occurred and using registers therein. There was also an attempt to maximise instruction overlap.
5. DEBUG AIDS
This is a brief description of the programs available to assist in debugging a running program.
On the ICL 1906A there are no special debug aids available.
The DEC FORDDT system allows interactive debugging of a FORTRAN program. The user can:
- change contents of a variable
- set up 10 "pauses"
- restart after a pause, and display all pauses
- type contents of variable, and display all symbols
- trace subroutine calls.
In addition, DEC allow debug insert lines, signalled by D in column 1, which will be compiled only if the correct compiler option is set.
The Univac FORTRAN checkout mode can be selected by the C and Z options. This allows the user to trace execution, halt and dump variables. Checkout can be entered after a program has started execution. No optimisation is possible. A maximum of 8 break points is allowed.
On the Burroughs B6700, various DEBUG statements can be inserted into the source; they are compiled only if the correct compiler option is set. The special statements provide the following:
- Output information on changes in value of specified variables.
- Dump information at a specified label, subject to frequency instructions.
- PROGRAM DUMP: dump a snapshot of the whole program.
- Timing study of execution.
- Sequence of subroutine calls, labels.
The CDC compiler has subroutines to dump memory in octal, real or integer form, and to trace subroutine calls.
There is also a debug option to check array bounds, assigned GOTO, subroutine calls, function values, program flow and values stored. No optimisation is possible. Debug statements are signalled by C$ in the code. Variables can be listed when they become greater than, less than or equal to a constant.
On the ICL 1906A, loaders are called consolidators. They link in library routines and produce binary modules either in core or saved on file.
Libraries can be stored in one of three formats: direct compiler output, linked list or compact. The last two are generated by separate utilities; the linked list is the only updateable version, but is slow to scan, while the compacted version is normally obtained from this linked list. Nottingham University have produced a separate library builder.
Libraries are scanned in order. Routine order must be arranged correctly by the user. However, the Nottingham system does provide an automatic ordering scheme.
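To illustrate why routine order matters, assume a single forward scan that loads a routine only if a reference to it is already outstanding; each caller must then precede the routines it calls. The Python sketch below derives such an order by topological sorting. It is not a description of the Nottingham scheme, whose method is not given here, and the routine names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def library_order(calls):
    """Order routines so that every caller precedes the routines it calls.

    `calls` maps each routine name to the set of routines it references.
    """
    # TopologicalSorter expects {node: predecessors}; here a callee's
    # predecessors are its callers, so callers are emitted first.
    callers = {name: set() for name in calls}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    return list(TopologicalSorter(callers).static_order())


# Hypothetical library: SOLVE calls FACTOR and BACKSUB, FACTOR calls PIVOT.
calls = {
    "PIVOT": set(),
    "BACKSUB": set(),
    "FACTOR": {"PIVOT"},
    "SOLVE": {"FACTOR", "BACKSUB"},
}
order = library_order(calls)
print(order)
```

Any order returned places SOLVE before FACTOR and BACKSUB, and FACTOR before PIVOT. A cycle of references would raise `graphlib.CycleError`, since no single-pass order exists in that case.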
Overlays are only available under the non-paged consolidator. They are defined as a level structure, with many available levels. Overlays must be called in by the routines themselves, but FORTRAN generates the necessary calls by compiler switch.
The consolidator provides a map of the symbols, together with their location. In the non-paged version, the overlay level and segment numbers are given, but there is no ordering of the symbols. No cross referencing or overlay structure maps are provided.
The DEC loader is called the Link Loader. It can be run either automatically by LOAD, EXECUTE or DEBUG, or by a specific call. In the latter case, more facilities are available.
The automatic mode is driven by file extension. It is possible to request versions of the loader from different directories.
Two versions are available: LINK-10 and LOADER. The first has more features.
Libraries are scanned in the order specified. A utility, FUDGE2, allows for their creation and updating. In direct mode, very extensive facilities are provided, some of which are interactive. It is possible to specify different libraries at different points in the load process. Symbols may also be defined at a terminal. The user can set his own defaults, and local symbols can be saved for use by DDT later.
An overlay tree structure is provided. The whole path to a node must be in core for that node to be accessible. Routines are loaded automatically when referenced.
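The rule that the whole path to a node must be in core fixes the core requirement of each node as the sum of the sizes along its path from the root. The Python sketch below works this out for a small tree; the node names and sizes are invented for illustration and do not come from any of the benchmark programs.

```python
def path_to_root(parent, node):
    """Return the overlay nodes from the root down to `node`."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def core_needed(parent, size, node):
    """Core required for `node`: every node on its path must be resident."""
    return sum(size[n] for n in path_to_root(parent, node))


def peak_core(parent, size):
    """Largest core requirement over all nodes in the tree."""
    return max(core_needed(parent, size, n) for n in parent)


# Hypothetical overlay tree: each node maps to its parent (None for the root).
parent = {"ROOT": None, "INPUT": "ROOT", "SOLVE": "ROOT",
          "FACTOR": "SOLVE", "PRINT": "ROOT"}
# Hypothetical sizes in words.
size = {"ROOT": 4000, "INPUT": 1500, "SOLVE": 2000,
        "FACTOR": 2500, "PRINT": 1000}

print(path_to_root(parent, "FACTOR"))
print(core_needed(parent, size, "FACTOR"), peak_core(parent, size))
```

In this example the deepest path (ROOT, SOLVE, FACTOR) needs 8500 words, against 11000 words if every routine were resident at once.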
A utility program will give the order of calling sequences among components, and this can be used to set up the structure. The dynamic structure must be obtained by analysis (the log file will give some help here).
The overlays are held in two files - one containing the root, and the other the rest of the structure. This second file must be contiguous.
A comprehensive set of output is available to the user. At any point in the loading, a list of undefined symbols can be obtained. A symbol file can be generated and sorted. It is possible to get a graphic representation of the overlay structure.
6.2.4 Other Details
COMMON blocks must be loaded explicitly, with the largest block first.
There is a test mode for entering debug.
Core is divided into low and high. Each is saved in a separate file.
Sharable code can be generated.
On Univac, this operation is called Collection. The Collector sets up the bank structure, and also prepares files.
Library files are scanned in order. Routines are grouped as elements in these files. It is possible to exclude elements, and to set undefined symbols by parameter.
A PREP parameter will obtain a symbol table for a file, which may then be used as a library. Compiler output is sent to an element, so files may be extended. I cannot find details of the file ordering procedure.
Overlaying is done by defining segments. The start address of a segment can be determined in a number of ways, including the highest of a set of routines. Segments can be loaded implicitly or explicitly. Dynamic segments are not included in the program's initial core requirements.
However, there is also a system of banks, independent of segments. Instructions and data are separated into I-banks and D-banks, and there can be multiple banks. Segments can lie within a bank, or cross bank boundaries.
There are multiple location counters, which can be assigned to different banks.
Two listing levels are available. The fullest provides details of the symbol table, external references of each element and a scale drawing of the program segmentation and bank structure. It is possible to provide a diagnostic for addresses above 65K, or to allow them.
6.3.4 Other Details
Maximum I-bank size is 65K.
Re-entrant code can be produced.
Output can be absolute or relocatable.
There are a number of debug aids available, including a flow analysis program for frequency and timing (FLAP), and a debug package, SNOOPY, with breakpoints but no symbols. A SNAP directive allows snapshots.
The FORM directive allows easy duplication of a previously specified program structure.
The Burroughs loader is the System Binder. Normally, routines are compiled into segments which will automatically be brought down when necessary by the operating system.
Library routines need to be at the correct lexicographic level for calls to be correctly satisfied. This is no problem in FORTRAN, where all routines are at the same level. A main program (host) is bound together with its subroutines. New subroutines and replacements can be bound with the previous collection, so a separate library facility is not used. However, system libraries can be built using the Intrinsic Binding feature, which does not need a host. Whole files or specific routines may be bound, or files scanned to satisfy references. The LIBRARY option on a compiler allows more than one subroutine per segment. References can be renamed. Intrinsic Binding is not available to FORTRAN routines.
No overlay system is provided. Since the whole concept of the machine is to use separate segments, there is an automatic structure imposed, which requires only active segments to be present.
Listing output options include a list of input cards, identifiers, segment dictionary changes (when binding new routines to an already bound program), and address couples of the identifiers. A hexadecimal dump of the object code is available, and error messages can be sent to a separate file for CANDE. There is even a TRACE option to debug the Binder itself!
6.4.4 Other details
Warning messages can be issued if the possibility of an error exists.
It is better to bind new routines to a bound set, rather than rebind the lot again.
The CDC loader takes its commands from the JCL stream, and generates temporary files with reserved file names. The loader can also be called in by a program, which sets up its own request table.
Libraries can be generated and maintained by NOS commands. Libraries can be defined as global (to the whole process) or local. Libraries are searched circularly, so there is no internal ordering required. It is possible to force the loading of a library item, and to handle multiple libraries. Names can be altered at load time.
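The effect of circular searching can be seen by comparing it with a single front-to-back scan. The Python sketch below is only an illustration of the idea and makes no claim about how the loader is actually implemented; the function `load` and the routine names are hypothetical.

```python
def load(library, wanted, circular):
    """Scan `library` for the routines in `wanted` and whatever they call.

    `library` is a list of (name, references) pairs in file order.
    With circular=False the library is scanned once, front to back.
    With circular=True the scan repeats until a full pass loads nothing new.
    Returns (loaded, unsatisfied).
    """
    loaded = []
    outstanding = set(wanted)
    while True:
        progress = False
        for name, refs in library:
            if name in outstanding:
                loaded.append(name)
                outstanding.discard(name)
                # New references become outstanding unless already loaded.
                outstanding.update(r for r in refs if r not in loaded)
                progress = True
        if not circular or not progress:
            return loaded, outstanding


# Hypothetical library stored in the worst order: callees before callers.
library = [("PIVOT", []), ("FACTOR", ["PIVOT"]), ("SOLVE", ["FACTOR"])]

print(load(library, {"SOLVE"}, circular=False))
print(load(library, {"SOLVE"}, circular=True))
```

With this library the single scan loads only SOLVE and leaves FACTOR unsatisfied, whereas the circular search loads SOLVE, FACTOR and PIVOT at the cost of extra passes.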
There are three levels into which routines can be overlaid. One level is the base, the second level can have 64 sections, and the third level allows 64 sections for each second-level section. Overlays must be called explicitly from the program.
A more general tree-structured segmented system may become available.
Load maps can be sent to file. Cross reference listings are available, together with statistics and entry point maps.
6.5.4 Other Details
There is a TRAP option to allow a certain amount of debugging. This includes snapshots, and values of locations within a given range.
The ranking of the various FORTRAN compilers is difficult since it depends on so many conflicting features. This is also true of the loading facilities, since they are all organised differently.
The following is an attempt at a (very personal) ranking (omitting ICL). The double figures in the clarity of message row refer to different compilers.
|Number of errors found||4 1||3||5||1|
|Clarity of error messages||5 3||5 1||2||4|
|Number of words compiled||2||3||1||3|
|Number of bits compiled||4||2||1||2|
|Number of instructions compiled||3||1||4||1|
|Overall Order (equal weight)||4||1||2||3|