Chilton::ACL::TREE-META Manual

7. SYMBOL TABLE

7.1 Syntax

symbolrule ::= identifier := outexpression

7.2 Symbol Table Entry

Strings, identifiers and numbers in a source program are normally recognised by the basic types in Syntax Rules and stored as leaves in the tree currently being built. The generated code may use these strings by calling the relevant output command as the tree is being processed. In most computer languages, identifiers and numbers tend to have properties associated with them which influence the form of the code generated. These properties are sometimes defined explicitly while others may have implicit declarations. For example, an Algol program requires every identifier to have its type declared explicitly. This declaration also defines the scope of the identifier - that is, the part of the program in which the identifier may be used. In order to generate code which depends on information supplied by earlier statements, there is a need to provide a Symbol Table which can be used to store strings, together with certain properties associated with them. Routines must be provided to access this table and interrogate it.

The TREE-META Symbol Table provides the user with storage for up to 400 strings. These strings will usually be identifiers recognised in the source program. However, there is no reason why numbers or strings appearing in a tree should not also be added to the Symbol Table. Each Symbol Table entry has associated with it three values which have names TYPE, LEVEL and VALUE. Whereas TYPE and VALUE may be used for any purpose, LEVEL is primarily used to differentiate between the same name declared at different block levels in an Algol-like language. Thus, the name, together with the contents of LEVEL, uniquely define a Symbol Table entry. The properties of these three values are as follows:

TYPE: an integer in the range 0 to 2047. TYPE is normally used to differentiate between identifiers of different types. For example, in Algol, integer scalars might be entered into the Symbol Table with TYPE set to 1, real scalars with TYPE set to 2 and so on.
LEVEL: an integer in the range 0 to 2047. LEVEL is mainly used to differentiate between identifiers having the same name but appearing at different block levels. It is possible to have a number of Symbol Table entries with the same name but different LEVEL values. Normally, the outer block level is associated with a LEVEL value of 0. The next inner block level has associated with it a LEVEL value of 1 and so on.
VALUE: an integer in the range 0 to 2²³ It may be used to associate any value with the Symbol Table name.

The three global variables TYPE, LEVEL and VALUE need to have the necessary values associated with them before a Symbol Table entry is made. This is done using the standard arithmetic statements defined in Chapter 6. For example:

< TYPE<-1 ; LEVEL<-0 ; VALUE<-3 >

is a possible arithmetic list appearing in a Code Rule. Similarly, routines which look up entries in the Symbol Table, will often reset the value of these variables. The new values can be checked using the standard comparison tests allowed in arithmetic lists. For example:

< TYPE=1 >  ....... /  .........

might appear in a Code Rule to define different actions depending on the TYPE value for a Symbol Table entry.

7.3 Symbol Table Routines

A number of routines are defined for manipulating Symbol Table entries. They may appear in arithmetic lists in the same way as the routines defined in Section 6.5. The routines are:

ENTER[nodename]

this routine is used either to define a new Symbol Table entry or to change the value of TYPE or VALUE in an existing one. The nodename defines a branch in the tree which is a string. This string is the name of the entry under consideration. This, together with the current setting of LEVEL, uniquely defines the entry. The Symbol Table is scanned to see if an entry exists with this name and LEVEL value. If no entry is found, then this new entry is added to the Symbol Table. The current values of TYPE and VALUE are associated with this new entry. If the entry already exists, the current values of TYPE and VALUE replace the previous contents for this entry. For example, the rule:

DEC[-,-]=> < TYPE<-1 ; VALUE<-3 ; LEVEL<-2 ; ENTER[*2] >

might be called with a pointer to the tree:

Fig. 7.1

This would define a Symbol Table entry with name ABC and LEVEL set to 2. Associated with the entry would be the values, TYPE=1 and VALUE=3.

The full complexity of nodename forms may be used. Thus ENTER[*1:*2] is allowed.

LOOK[nodename]

this function is used for looking up an entry in the Symbol Table and returning the values associated with the entry to the variables TYPE, LEVEL and VALUE. The nodename defines a branch in the tree which is a string. This is the name to be looked up in the Symbol Table. If an entry with this name is found, the function returns a true value, otherwise it returns the value false.

If more than one entry with this name exists in the Symbol Table, the one with the largest LEVEL value is taken. The variables TYPE, LEVEL and VALUE are set to the corresponding entries for this Symbol Table entry. For example:

ADR[-] => < LOOK[*1] > < OUT [VALUE] > / 'ERROR' ;

If this rule is called with the node ADR having the string ABC as the only branch, the name ABC will be looked up in the Symbol Table and, if it exists, the corresponding VALUE will be printed. If an entry with name ABC does not exist, the message ERROR is printed. The function LOOK should always be the last item in an arithmetic list, otherwise the Boolean result will have no effect.

CLEAR[expression

this function removes entries from the Symbol Table. The result of the function is the number of entries removed. Those entries, with a LEVEL value greater or equal to the value of the expression given as argument, will be removed. For example:

END[] => < T<-CLEAR[0] ; OUT[T] > ;

will remove all the entries from the Symbol Table, set T to the number of entries removed and print its value. In a compiler for a block-structured language, the entries for a particular block LEVEL value will be removed when the end of the block is reached. This ensures that only those variables which still have the correct scope can be accessed. By always choosing the entry with the highest LEVEL found on look-up, the correct variable is chosen if several nested blocks have declarations involving variables with the same name.

7.4 Scanning the Symbol Table

A special output rule, the Symbol Rule, allows the same operation to be performed on every entry in the Symbol Table. An example is:

SC :=  'DEFINE ' *1 ' EQU ' < OUT[VALUE] > % ;

The action of the rule is as follows:

Select an entry from the Symbol Table.
Set the variables TYPE, LEVEL and VALUE to the corresponding values in the Symbol Table entry.
Call the Symbol Rule as though it were a Code Rule with the name of the entry as the tree being considered.
Repeat for all entries in the Symbol Table.

Thus, *1 on the right hand side of the rule will output the name of the Symbol Table entry being considered. No other nodename is allowed in a Symbol Rule. This is the only limitation. There is no reason why, for example, the Symbol Rule should not call other Code Rules. However, it must not call another Symbol Rule.

A Symbol Rule is executed by calling it from a Code Rule as though it were a Code Rule with no arguments. For example:

ENDP[] => SC[] 'END' % ;

would call the Symbol Rule defined above. The result would be a set of lines of the form:

DEFINE FRED EQU 27

where FRED is the name of a Symbol Table entry and the number 27 is the VALUE entry for FRED.