Jump Over Left Menu
4. CODE RULES
coderule ::= identifier outrulelist ; | simplecoderule ; outrulelist ::= outrule | outrulelist outrule outrule ::= [ nodetest? ] => outexpression nodetest ::= item | nodetest , item item ::= - | identifier [ nodetest? ] | string | nodename | label | basictype basictype ::= .NUM | .ID | .OCT | .HEX | .SR | .CHR | .DIG | .LET outexpression ::= outalternative | outexpression / outalternative outalternative ::= outitem | outalternative outitem outitem ::= outputtext | nodename | identifier [ arglist? ] | arithmetic | (outexpression) | .EMPTY | label outputtext ::= % | string | !string | @ integer nodename ::= * integer | nodename : * integer arglist ::= argmnt | arglist , argmnt argmnt ::=nodename |label |string simplecoderule ::= identifier / => simpleoutexpression | identifier / => .EMPTY simpleoutexpression ::= outputtext | simpleoutexpression outputtext label ::= #1 | #2 | #3 | #4
4.2 Code Generation
Code generation is initiated when an asterisk is encountered in a Syntax Rule that is being obeyed. At this stage, a complete tree should have been produced and the stack will contain a pointer to the top node of the tree. Initially, the Code Rule which has the same name as the top node of the tree will be called. In the example given in Section 3.8, the top node of the tree will have the name STORE. Therefore, a Code Rule with this name must have been defined. A typical rule might be:
STORE [ -,.ID] => 'LOAD ' *2 ' STORE ' *1 [-,-] => *2 ' STORE ' *1 ;
A Code Rule consists of two main parts. The first gives the name of the rule (STORE in the example) and the second gives a number of possible nodes of this form and how they are to be analysed (outrules). The final outrule in the set is followed by a semicolon. Outrules can be defined which deal with specific subsets of the nodes with a particular name. These allow different code to be generated, depending on the form of the tree. In its simplest form, a Code Rule consists of a single outrule. For example:
STORE[-,-] => *2 ' STORE ' *1 ;
Each outrule consists of two parts. The first part is the nodetest, enclosed in square brackets, which defines the type of node that it can handle. The second part, following the => symbol, defines the action to be taken for this type of node when it is encountered. The example above contains two of the most frequently-used output items. An item consisting of the symbol * followed by an integer, I, will cause the Code Rule whose name is the same as the Ith branch of the current node to be called. If the Ith branch is a leaf of the tree which is a string, this string will be output. The appearance of a set of characters inside string quotes in a Code Rule will cause the string to be output when this item is obeyed. In the example above, the second branch will be called. On return, the characters STORE will be output and then the first branch will be called. In the case of the STORE node of Section 3.8, the first branch must be an identifier. The characters of the identifier are therefore output.
When a Code Rule is called, the nodetests of the individual outrules are compared, in turn, with the tree node being considered. The first outrule whose nodetest matches the current node will be executed. If none of the outrules correspond to the particular tree node, no outrule is executed and the value false is returned to the calling rule.
As the right hand side of an outrule is being obeyed, it is likely that other Code Rules will be called whose names appear as tree nodes further down the tree. These, in turn, will call other Code Rules so that the initial examination of the top tree node will cause the whole tree to be processed.
Once the execution of the initial Code Rule called is complete, control will return to the Syntax Rule immediately after the asterisk which initiated the code generation. Further input will be analysed until another asterisk is encountered. There is, therefore, a continual switching between syntax analysis and code generation.
4.3 Node Names
In Code Rules there is often a need to specify a node or leaf of the tree being decoded. The Code Rule is entered with a pointer to the part of the tree to be analysed. The node pointed at will be called the current node. This can be either the top node of a sub-tree or a single leaf. If it is a node, it will have a number of branches (or possibly none) which point to other nodes or leaves. A nodename will specify a particular node associated with the current node. In its simplest form, a nodename consists of an asterisk followed by an integer which indicates the particular branch of the current node to look at. For example, *2 specifies the node at the end of the second branch (reading from left to right) of the current node. Subnodes of this node can be specified in a similar manner. The original nodename is followed by the symbol : and then the branch to be taken from this node. For example:
refers to the node attached to the second branch of the node which is attached to the first branch of the current node. Nodenames of this form can continue to the complete depth of the tree. To decode a nodename, it should be read from left to right with an imaginary pointer at the current node. As each branch is specified, the pointer should be moved to the corresponding node attached to this branch. The- node specified is the one at which the pointer finishes. Consider the tree:
Fig. 4.1: TREE-META tree
The nodename *2:*3 refers to the third branch of the MULT node which is the ADD node. The nodename *2:*3:*1 refers to the leaf G.
4.4 Node Tests
Each nodetest consists of a set of items separated by commas. The number of items indicates the number of branches expected for the node being considered. The simplest item possible is the symbol, -, which indicates that the branch may point to a node of any form. For example:
STORE[-,-] => .....
is a Code Rule which can deal with any STORE node having two branches attached to it. Similarly:
will deal with a STORE node not having any branches attached to it.
The positions in the item list refer to the branches in order, from left to right. The complete set of items allowed is:
- - : indicates that the node attached to this branch may be of any type. It can be a node or a leaf.
identifier [nodetest?] : this defines the precise form of the node attached
to this branch. The identifier is the name of the node and the nodetest specifies the form
of the branches attached to it, just as the complete nodetest specifies the type of node
that the Code Rule can deal with. For example:
STORE[-,MULT[-,-],-] => ....is a Code Rule which will deal with STORE nodes having three branches. The middle branch must point to a MULT node having two branches attached to it. This specification can continue to any depth. Fer example:
STORE[-,MULT[.ID,ADD[-,-]],-] =>The MULT node must now have an identifier at the end of the first branch and an ADD node with two branches at the end of the second branch.
string : indicates that the node attached to this branch must be the specified
string. The string will have been added to the tree after recognising one of the basic
types .SR,.NUM,.OCT,.HEX or .ID or by specifying the particular string in the Syntax Rule.
S = .'+' .ID '*' .NUM '-' .SR :ST ;would recognise an input line: +ABC*27-'DEF'. The tree generated could be decoded by a rule:
ST['+','ABC','27','DEF' ] => ....Note that the string recognised by .SR is stored without the quote symbols themselves. If any identifier other than ABC had appeared in the input line, this rule would not be appropriate.
nodename : it is possible to test for the case of two branches from a node
pointing to the same string. This item specifies the position in the tree of the string which
should be identical to the one at this position. The full complexity of nodenames is allowed.
STORE[-,*l] => ....will deal with STORE nodes having two branches pointing to the same string. Similarly:
STORE[*3,-,-] => ....will deal with STORE nodes having three branches with the first and third branches pointing to the same string. A more complex rule is:
STORE[*2:*2:*1,-] => ....which would deal with STORE nodes similar to:
Fig. 4.2: TREE-META tree
- label : see Chapter 5
basictype : indicates that the specified branch has a leaf and not a node
attached to it. The leaf should be the specific basictype indicated. For the
basictypes .ID, .NUM, .OCT, .HEX and .SR, there must be complete agreement with
the particular basictype recognised when the tree was generated. For example:
ST = .OCT :SP ;would generate a node, with name SP, which could not be decoded by:
SP[.NUM] => ....There is, however, some relaxation for the classes .DIG and .LET, which are considered as subsets of the class .CHR. For example:
ST = .DIG .CHR :SP ;could have the SP node decoded by:
SP[ .CHR, .DIG] =>as long as the character recognised by .CHR was a digit. It is sensible to match completely the basic type that was used at recognition with the one in the nodetest.
4.5 Output Expressions
If an outrule is defined with a nodetest which matches the tree to be decoded, the output expression on the right-hand side of the rule is executed. This may consist of a set of alternatives separated by the symbol /. Each alternative will consist of a set of items which may or may not return the value true or false. The first item of each alternative should be capable of returning a truth value. If the first item of the first alternative returns a value true, then the rest of this alternative is executed. If a false value is returned, the next alternative is tried. This process is repeated until a first item returning a true value is found and this alternative is executed. If no first items return a true value, no alternatives are applicable and the whole Code Rule returns a false value. For example:
STORE[-,-] => *1 'FIRST' / *2 'SECOND' ; ALPHA[-] => 'ALPHA' ; BETA [-] => 'BETA' ;
would decode the tree:
Fig. 4.3: TREE-META tree
The STORE node does have two branches and so the STORE rule will be executed. The first item, *1, will call the Code Rule ALPHA. However, the rule defined deals only with ALPHA nodes having a single branch. . This item will therefore return a value false and the second alternative will be tried. This has the correct form of BETA node. The code produced will be:
Those output items in the class output text and .EMPTY always return a true result while other items may return a true or false value. Consequently, neither .EMPTY or outputtext items should appear as the first item of any alternative other than the last one.
In its simplest form, the output expression consists of a single alternative.
4.6 Output Text
The simplest form of item appearing in an alternative of an output expression is one which outputs text. The possible items of this type are:
String : the string appearing in the output expression is output without the string quotes.
STORE[-,-] => 'ABC' 'DEF' ;would output the text ABCDEF if this rule was executed.
Newline : the symhol % appearing in an output expression will cause a
newline symbol to be output. For example:
STORE[-,-] => 'ABC' % 'DEF' %would cause the strings ABC and DEF to be output on separate lines.
Character : the symbol @ followed by an integer will output the character
with this 1900 internal code value (see Appendix 1).
STORE[-,-] = 'ABC' @10 'DEF' @26 %would output:
ABC:DEF*followed by newline.
- Leaf : a leaf of the tree can be output by specifying the appropriate nodename (see Section 4.7).
4.7 Decoding Nodes
The main method of calling Code Rules from within a rule, is to specify a node which is attached to some branch of the current node being decoded. The node is specified by a nodename (see Section 4.3). For example, *2, appearing in an output expression, would cause the current Code Rule to be halted and the Code Rule associated with the node attached to the second branch of the current node, to be called. If the branch specified points to a leaf of the tree rather than a node, the string or character defined at the leaf will be output. Once the Code Rule has been executed, control will return to the first Code Rule immediately after the point where it was halted. For example:
ST[-,-] => *1 *2 ;
might be called with the tree:
Fig. 4.4: TREE-META tree
The output expression *1 will cause the Code Rule ALPHA to be obeyed. On return, the second branch will be obeyed. As this consists of the string XYZ, it will be output.
4.8 Null Item
An output expression can contain the item .EMPTY. The only result of this item is to set the truth value to true. It can be used to define an alternative without items. For example:
ST[-,-] => *1 *2 / .EMPTY ;
If the result of calling the Code Rule associated with the first branch is false then no action takes place; otherwise the Code Rule associated with the second branch is executed.
The .EMPTY item can be used to give information about the form of a node (see Section 4.9).
4.9 Calling Code Rules Directly
It is possible to call Code Rules other than those associated with the nodes of the current tree by calling them directly. The item in the output expression consists of the name of the Code Rule followed by a set of arguments in square brackets. The effect is to generate a node with this name and which has branches specified by the arguments. The individual arguments must be either nodenames, strings or labels (see Chapter 5 for labels as arguments). For example:
CALLP[-,-] => OPNAME['ABC',*2:*1]
might be called with the tree:
Fig. 4.5: TREE-META tree
The call of OPNAME will generate the tree:
Fig. 4.6: Generated tree
before calling the Code Rule with this name. The new tree is generated by pointing at the old tree. There is, therefore, no great overhead with using this facility even when large trees are involved.
The main uses of this facility are:
Avoiding Repetition: if the same or similar output expressions appear in a
number of rules, a new rule can be defined containing these items. A call of this rule can then
replace the original items. Therefore, it acts as a subroutine facility. For example:
OUTSR[-,-] => *1 @23 *2 @23 ; OUTHEX[-] => @23 * 1 @23 ;could be replaced by:
OUTSR[-,-] => *1 OUTQ[*2] ; OUTHEX[-] => OUTQ[*1] ; OUTQ[-] => @23 *1 @23 ;
Setting Truth Values: it is often useful to have rules which just check the
form of the tree without producing any output. For example:
STR[.SR] => .EMPTY ; IDENT[.ID] => .EMPTY; NUMBER[.NUM] => .EMPTY [.HEX] => .EMPTY [.OCT] => .EMPTY;These three functions check whether an argument is a string, identifier or number and return the value true or false. They can be used to control the output generated. For example:
STOR[-] => STR[*1] 'STRING' / IDENT[*1] 'IDENTIFIER' / NUMBER[*1] 'NUMBER' / 'UNKNOWN ARGUMENT' ;This Code Rule outputs the appropriate message depending on the argument.
Scanning the Tree Twice: It is sometimes useful to scan a tree twice.
It may be necessary, for example, to collect certain information together about the tree
before generating the code. This can be done by calling additional Code Rules.
The following program shows how declarations for variables appearing ~n an arithmetic expression may be output before the code for the expression:
. META EXP EXP = . ID $ ( '+' ( .ID / .NUM ) :ADD ) ';' :EY * ; EY[-] => DEC[*1] *1 % ; DEC[ADD[-,-]] => DEC [*1:*1] DEC[*1:*2] [.NUM] => .EMPTY [.ID] => 'INTEGER ' *1 % ; ADD[-,-] => *1 ' PLUS ' *2 ; .END
This program accepts arithmetic expressions of the form:
The tree consists mainly of binary ADD nodes. The tree is first processed by the DEC rule which generates declarations for all the identifiers. The ADD rule then generates code for the tree. The complete output is:
INTEGER ABC INTEGER DEF ABC PLUS 27 PLUS 53 PLUS DEF
Another example of scanning the tree twice is given in Section 6.7.
One of the items that can appear in an output expression is a general output expression enclosed in brackets. This is scanned in much the same way as a complete right-hand side of a Code Rule. It can have a number of alternatives and may return a truth value. For example:
ABC[-,-] => *1 ( SIMP[*2] 'DEF' / 'GHI' ) % ;
Here, the expression in parentheses either outputs the string 'DEF' or 'GHI' depending on the result of calling the Code Rule SIMP. Just as brackets in arithmetic expressions save the user introducing additional variables, this construct saves the user adding extra rules. the above rule could have been written:
ABC[-,-] => *1 DUM[*1,*2] % ; DUM[-,-] => SIMP[*2] 'DEF' / 'GHI' ;
Bracketing may continue to any depth. The truth value of the expression is the same as that produced by the last item obeyed. For a set of alternatives, the result is true if one of the alternatives is executed, otherwise it is false.
4.11 Simple Code Rules
If the same action is required for all nodes of a particular type, a simpler and more efficient Code Rule is provided. The complete nodetest is replaced by the symbol / to indicate that any node with this name is handled. The right-hand side of a Simple Code Rule is restricted to outputting a string, a character or a newline, and the class .EMPTY. It cannot, therefore, access any information in the tree connected with this particular node, nor can it generate labels. It is mainly provided as a means of including dummy rules. For example:
OPT /=> 'ABC' @23 % ; OPT /=> .EMPTY ;