Sections 2 to 5 of this report describe four statistical programs written in FORTRAN for the IBM 7090, IBM 7030, and ICT-ATLAS. All four programs use a package of data input subroutines described in reference 1 and therefore have a number of features in common. Section 1 is an introductory section describing these common features and is therefore relevant to all four programs. Thus the reader interested in only one program should read section 1 and the section describing that program.
The programs to be described in this report are developments of programs written by members of the Statistics Section of the Atomic Energy Research Establishment, Harwell. The Multivariate Analysis program is at present written in S2 for the IBM-7030 and has been written by Mrs. M. C. B. Russell. The later debugging of this program has been continued, on a voluntary basis, by Mrs. Russell after leaving the Atomic Energy Authority. The author is particularly grateful to Mrs. Russell for this very valuable work. The Complete Factorial Experiment program was written originally for the IBM-7090 by Mrs. C. M. Whiteside and has since been re-written for the ICT-Atlas using new methods by T. Gover of the Atlas Computer Laboratory. The Regression-Within-Groups program has been written by the author for the IBM-7090 and the Diallel program has been written by the author in S2 for the IBM-7030. It is intended to have versions of all four programs available on the IBM-7090, the IBM-7030 and the ICT-Atlas. The data preparation will be the same on all three machines. Slight operational differences will exist, however, because of the different operating systems used on the machines.
The main difference concerns the use of a common working tape as temporary storage on the IBM-7090 and IBM-7030, which is not necessary on the ICT-Atlas. Brief details of these differences are given in section 1.5.
(1) DIP - A Package of Data Input Routines written in Package, by B. E. Cooper, Atlas Laboratory Report ACL/R 1 (in preparation).
(2) The Presentation of Experimental Data to Computer, by B. E. Cooper and Mrs. C. M. Whiteside, A.E.R.E. R4250.
(3) The Analysis of Variance of Diallel Tables, by B. I. Hayman, Biometrics Vol. 10, 1954.
(4) The Analysis of Continuous Variation in a Diallel Cross of Nicotiana rustica Varieties, by J. L. Jinks, Genetics, 39, 1954, p 767-788.
The programs to be described in this report make use of a package of data input subroutines which is described in detail in a separate report (reference 1). A similar package of subroutines has been described in reference 2. Detailed knowledge of the package is not necessary to an understanding of the use of these programs, although a quick reading of section 1 of reference 1 or reference 2 would be found helpful. The data presentations to these programs have a number of general features in common. These are described in this first section and the detailed structures of the data presentations are described in the remaining sections.
The information required by a program for one case normally consists of a title for identification purposes, information specifying the amount of data and the analyses required, and the data to be analysed.
We define a case as that collection of information which is self-contained and which could be presented to the program without reference to any other information or data. A case may consist of information for several variables and more than one analysis may be requested. A user of a program (a customer) normally wishes to present several cases to the computer at one time, and two or more customers may wish to use the program at the same time. These programs allow this to be done and the data deck presented to the computer may consist of several subdecks originating from different customers. Each subdeck may consist of any number of separate cases. It is clear that the cases presented must be adequately labelled so that the solutions can be identified with the original data. It is also clear that the separate subdecks must also be labelled so that the solutions can be returned to the correct customer. This is achieved by the use of three control cards which are defined and used as follows:
CUSTOMER B. E. COOPER ATLAS COMPUTER LABORATORY.
TITLE DATA FOR EXPERIMENT 14B PERFORMED 6/6/64.
Thus the data deck is divided into subdecks by CUSTOMER cards, each subdeck is divided into cases by TITLE cards and the complete deck is terminated by a FINISH card. The following example, which consists of three cases for one customer followed by two cases for a second customer, shows the structure of a data deck.
Example Data Deck 1.

    CUSTOMER B. E. COOPER ATLAS COMPUTER LABORATORY
    TITLE EXAMPLE CASE 1                |
    Cards for example case 1            |
    TITLE EXAMPLE CASE 2                |  3 cases for customer B. E. COOPER
    Cards for example case 2            |
    TITLE EXAMPLE CASE 3                |
    Cards for example case 3            |
    CUSTOMER A. N. OTHER XYZ UNIVERSITY
    TITLE EXPERIMENT 12 DATE 4/5/64     |
    Cards for experiment 12             |  2 cases for customer A. N. OTHER
    TITLE EXPERIMENT 14 DATE 6/5/64     |
    Cards for experiment 14             |
    FINISH                              |  End of data deck
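The deck structure can be illustrated with a short sketch in Python (not the FORTRAN of the programs themselves). The function name `split_deck` and the error messages are hypothetical; the sketch assumes only what is stated above, namely that CUSTOMER cards open subdecks, TITLE cards open cases, and a FINISH card ends the deck.

```python
def split_deck(cards):
    """Group cards into [(customer, [(title, [case cards])])]; stop at FINISH."""
    subdecks = []  # one entry per CUSTOMER card, in deck order
    for card in cards:
        words = card.split()
        if not words:
            continue  # skip blank cards
        if words[0] == "FINISH":
            return subdecks
        if words[0] == "CUSTOMER":
            subdecks.append((" ".join(words[1:]), []))
        elif words[0] == "TITLE":
            if not subdecks:
                raise ValueError("TITLE card before any CUSTOMER card")
            subdecks[-1][1].append((" ".join(words[1:]), []))
        else:
            if not subdecks or not subdecks[-1][1]:
                raise ValueError("card outside any case: " + card)
            subdecks[-1][1][-1][1].append(card)
    raise ValueError("deck not terminated by a FINISH card")


deck = [
    "CUSTOMER B. E. COOPER ATLAS COMPUTER LABORATORY",
    "TITLE EXAMPLE CASE 1",
    "POINTS 40",
    "TITLE EXAMPLE CASE 2",
    "POINTS 10",
    "CUSTOMER A. N. OTHER XYZ UNIVERSITY",
    "TITLE EXPERIMENT 12 DATE 4/5/64",
    "POINTS 5",
    "FINISH",
]
for customer, cases in split_deck(deck):
    print(customer, [title for title, _ in cases])
```

Raising an error when FINISH is missing is a choice made for this sketch; it reflects the point made in section 1.5 that the FINISH card must not be omitted.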
The detailed structure of an individual case depends, of course, on the particular program but the following general structure is common to all programs. The presentation for one case consists of the following four sections.
    POINTS 40 VARIABLES 3 NAMES ALPHA BETA GAMMA

The specification section may also contain equation cards defining operations to be performed on the data. Variables may be transformed in this way or new variables may be defined in terms of those to be presented as data. Example equation cards are:
    LOGA = LOG(A)
    DIFF = CHEST - WAIST

The rules determining the types of equation depend on the equation interpretive sub-routine included in the data input package. The rules which apply to the use of equation cards in these programs are given in references 1 and 2, and briefly summarised in section 1.7. The rewriting of the equation interpretive sub-routine would introduce a fresh set of rules to all programs.
    COMPONENT ANALYSIS ON ALL VARIABLES

In some programs the total absence of instruction cards implies that a standard analysis is to be performed. The Regression-Within-Groups program described in section 3 does not allow the use of equation cards in its present form but transformation of the data is achieved by other, less elegant, means. Instruction cards are not used in this program since only one analysis is possible.
The specification and instruction sections use words to introduce the parameters, for example VARIABLES 3 is used to specify that three variables are to be presented. Since each parameter is introduced by a distinct word the omission of the word altogether may be taken to imply a standard value for the parameter. The omission of the word VARIABLES, for example, could be taken to imply that only one variable is to be presented. Description of the specification section and the instruction section of a program may therefore be made by listing the words which may be used and by describing the type of information introduced by each word.
Each program produces two output streams. The first stream contains all the results as well as any necessary diagnostic comments drawing attention to presentation errors that have been detected. The second output stream, or commentary output, contains a very brief record of the success or failure of each case and is split up according to CUSTOMERS. If a case is successful the case title and the comment CASE COMPLETE are recorded in the commentary output. If a case is not successful the case title, one or more diagnostic comments drawing attention to the error, or errors, and the comment CASE NOT COMPLETE are recorded in the commentary output. Thus the commentary output enables unsuccessful cases to be identified quickly and it provides a list of those cases that have been presented to the computer. The organisation of the commentary output depends on the machine on which the program is being run and these details are given later in section 1.5.
The arrangement of data on the cards in the data section depends on the particular program but the following important features are independent of the data layout and are available in all programs. It is permissible to include a statement in the specification section to specify checks to be applied to the data. The example statement
CHECK 1 ASCENDING 4 IDENTICAL
specifies that the first item on each data card is expected to form an ascending sequence from card to card and that the fourth item on each data card is expected to be the same on all cards. Only one CHECK statement is allowed but any number of checks may be specified in the statement. The statement is introduced by the word CHECK and each check is specified by a number - (the position of the item to be checked) - and a word - (the type of check to be performed). There are three types of checks that can be performed and these are selected by the words
If an error is discovered on reading the data a comment is produced in the main output stream and in the commentary output stream specifying the card on which the check failed and the item being checked.
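As an illustration of the CHECK statement, the following Python sketch parses a statement into (position, check) pairs and applies them from card to card. The names `parse_check` and `run_checks` are hypothetical, only the two check words used in the example statement are handled, and ASCENDING is assumed here to mean strictly increasing, which the text does not state.

```python
def parse_check(statement):
    """Turn 'CHECK 1 ASCENDING 4 IDENTICAL' into [(1, 'ASCENDING'), (4, 'IDENTICAL')]."""
    words = statement.split()
    if words[0] != "CHECK" or len(words) % 2 == 0:
        raise ValueError("expected CHECK followed by number/word pairs")
    return [(int(words[i]), words[i + 1]) for i in range(1, len(words), 2)]


def run_checks(checks, cards):
    """Return (card number, item position, check word) for every failure."""
    failures = []
    for n in range(1, len(cards)):
        prev, cur = cards[n - 1].split(), cards[n].split()
        for pos, kind in checks:
            a, b = prev[pos - 1], cur[pos - 1]
            if kind == "ASCENDING":
                ok = float(b) > float(a)  # assumption: strictly increasing
            elif kind == "IDENTICAL":
                ok = a == b
            else:
                raise ValueError("check type not covered by this sketch: " + kind)
            if not ok:
                failures.append((n + 1, pos, kind))
    return failures


checks = parse_check("CHECK 1 IDENTICAL 2 ASCENDING")
print(run_checks(checks, ["41 11 6.01", "41 14 5.73", "41 32 6.24"]))
print(run_checks(checks, ["41 14 6.01", "42 11 5.73"]))
```

Each failure carries the card number and the item position, which is the information the diagnostic comment is said to report.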
The second feature allows the customer to specify that certain items (numbers or words) are to be ignored since they do not form part of the data to be analysed. This facility is selected by the inclusion of an IGNORE statement in the specification section. The statement consists of the word IGNORE followed by a list of integers specifying the positions on the cards of the items to be ignored. The statement
IGNORE 1 2 7 10
specifies that the 1st, 2nd, 7th and 10th items on each card are to be ignored. This facility allows extra information such as labels or data not required for the present analyses to be included on the data cards so that these can form a complete and clearly labelled record of all the data that was collected.
The same item may be both checked and ignored so that an item, or items, may be included on each card solely as a check that the presentation order is correct. It is also possible to check an item which is not to be ignored.
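The effect of an IGNORE statement on one data card can be sketched in Python as follows. The function name `drop_ignored` is hypothetical and items are assumed to be separated by spaces; positions are counted from 1 as in the text.

```python
def drop_ignored(card, ignore_positions):
    """Return the items of a card that are not at an ignored position."""
    items = card.split()
    return [item for pos, item in enumerate(items, start=1)
            if pos not in ignore_positions]


# A card carrying two numeric labels and one word label besides the data.
card = "41 11 6.01 175 40.4 A 24.3 32.1 30.2"
print(drop_ignored(card, {1, 2, 6}))
```

Because checks refer to positions on the card as punched, a check on an ignored item would be applied before the item is dropped.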
The main difference between the operation of these programs on the three computers concerns the commentary output stream. On Atlas the commentary output is produced automatically by the supervisor as output stream 7 and the main output is produced as output stream 0. The job description for Atlas must therefore include statements specifying that output streams 0 and 7 are required to appear on a line printer. That is the statements
    OUTPUT
    0 LINE PRINTER 3000 LINES
    7 LINE PRINTER 500 LINES
The number of lines specified for each output stream depends on the particular program, the analyses required, and the number of cases presented.
On the 7090 and 7030 computers the commentary output is written on to tape 7 and copied from tape 7 onto the main output stream when the FINISH card is read at the end of the data deck. Thus tape 7 (B3 on the 7090) is used as a working tape which may be returned to common use at the end of the job. It is clear, therefore, that the omission of the FINISH card would cause the commentary output to be lost. The operating instructions for the 7090 and 7030 must include the use of a temporary working tape as tape B3 and tape 7 respectively. That is, the tape is a COMMON tape before and after the program execution.
The following rules apply to the punching of words and numbers on specification, instruction and data cards.
1.4, +4, -3.14, 27E2, 27.8E-1, -0.43E+6, +1.8, 44.39E-4
The basic rules for punching words and numbers on equation cards are the same as for other cards and these have been given in section 1.6.
The additional punching rules concern the special characters:
= ( ) * /
as well as + and - signs punched before a word. Since words and numbers on an equation card are separated by one of these special characters the division between items is clear. Equation cards may therefore be punched with or without spaces as desired.
Equation cards specify arithmetic operations to be performed on variables or on single values (referred to as parameters). Certain equation cards define a parameter as a function of a variable. An example of this type of function would be
AMEAN = MEAN(A)
in which the parameter AMEAN is defined as the mean of the variable A. Such parameters may be used in further equation cards but their values are not available to the program so that their definition is only useful if they are used in further equation cards. Four types of equation cards are possible and these are defined below.
Type 1 equations define simple arithmetic operations to be performed on variables or parameters.
Examples are:
    DIFF = CHEST - WAIST
    A = A - AMEAN
and the following rules apply:
Violation of these rules causes the variable or parameter on the left hand side to be deleted so that analyses involving this variable will not be performed.
Type 2 equations define functional relations between two variables, between two parameters, or between a parameter and a number. The following rules apply:
Equation card | Operation performed | Range |
---|---|---|
A = LOG(B) | ai = log10(bi) | i = 1,n |
A = ANGLEA(B) | ai = Arcsin((bi)^½) | i = 1,n |
A = ANGLEB(B) | ai = Arcsin((bi/100)^½) | i = 1,n |
A = RECIPE(B) | ai = 1/bi | i = 1,n |
A = CUBERT(B) | ai = (bi)^⅓ | i = 1,n |
A = SIN(B) | ai = sin(bi) | i = 1,n |
A = EXP(B) | ai = e^(-bi) | i = 1,n |

Violation of these rules causes the deletion of the variable or parameter on the left hand side.
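Several of the Type 2 functions can be sketched in Python as below. The table `TYPE2` and the function `apply_type2` are hypothetical names. Angles are assumed to be in radians, which the text does not state. Returning `None` when a value lies outside a function's domain is only a stand-in for the deletion of the left hand side; the actual rules are those of the equation interpretive subroutine.

```python
import math

# Assumption: angles are in radians; the unit is not given in the text.
TYPE2 = {
    "LOG": math.log10,
    "ANGLEA": lambda b: math.asin(math.sqrt(b)),
    "ANGLEB": lambda b: math.asin(math.sqrt(b / 100.0)),
    "RECIPE": lambda b: 1.0 / b,
    "CUBERT": lambda b: math.copysign(abs(b) ** (1.0 / 3.0), b),
    "SIN": math.sin,
}


def apply_type2(function, values):
    """Return the transformed values, or None if any value is invalid."""
    try:
        return [TYPE2[function](b) for b in values]
    except (ValueError, ZeroDivisionError):
        return None  # stands in for "variable deleted"


print(apply_type2("LOG", [1.0, 10.0, 100.0]))
print(apply_type2("RECIPE", [2.0, 0.0]))
```

ANGLEA takes a proportion and ANGLEB a percentage, so `ANGLEB` applied to 50 and `ANGLEA` applied to 0.5 give the same angle.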
Type 3 equations define the value of a parameter as a function of the values of a variable. The following rules apply:
Equation card | Value given to P |
---|---|
P = SUM(B) | The sum of the values of B |
P = MIN(B) | The lowest value of B |
P = MAX(B) | The maximum value of B |
P = MEAN(B) | The average value of B |
P = VAR(B) | The variance of B |

Violation of these rules causes the parameter on the left hand side to be deleted.
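The following Python sketch shows a Type 3 card defining a parameter that is then used by a Type 1 card, as in the AMEAN example earlier in this section. The names `TYPE3` and `type3` are hypothetical, and the variance is assumed to use the divisor n - 1, which the text does not specify.

```python
import re
import statistics

TYPE3 = {
    "SUM": sum,
    "MIN": min,
    "MAX": max,
    "MEAN": statistics.mean,
    "VAR": statistics.variance,  # assumption: divisor n - 1
}


def type3(card, variables):
    """Evaluate a card of the form P = FUNC(B); return (parameter name, value)."""
    match = re.fullmatch(r"\s*(\w+)\s*=\s*(\w+)\((\w+)\)\s*", card)
    if match is None:
        raise ValueError("not a Type 3 equation card: " + card)
    name, function, variable = match.groups()
    return name, TYPE3[function](variables[variable])


variables = {"A": [2.0, 4.0, 9.0]}  # illustrative values only
name, amean = type3("AMEAN = MEAN(A)", variables)
centred = [a - amean for a in variables["A"]]  # the Type 1 card A = A - AMEAN
print(name, amean, centred)
```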
Type 4 equations are a miscellany of equations and are listed below:
The rules listed above are those which apply because of the particular version of the equation interpretive subroutine that is included at present in the data input package. If this subroutine is rewritten, and it is hoped to rewrite it soon, a fresh set of rules will apply. The purpose of rewriting would be to implement more flexible rules but a new set of rules would still allow the use of equations that are consistent with the rules given above.
Three programs allow a distinction between repeat and replicate observations. The regression within groups program allows several observations of the dependent variable for each value of the independent variable, the complete factorial experiment program allows several observations for each combination of the factor levels, and the diallel table program allows several observations for each parental combination. Observations are repeats rather than replicates when sources of variability present between points are not present between several observations for one point (the word point is used to refer to different values of the independent variable, different factor level combinations, or different parental combinations). If, for example, different points correspond to different solutions in a chemical problem and several observations are made on each solution, the variability due to differences between solutions will not be present between observations made on the same solution. In this situation, although this source of variation is present in the analysis, it is not valid as a residual against which other sources may be tested, because a significant result may simply reflect the known differences between solutions. The programs recognise this situation by the use of the word REPEATS instead of the word REPLICATES.
Each program expects (with one exception described in section 3.4.2) each data card in the data section to contain the same number of items and defines a minimum number (n) of items (excluding items to be ignored) to be included on one card. The descriptions of the data sections begin by defining this minimum number and go on to explain that any integral multiple (k) of n items may be punched on one physical data card. We therefore define a conceptual card as one set of n items and we define a physical card to be a card, actually read, containing kn items. Thus one physical card may represent k conceptual cards. Each program expects a fixed number of conceptual cards arranged on any number of physical cards.
The positions on data cards of items to be ignored or checked refer to positions on physical cards. That is, the number of items expected on any one physical card is kn + i where i is the number of items to be ignored. Thus if the number of items to be presented on a conceptual card is 6 and the items are of such size that 20 may be punched on a physical card then we may punch
a) 6 + i where i ≤ 14 (k = 1)
b) 12 + i where i ≤ 8 (k = 2)
c) 18 + i where i = 0, 1, or 2 (k = 3)
items on any physical data card.
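The arithmetic behind this list is small enough to sketch in Python. The function name `layouts` is hypothetical; `n` is the number of items on a conceptual card and `capacity` the number of items that fit on a physical card.

```python
def layouts(n, capacity):
    """List (k, largest permitted i) for each k such that k*n items fit on a card."""
    return [(k, capacity - k * n) for k in range(1, capacity // n + 1)]


# Six items per conceptual card, room for 20 items on a physical card.
print(layouts(6, 20))  # [(1, 14), (2, 8), (3, 2)]
```

Each pair gives a value of k and the largest number of ignored items i for which kn + i items still fit on one physical card.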
The multivariate analysis program has been written as a general program capable of performing a number of different analyses. In its present form only two types of analyses are included but two further analyses are being added. The presentation has been arranged so that other analyses can be added easily. The program can perform principal components analysis and a number of variants of regression analysis including multiple regression and variation descriptive regressions. Polynomial regression, canonical correlation and factor analysis are being added to the program and these are described in this report. It is intended to include group analyses such as Hotelling's T² and discriminant analysis in the near future and the presentation has already been arranged to accept several groups of data.
The specification section consists of cards containing the values of parameters necessary to the input of the data such as the number of points, the number of variables and the names of each variable. Each parameter, or set of parameters, is introduced by a word, for example, POINTS 30, VARIABLES 6 and NAMES HEIGHT WEIGHT CHEST BACK LEG WAIST. The introductory words may appear in any order and may be punched in any columns on the cards in this section with the only restriction that the value of the parameter, or set of parameters, must immediately follow the introductory word on the same card. The CHECK and IGNORE facilities described in section 1.3 are available in this program. Equation cards (section 1.7) defining new variables or transforming existing variables may be included in the specification section.
The data section consists of one card for each point and the values of each variable are punched across the card in the order in which the variables are named in the specification section. The analyses that can be performed by the program in its present form involve only one group of data. Analyses involving more than one group are planned and the data presentation has been arranged to accept more than one group. The data section thus consists of a number of separate groups of data all prepared in the same way. Each group of data is preceded by a GROUP card consisting of the word GROUP followed by a number. This number will be used in any diagnostic comments that are produced by the program. If only one group is presented the group card may be omitted. If several groups are presented to the program in its present form each analysis requested in the instruction section is performed separately for each group of data. It is possible to include an identification number for each point which will be used as a label during output. It is also possible to present the variable means and the variance-covariance matrix instead of the raw data.
The instruction section consists of any number of instructions requesting analyses to be performed on the data or on part of the data. Words are used to request analyses so that the instructions are given in a language which is very close to normal English. Equation cards are also allowed in the instruction section.
All cards are punched according to the card preparation rules given in section 1.6.
Equation cards conform to the additional rules given in section 1.7.
The following example presentation will form the starting point for the detailed description of the presentation rules.
    CUSTOMER A. N. OTHER BUILDING A.9
    TITLE EXAMPLE 4 (MULTIVARIATE PROGRAM)
    VARIABLES 6
    NAMES HEIGHT WEIGHT CHEST BACK LEG AND WAIST
    POINTS 10
    IGNORE 1 2 6
    CHECK 1 IDENTICAL 2 ASCENDING
    HEIGHT = HEIGHT * 12.0
    DIFF = CHEST - WAIST
    41 11 6.01 175 40.4 A 24.3 32.1 30.2               |
    41 14 5.73 171 36.9 B 23.1 30.8 31.9               |
    ...............................                    |  10 lines of data
    41 32 6.24 182 38.4 B 27.4 34.0 34.6               |
    COMPONENTS ANALYSIS USING ALL VARIABLES EXCEPT WEIGHT AND DIFF
    REGRESSION ANALYSIS OF WEIGHT ON HEIGHT CHEST AND LEG
    REPEAT OMITTING LEG
    POLYNOMIAL REGRESSION ORDER 3 OF WEIGHT ON CHEST
    RATIO= CHEST/BACK
    REGRESSION OF RATIO ON WAIST
    TITLE EXAMPLE 5 (MULTIVARIATE PROGRAM)
    VARIABLES 7
    NAMES A B C D E F G
    IDENTITY 3
    POINTS 30
    IGNORE 1 2
    CHECK 1 IDENTICAL
    LOGA = LOG(A)
    H=B-F
    TEST (B 10.0 20.0)
    7 29.4 11 1.43 12.42 3.04 0.413 4.31 8.29 11.31    |
    7 27.2 23 2.91 17.49 3.09 0.427 5.79 3.94 10.98    |
    .................................                  |  30 lines of data
    7 24.1 53 7.83 15.28 4.17 0.719 12.21 6.21 20.01   |
    CANONICAL CORRELATION OF LOGA, B, C WITH E AND G
    REPEAT OMITTING LOGA
    REGRESSION OF A ON BEST SINGLE OMITTING H
    REGRESSION OF A ON BEST PAIR OMITTING B AND F
    FINISH
The parameters that are specified in this section are introduced by the words listed below, together with a description of the information they introduce. The use of some words is optional and their omission implies a standard value (or values) for the parameter(s) introduced.
Introductory Word | Information following word | Description of information introduced |
---|---|---|
GROUPS | One number. | The number of groups. Optional word; one group is assumed if this word is not presented. |
POINTS | One number or K numbers, where K is the number of groups. | The number of points. If one number is given all groups are taken to have the same number of points. If K numbers are given the ith group is taken to have the number of points given as the ith number. |
VARIABLES | One number. | The number of variables to be read as data. |
NAMES | r words, where r is the number of variables. | The words are the names of the r variables to be read as data. |
VARIANCE or COVARIANCE or both | No item following. | The presence of either, or both, of these words specifies that the variance-covariance matrix is to be presented instead of the raw data. |
MEANS | No item following. | The presence of this word specifies that the variable means are to be presented with the variance-covariance matrix. This word is ignored if the raw data is presented. |
CHECK | Alternate numbers and words. | A number specifies the position on each data card of the item to be checked, the following word specifies the type of check to be made. See section 1.3. |
IGNORE | A list of integers. | Each number specifies the position on each data card of an item to be ignored. See section 1.3. |
IDENTITY | One number. | Optional heading. If not used the successive data points are labelled 1 to n. If used the number following the word IDENTITY specifies the position on the card occupied by a point identification number that will be used to label the points in the output. |
The following words may be used in any position in the specification section to make the specification more like English. They have no information content for the program and may be used more than once if required.
MATRIX AND VALUE
The words in the list above may be punched in any columns and any order on any number of cards. The information introduced by a word must, however, be punched to follow that word on the same card. That is, for example, the variable names introduced by the word NAMES must follow the word NAMES and the complete list must appear on the same card.
Equation cards must be prepared according to the rules given in section 1.7, and they may appear before, after, or mixed with the specification cards. No additional information may appear on an equation card. That is, parameters such as POINTS 30 must not be punched on the end of an equation card.
All groups of data are prepared in the same way so that it is sufficient to describe the preparation of one group of data. Checks specified to be applied to the data are made on one group of data at a time. The statement CHECK 4 ASCENDING specifies that the fourth item on each card within one group is expected to form an ascending sequence and not that the fourth item on each card for all groups is expected to form an ascending sequence. IGNORE statements refer to all groups of data.
The first card of a group is a GROUP card consisting of the word GROUP followed by a number. The number is used to refer to the group in diagnostic comments and in the output of results. If only one group of data is to be presented this card may be omitted.
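Splitting the data section into groups can be sketched in Python as follows. The function name `split_groups` is hypothetical. When the GROUP card is omitted the sketch labels the single group 1; that label is an assumption made here, not something stated in the text.

```python
def split_groups(cards):
    """Return [(group number, [data cards])] in the order presented."""
    groups = []
    for card in cards:
        words = card.split()
        if words and words[0] == "GROUP":
            groups.append((int(words[1]), []))
        else:
            if not groups:
                groups.append((1, []))  # GROUP card omitted: label assumed to be 1
            groups[-1][1].append(card)
    return groups


section = ["GROUP 3", "1 6.01 175", "2 5.73 171", "GROUP 8", "1 6.24 182"]
for number, data in split_groups(section):
    print(number, len(data))
```

Since checks are applied to one group at a time, a check such as CHECK 1 ASCENDING would be run separately on each list of data cards returned here.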
If the raw data is to be presented, that is if neither the word VARIANCE nor the word COVARIANCE has been included in the specification section, the following preparation rules apply. The values of all variables for one point are prepared on one card. The number of items on each card is expected to be the same and equal to the number of variables, plus the number of items to be ignored, plus one if a point identification number is given. In the first case of the example presentation given above no point identification number is given, there are six variables and three items to be ignored, so that nine items are expected on each card. The 1st, 2nd and 6th items are to be ignored so that the 3rd, 4th, 5th, 7th, 8th and 9th items are the values of the six variables presented in the order given in the NAMES statement included in the specification section. In the second case the values of the seven variables A, ..., G are punched as the 4th to the 10th items respectively since the 1st and 2nd items are to be ignored and the third item is the point identification number. The items to be ignored may be words so that word labels could be included on the data cards if desired. A check is made on the data cards that each card contains the correct number of items and that the ith item is either a number on all cards or a word on all cards. Error diagnostics are given in both the main output stream and in the commentary output stream.
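The item count and the positions occupied by the variables follow directly from the IGNORE and IDENTITY statements. A minimal Python sketch, with hypothetical function names, is:

```python
def expected_items(n_variables, ignore, identity=None):
    """Number of items expected on each raw data card."""
    return n_variables + len(ignore) + (1 if identity is not None else 0)


def variable_positions(n_items, ignore, identity=None):
    """Card positions (counted from 1) holding the variables, in NAMES order."""
    taken = set(ignore)
    if identity is not None:
        taken.add(identity)
    return [p for p in range(1, n_items + 1) if p not in taken]


# First example case: 6 variables, IGNORE 1 2 6, no IDENTITY.
n = expected_items(6, [1, 2, 6])
print(n, variable_positions(n, [1, 2, 6]))

# Second example case: 7 variables, IGNORE 1 2, IDENTITY 3.
n = expected_items(7, [1, 2], identity=3)
print(n, variable_positions(n, [1, 2], identity=3))
```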
In some problems it is convenient to present the variance-covariance matrix instead of the raw data. If the variance-covariance matrix is to be presented either or both of the words VARIANCE and COVARIANCE must be included in the specification section. If the variable means are to be presented as well as the variance-covariance matrix the word MEANS must be included in the specification section. If the variable means are not presented in this way the value zero is assumed for all variable means.
The presentation of this information in the data section is as follows. The first card is a GROUP card consisting of the word GROUP followed by the group number. This GROUP card may be omitted if only one group is presented. The second card contains the variable means if these are to be presented. The variance-covariance matrix then follows with one row of the matrix per card. If the number of variables is n the data section consists of either n or n+1 cards each containing n values required by the program.
The CHECK and IGNORE facility is available in this situation so that extra information may be included on these cards if required. The point identification number facility described above does not apply here so that the number of items on each card is n plus the number of items to be ignored.
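Reading a data section of this kind can be sketched in Python as below. The function name `read_vc_section` is hypothetical, the optional GROUP card is not handled, and the three cards shown are a two-variable illustration rather than a complete case.

```python
def read_vc_section(cards, n, ignore=(), means=False):
    """Return (variable means, variance-covariance rows) from the data cards."""
    rows = []
    for number, card in enumerate(cards, start=1):
        items = [x for pos, x in enumerate(card.split(), start=1)
                 if pos not in ignore]
        if len(items) != n:
            raise ValueError(f"card {number}: expected {n} values, found {len(items)}")
        rows.append([float(x) for x in items])
    if len(rows) != (n + 1 if means else n):
        raise ValueError("wrong number of cards in data section")
    # Means default to zero when the word MEANS is absent.
    mean_values = rows.pop(0) if means else [0.0] * n
    return mean_values, rows


# Illustrative two-variable section; the first item on each card is ignored.
cards = ["0 138.250 67.558", "1 194.4080 18.4842", "2 18.4842 6.4296"]
means, matrix = read_vc_section(cards, n=2, ignore=(1,), means=True)
print(means)
print(matrix)
```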
The presentation of the variance-covariance matrix and the variable means is now illustrated by the following example cases.
    CUSTOMER B. E. COOPER ATLAS COMPUTER LABORATORY
    TITLE EXAMPLE WITH V-C MATRIX AND MEANS
    VARIABLES 6
    NAMES WEIGHT HEIGHT SHOULDER CHEST WAIST AND HIP
    POINTS 120
    CHECK 1 ASCENDING
    IGNORE 1
    VARIANCE COVARIANCE MATRIX AND MEANS
    0 138.250 67.558 16.3658 35.1950 28.0700 35.3083
    1 194.4080 18.4842 5.4817 17.6895 15.5479 17.7164
    2 18.4842 6.4296 0.5142 1.0467 0.1488 1.4945
    3 5.4817 0.5142 0.5503 0.7929 0.4071 0.5617
    4 17.6895 1.0467 0.7929 2.9892 1.9221 1.7032
    5 15.5479 0.1488 0.4071 1.9221 3.1667 1.3634
    6 17.7164 1.4945 0.5617 1.7032 1.3634 2.5421
    TITLE EXAMPLE WITH V-C MATRIX BUT NO MEANS
    VARIABLES 6
    NAMES WEIGHT HEIGHT SHOULDER CHEST WAIST AND HIP
    POINTS 120
    CHECK 1 ASCENDING
    IGNORE 1
    1 194.4000 18.4842 5.4817 17.6895 15.5479 17.7164
    2 18.4842 6.4296 0.5142 1.0467 0.1488 1.4945
    3 5.4817 0.5142 0.5503 0.7929 0.4071 0.5617
    4 17.6895 1.0467 0.7929 2.9892 1.9221 1.7032
    5 15.5479 0.1488 0.4071 1.9221 3.1667 1.3634
    6 17.7164 1.4945 0.5617 1.7032 1.3634 2.5421
Each instruction is punched on a separate card but if one card is not sufficient to contain the instruction it may be continued onto a second card (or further cards) by use of the continuation character $. The character $ is punched at the end of the card (or cards) to be continued and not on the continuation card unless that is also to be continued. Instructions may be made up of introductory words, listed below, and the information that these words introduce. Examples of the use of these words in instructions are given after this list.
Introductory Word | Information following word | Description of information introduced |
---|---|---|
REGRESSION | No item following | The presence of this word selects a regression analysis of some form. |
COMPONENTS | No item following | The presence of this word selects a principal components analysis of some form. |
CANONICAL | No item following | The presence of this word will select a canonical correlation analysis of some form when canonical correlation is included in the program. |
FACTOR | No item following | The presence of this word will select a factor analysis of some form when factor analysis is included in the program. |
OF or FOR | List of variable names | Either of these words introduces a list of variable names referred to later as list 1. List 1 is a list of variables for one side of a canonical correlation analysis or the single name of the dependent variable in a regression analysis. |
DESCRIBE | A variable name or the word VARIATION | This word may be used to introduce a dependent variable for a regression analysis in the same way as the words OF or FOR or it may be used to introduce the word VARIATION for a form of components analysis (see description of component analysis forms in section 2.5.1). |
AT | List of percentages or proportions | This word is used with the word DESCRIBE in statements specifying either that a variable is to be described in regression terms at each of a number of significance levels (section 2.5.2) or that the total variation is to be described at these levels by the first few principal components (section 2.5.1). |
ON or USING or WITH | List of variable names or a selection from a list of special words described after this tabulation | Any one of these words introduces a list of variable names referred to later as list 2. List 2 is a list of independent variables for a regression analysis, a second list of variables for a canonical correlation analysis or a list of variables for components or factor analysis. |
REPEAT | No item following | The presence of this word causes the previous type of analysis to be performed again. Other words included with the word REPEAT specify how the new analysis differs from the previous analysis (see next words). This word may not be used on the first instruction card. |
EXCEPT or OMIT or OMITTING | List of variable names | Any one of these words introduces a list of names of variables to be excluded from an analysis. Used together with the word REPEAT or with a statement that all variables are to be included (EXCEPT those listed). |
ADDING | List of variable names | This word introduces a list of variable names to be added to the list in the previous analysis. Used together with the word REPEAT. |
FIRST | One integer | This word introduces the number of components required in a components analysis. If this is not specified all components are computed. |
 | No item following | The presence of this word selects additional printing in one type of regression analysis. This is described below. |
GRADUATE | No item following | The presence of this word selects additional printing in regression analyses and in components analysis. |
The following words may be used in instructions in any position to make the instruction more like English. They have no information content for the program and may be used more than once if required.
AND THE ANALYSIS LEVEL PRINCIPAL COLUMN CORRELATION LEVELS PER CENT PERCENT COMPUTE
In the table given above there are pairs or triples of words that perform the same function. The use of any one of these words is allowed so that the one which fits better from an English point of view may be chosen.
The variables to be included in a components analysis may be listed in a number of different ways. For example, the following instructions are legal:
1. COMPONENTS ANALYSIS USING ALL VARIABLES
2. COMPONENTS ANALYSIS USING ALL VARIABLES EXCEPT HEIGHT
3. COMPONENTS ANALYSIS USING WEIGHT CHEST HIP AND WAIST
The word EXCEPT may be replaced by either of the words OMIT or OMITTING and the word USING may be replaced by either of the words ON or WITH. In the third instruction the word USING is used to introduce a list of variables to be included in the analysis, whereas in the other instructions the word USING introduces the specially recognised words ALL VARIABLES. These two words are two of the words in the special list referred to above in the description of the use of the words USING, ON and WITH. The word EXCEPT in the second instruction introduces a list of variables to be excluded from the analysis.
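The selection rules just described can be sketched in Python. This is an illustration only and not the program's own routine: the function name `select_variables`, the treatment of an instruction as a string of blank-separated words, and the way noise words are discarded are all assumptions made for the example.

```python
# Hypothetical sketch: resolve the variable list named by an instruction.
# The word sets come from the text; everything else is an assumption.
NOISE = {"AND", "THE", "ANALYSIS", "LEVEL", "PRINCIPAL", "COLUMN",
         "CORRELATION", "LEVELS", "PER", "CENT", "PERCENT", "COMPUTE"}
INCLUDE = {"USING", "ON", "WITH"}
EXCLUDE = {"EXCEPT", "OMIT", "OMITTING"}

def select_variables(instruction, all_variables):
    """Return the selected variables, in the order of all_variables."""
    words = [w for w in instruction.upper().split() if w not in NOISE]
    included, excluded, target = [], [], None
    i = 0
    while i < len(words):
        word = words[i]
        if word in INCLUDE:
            target = included
        elif word in EXCLUDE:
            target = excluded
        elif word == "ALL" and words[i + 1:i + 2] == ["VARIABLES"]:
            included.extend(all_variables)
            i += 1  # skip the word VARIABLES
        elif target is not None:
            target.append(word)
        i += 1
    return [v for v in all_variables if v in included and v not in excluded]
```

With the five variables HEIGHT, WEIGHT, CHEST, HIP and WAIST, the second example instruction would select every variable except HEIGHT, and the third would select the four variables it names.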
The components analysis subroutine will normally compute all components unless instructed to the contrary. There are two methods by which the number of components to be calculated can be reduced. The first method specifies that only a fixed number are to be computed by including the word FIRST followed by the number of components required. A complete instruction of this type is:
COMPUTE FIRST 3 PRINCIPAL COMPONENTS USING ALL VARIABLES.
The second method specifies that sufficient components are to be computed to describe certain percentages (or proportions) of the total variance. This is achieved by using the words DESCRIBE and AT in the following way:
DESCRIBE VARIATION AT THE 95 PERCENT LEVEL
DESCRIBE VARIATION AT THE 95 AND 99 PERCENT LEVELS
DESCRIBE VARIATION AT THE 0.95 LEVEL
Notice that in these instructions the word COMPONENTS is not necessary. A components analysis is selected by the presence of the first two words.
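As a rough illustration of the second method, the following Python sketch finds how many leading components are needed to describe a stated share of the total variance. It assumes the component variances (eigenvalues) are already available, and it assumes that a level greater than 1 is a percentage and any other level a proportion; the text does not say how the program distinguishes the two, and the name `components_needed` is invented for the example.

```python
def components_needed(eigenvalues, levels):
    """For each level, return the smallest number of leading components
    whose variances sum to at least that share of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    result = {}
    for level in levels:
        # Assumption: values above 1 are percentages, otherwise proportions.
        proportion = level / 100.0 if level > 1 else float(level)
        running = 0.0
        count = 0
        for value in ordered:
            running += value
            count += 1
            if running >= proportion * total:
                break
        result[level] = count
    return result
```

For component variances 6.0, 3.0, 0.8 and 0.2, three components describe 95 per cent of the variation and all four are needed for 99 per cent.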
The inclusion of the word GRADUATE with a components analysis instruction causes the additional output of a graduation of the values for each point on each component that is calculated. This graduation is produced in numerical order for each component and the point identification number is printed with each component value. If no point identification numbers were provided points are numbered sequentially from unity in the order of presentation of the points.
A components analysis may also be selected by a REPEAT instruction repeating a previous components analysis with additional variables or with fewer variables.
Examples are:
COMPONENTS ANALYSIS USING ALL VARIABLES EXCEPT HEIGHT
REPEAT OMITTING WEIGHT
REPEAT ADDING HEIGHT
The normal regression analysis instructions use the word OF to introduce the dependent variable and the word ON to introduce a list of the independent variables, for example:
REGRESSION OF WEIGHT ON HEIGHT
REGRESSION OF WEIGHT ON HEIGHT SHOULDER AND WAIST
The word ON (or USING or WITH) may be used with the words ALL VARIABLES as described in the previous section and the word EXCEPT (or OMIT or OMITTING) may be used to introduce a list of variables to be excluded from the analysis.
The use of the words ALL VARIABLES implies all variables except the dependent variable. Examples are:
REGRESSION OF HEIGHT ON ALL VARIABLES EXCEPT WAIST
REGRESSION OF WEIGHT ON ALL VARIABLES
The word REPEAT may be used to specify a regression analysis if it follows a regression analysis instruction and the words ADDING and EXCEPT (and OMIT and OMITTING) may be used to change the list of independent variables as described in the previous section. The word FOR may be used with the word REPEAT to change the dependent variable.
Example instructions are:
REGRESSION OF WEIGHT ON HEIGHT SHOULDER AND CHEST
REPEAT OMITTING CHEST AND SHOULDER
REPEAT ADDING SHOULDER
REPEAT FOR WAIST
The last instruction card above repeats the previous analysis with the dependent variable WAIST instead of WEIGHT.
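The effect of a sequence of REPEAT instructions on a regression analysis can be sketched as follows. The dictionary used to hold the previous analysis and the function name `apply_repeat` are assumptions for illustration; the sketch does not consider what happens if the new dependent variable is also in the independent list, since the text does not say.

```python
def apply_repeat(previous, instruction):
    """Return the analysis selected by a REPEAT instruction.

    previous is a dict with keys 'dependent' (a name) and
    'independent' (a list of names); it is not modified.
    """
    noise = {"AND", "THE"}
    words = [w for w in instruction.upper().split() if w not in noise]
    if not words or words[0] != "REPEAT":
        raise ValueError("not a REPEAT instruction")
    dependent = previous["dependent"]
    independent = list(previous["independent"])
    mode = None
    for word in words[1:]:
        if word in ("EXCEPT", "OMIT", "OMITTING"):
            mode = "OMIT"
        elif word in ("ADDING", "FOR"):
            mode = word
        elif mode == "ADDING" and word not in independent:
            independent.append(word)
        elif mode == "OMIT" and word in independent:
            independent.remove(word)
        elif mode == "FOR":
            dependent = word
    return {"dependent": dependent, "independent": independent}
```

Applied in turn to the four example cards, this gives WEIGHT on HEIGHT, SHOULDER and CHEST, then WEIGHT on HEIGHT, then WEIGHT on HEIGHT and SHOULDER, and finally WAIST on HEIGHT and SHOULDER.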
The word DESCRIBE followed by a variable name will select a regression analysis with that variable taken to be the dependent variable. The word DESCRIBE is used with the word AT in an instruction of the type
DESCRIBE WEIGHT AT THE 0.99 LEVEL
DESCRIBE WEIGHT AT THE 95 AND 99 PERCENT LEVELS
These instructions cause a large number of analyses to be performed. Firstly WEIGHT is regressed on each of the remaining variables in turn. If any of these variables produces a sum of squares due to regression which is more than the required percentage (or percentages) of the total sum of squares then no more analyses are performed. The results for all analyses satisfying the conditions are output. If no single variable is sufficient then WEIGHT is regressed on each of the possible pairs of variables and the results for all pairs satisfying the conditions are output. If no pair of variables is sufficient the process continues taking three variables at a time, then four and so on. If n variables are found to be necessary the results for all groups of n variables that satisfy the conditions are output.
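The search can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the program's method of computation: a single level is handled per call and is given as a proportion, the regressions are solved from the normal equations by Gaussian elimination, the variables are assumed not to be collinear, and the names `regression_ss` and `describe` are invented here.

```python
from itertools import combinations

def regression_ss(y, columns):
    """Sum of squares due to regression of y on the columns (with a constant)."""
    n = len(y)
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    xc = []
    for col in columns:
        mean = sum(col) / n
        xc.append([v - mean for v in col])
    k = len(xc)
    xty = [sum(p * q for p, q in zip(xc[i], yc)) for i in range(k)]
    # Augmented normal equations on the centred variables.
    a = [[sum(p * q for p, q in zip(xc[i], xc[j])) for j in range(k)] + [xty[i]]
         for i in range(k)]
    for i in range(k):  # elimination with partial pivoting (row swaps only)
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(i + 1, k):
            factor = a[r][i] / a[i][i]
            for c in range(i, k + 1):
                a[r][c] -= factor * a[i][c]
    b = [0.0] * k
    for i in reversed(range(k)):  # back substitution
        rest = sum(a[i][c] * b[c] for c in range(i + 1, k))
        b[i] = (a[i][k] - rest) / a[i][i]
    return sum(b[i] * xty[i] for i in range(k))

def describe(data, dependent, proportion):
    """Return (subset, share) for every smallest subset exceeding the level."""
    y = data[dependent]
    others = [name for name in data if name != dependent]
    y_mean = sum(y) / len(y)
    total_ss = sum((v - y_mean) ** 2 for v in y)
    for size in range(1, len(others) + 1):
        hits = []
        for subset in combinations(others, size):
            ss = regression_ss(y, [data[name] for name in subset])
            if ss > proportion * total_ss:
                hits.append((subset, ss / total_ss))
        if hits:
            return hits  # stop at the first subset size that succeeds
    return []
```

The search stops at the first subset size for which at least one subset succeeds, and every successful subset of that size is reported, which is the behaviour described in the paragraph above.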
Several regression analyses may be requested at one time by the instructions
REGRESSION OF WEIGHT ON ALL SINGLES PAIRS AND TRIPLES
REGRESSION OF WEIGHT ON BEST PAIRS
The first instruction causes output of the results for the regression of WEIGHT on each variable followed by the regression of WEIGHT on all possible pairs of variables followed by the regression of WEIGHT on all possible triples of variables. The second instruction causes the output of the results for the regression of WEIGHT on the two variables that give the best description of WEIGHT. That is the two variables which give the highest sum of squares due to regression. In these two instructions we see further special words that may be introduced by the word ON (or USING or WITH). The special words are BEST and ALL and each word introduces a list of words selected from the following ten possibilities
SINGLES PAIRS TRIPLES FOURS FIVES SIXES SEVENS EIGHTS NINES TENS
Thus we see that the word ON may be used to introduce:
A WARNING should be given that the terminal S in these ten words may not be omitted even though the English sense of an instruction using one of these words may suggest this.
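The expansion of an instruction such as ON ALL SINGLES PAIRS AND TRIPLES into the individual regressions it requests can be sketched in Python. The mapping name `SIZE_WORDS` and the function `subsets_requested` are assumptions for illustration, and the spelling SINGLES follows the example instruction. A word without its terminal S is rejected, in line with the warning above.

```python
from itertools import combinations

# The ten size words and the number of variables each one stands for.
SIZE_WORDS = {"SINGLES": 1, "PAIRS": 2, "TRIPLES": 3, "FOURS": 4, "FIVES": 5,
              "SIXES": 6, "SEVENS": 7, "EIGHTS": 8, "NINES": 9, "TENS": 10}

def subsets_requested(words, candidates):
    """List every subset of candidates named by the size words after ALL."""
    subsets = []
    for word in words:
        if word == "AND":  # noise word, carries no information
            continue
        if word not in SIZE_WORDS:
            raise ValueError("unrecognised size word: " + word)
        subsets.extend(combinations(candidates, SIZE_WORDS[word]))
    return subsets
```

With three candidate variables the first example instruction would therefore request three single regressions, three on pairs and one on the triple.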
The inclusion of the word GRADUATE with a regression analysis instruction causes the additional output of a graduation of the fitted function at the original values of the dependent variable. The graduation consists of the point identification number if provided, the observed value of the dependent variable, the fitted value predicted by the fitted function, the difference between the observed and fitted values and the standard error of the fitted value for each point.
The instructions which will select a canonical correlation analysis when this is available use the word OF to introduce the first list of variables and the word WITH to introduce the second list of variables to be used in the analysis, for example:
CANONICAL CORRELATION OF WEIGHT AND HEIGHT WITH WAIST AND HIP
The REPEAT statement may be used with the word OMITTING to select further analyses for example:
CANONICAL CORRELATION OF WEIGHT AND HEIGHT WITH WAIST AND HIP
REPEAT OMITTING HIP
The word ADDING normally used with the word REPEAT to add further variables to an analysis may not be used to add variables to a canonical correlation analysis because the normal ADDING statement does not include specification of which list of variables is to be increased.
The number of correlations to be computed may be restricted in the same way as the number of components in a components analysis by using the word FIRST. For example:
COMPUTE FIRST 2 CANONICAL CORRELATIONS OF WEIGHT HEIGHT AND BACK $
WITH WAIST HIP AND SHOULDERS
Use of the word GRADUATE selects additional output of a graduation of the correlation functions for each point in their original order, for each correlation computed. Each point will be labelled with its point identification number if these have been presented.
The instructions which will select a factor analysis when this is available use the word ON (or USING or WITH) to introduce the list of variables to be included in the analysis, for example:
FACTOR ANALYSIS ON WEIGHT HEIGHT WAIST AND HIP
The word ON may also introduce the words ALL VARIABLES and the word EXCEPT (or OMIT or OMITTING) may be used to introduce a list of variables to be excluded from the analysis, for example:
FACTOR ANALYSIS ON ALL VARIABLES EXCEPT SHOULDER
The REPEAT statement may be used with the words ADDING or OMITTING (or OMIT or EXCEPT) to select further factor analyses in the way already described in sections 2.5.1 and 2.5.2 for example:
FACTOR ANALYSIS ON ALL VARIABLES
REPEAT OMITTING SHOULDER AND HIP
REPEAT ADDING HIP
The number of factors to be computed may be restricted, in the same way as the number of components in a components analysis, by using the word FIRST, for example:
COMPUTE FIRST 3 FACTORS USING WEIGHT HEIGHT WAIST AND HIP
Use of the word GRADUATE will select additional output of a graduation of the factor loadings for each point, in their original order, for each factor computed. Each point will be labelled with its point identification number if these have been presented.
Instructions selecting other analyses will be added later and the words making up these instructions will be published when the analyses become available. The introduction of further instructions will not affect the presentation rules described here. The new instructions will be punched on cards in the same way as those described above and will be included at any position in the instruction section.
Equation cards prepared according to the rules given in section 1.7 may be included in the instruction section provided that equation cards defining new variables or redefining old variables appear before all instructions using the new or redefined variable.
The output obtained from the program consists initially of the variance-covariance matrix, the variable means and the correlation matrix all clearly labelled with the variable names listed in the specification section. This output is produced just before any instruction cards are obeyed and the output which follows depends on the analysis specified:
The output for a regression analysis consists of:
The output for a components analysis consists of:
The output for a canonical correlation analysis will consist of:
All output is clearly labelled and the variable names are used to identify quantities associated with variables. It is believed, therefore, that detailed description of the output is unnecessary.
The regression within groups program fits straight lines to each of a number of groups of data and tests for consistency between the fitted lines. Only one analysis is possible so that there is no instruction section in the data presentation. Only two variables are allowed but the dependent variable may be replicated evenly or unevenly. Since replication is allowed the number of observations in the two variables may be different so that the use of equation cards cannot be allowed in the program until the equation interpretive subroutine is rewritten. It is possible however to transform variables by selecting a transformation from a list of eight permitted transformations. It is an easy task to include further transformations if the present list is inadequate.
The specification section consists of cards containing the values of parameters necessary to the input of the data such as the number of points in each group of data, the names of the two variables, the number of replicates in each group of data, and the variable transformations required. Each parameter, or set of parameters, is introduced by a word, for example POINTS 10 20 30, or NAMES DOSE SFRACT. These words may appear in any order and may be punched in any columns on the cards in this section with the only restriction that the value of the parameter, or set of parameters, must immediately follow the introductory word on the same card. The CHECK and IGNORE facility described in section 1.3 is available in this program.
The data section can sometimes be prepared in two ways. The first arrangement is the normal and covers all situations. The second arrangement is a more compact form of presentation which is available if there is no replication and if the values of the independent variable are the same in all groups. The compact form can also be used if there is no replication and if the values of the independent variable differ from group to group in only one or two points since a value may be declared missing by the punching of the letter M instead of a genuine data value. The form of data presentation required is specified by the presence or absence from the specification section of the word COMPACT. The normal presentation consists of several groups of data prepared in the same way each introduced by a GROUP card. The compact presentation consists of the data for all groups punched as one block of data.
All cards in the data presentation are punched according to the card preparation rules given in section 1.6.
The following example presentation will form the starting point for the detailed description of the presentation rules.
CUSTOMER B. E. COOPER ATLAS COMPUTER LABORATORY
TITLE EXAMPLE 1 NORMAL DATA SECTION
POINTS 4 3 GROUPS 2 YTRANS 1
REPLICATES 4 2 NAMES DOSE SFRACT DVALUE
GROUP 1
X11 Y111 Y112 Y113 Y114 |
X12 Y121 Y122 Y123 Y124 | Group 1 containing
X13 Y131 Y132 Y133 Y134 | 4 points and 4 replicates
X14 Y141 Y142 Y143 Y144 |
GROUP 2
X21 Y211 Y212 |
X22 Y221 Y222 | Group 2 containing
X23 Y231 Y232 | 3 points and 2 replicates
TITLE EXAMPLE 2 COMPACT DATA SECTION
GROUPS 3 POINTS 5 YTRANS 1
NAMES DOSE SFRACT DVALUE
COMPACT DATA
X1 Y111 Y211 Y311 |
X2 Y121 Y221 Y321 | Same 4 x values for
X3 Y131 Y231 Y331 | all 3 groups and 4 y values
X4 Y141 Y241 Y341 | for each of the 3 groups
FINISH
The parameters that are specified in this section are introduced by words as described in the following list. The use of some of these words is optional and their omission implies a standard value(s) for the parameter(s).
Introductory Word | Information following word | Description of information introduced |
---|---|---|
GROUPS | One number | This word must be presented. Introduces the number of groups. |
POINTS | One number, or a list of numbers | This word must be presented. If one number is presented this implies that all groups have this number of points. If a list of numbers is given there must be one number for each group. |
REPLICATES REPS | One number, or a list of numbers, or the word UNEVEN | If this word (or heading REPEATS below) is not presented it is assumed that there is no replication. If one number is presented it is assumed that all points in all groups have this number of replicates. If a list of numbers is presented there must be one number for each group and all points in the ith group are assumed to have the number of replicates given by the ith number in the list. If the word UNEVEN is presented it is assumed that the number of replicates varies from point to point. |
REPEATS | One number, or a list of numbers, or the word UNEVEN | This word introduces the same information as the word REPLICATES but the subsequent analysis differs (see section 1.8) according to which word was used. |
NAMES | Two variable names | This word introduces the names of the independent and dependent variables. If this heading is not used the names XVALUE and YVALUE are used respectively. |
XTRANS | One number | Optional word used to introduce the number of the transformation to be applied to the independent variable. (See section 3.3.1 for transformation list). No transformation is assumed if this heading is omitted. |
YTRANS | One number | Optional word used to introduce the number of the transformation to be applied to the dependent variable. (See section 3.3.1 for transformation list). No transformation is assumed if this heading is omitted. |
COMPACT | No items following | The presence of this word selects the compact form of data presentation. |
CHECK | Alternate numbers and words | A number specifies the position on the data cards of an item to be checked, the following word specifies the check to be made. |
IGNORE | A list of numbers | Each number specifies the position on the data cards of an item to be ignored. See section 1.3. |
PROBABILITY | One number | Optional word introducing the significance level to be taken in tests of significance. The value 0.05 is assumed if this heading is not used. |
LIMITS | One number | Optional word introducing the confidence probability to be used in the computation of confidence limits. The value 0.95 is assumed if this heading is not used. This word anticipates a feature to be included later. In its present form the program can only supply 95% confidence limits. |
DVALUE or D | No items following | The presence of this word selects additional print out of the D-value and its confidence limits. The D-value is the reciprocal of the slope. |
GRADUATE | No items following | The presence of this word selects the additional print out of a graduation of each of the fitted lines at the original data points together with residuals, standard errors and confidence limits. |
RESIDUALS | No items following | The presence of this word selects the additional print out of a graduation of the fitted combined line at the data points for each group. |
The following words may be used in any position in the specification section to make the specification more like English. They have no information content for the program and may be used more than once if required.
VALUE DATA AND
The words in the list above may be punched in any columns and in any order on any number of cards. The information introduced by a word must, however, be punched to follow that word on the same card. That is, for example the variable names introduced by the word NAMES must follow the word NAMES on the same card.
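The rules for POINTS and REPLICATES (one number applies to every group, a list supplies one number per group, and REPLICATES may also be UNEVEN or absent) can be sketched in Python. The function name `per_group`, the use of `None` to stand for uneven replication, and the `default` argument are assumptions made for the example.

```python
def per_group(values, n_groups, default=None):
    """Expand the items following POINTS or REPLICATES to one value per group.

    values is the list of items punched after the introductory word
    (empty if the word was not presented). Returns None for UNEVEN.
    """
    if not values:
        if default is None:
            raise ValueError("this parameter must be presented")
        return [default] * n_groups
    if values == ["UNEVEN"]:
        return None  # replicate counts are taken from the data cards instead
    if len(values) == 1:
        return [int(values[0])] * n_groups
    if len(values) != n_groups:
        raise ValueError("expected one number for each group")
    return [int(v) for v in values]
```

For instance, `per_group(["4", "3"], 2)` gives 4 points in the first group and 3 in the second, and `per_group([], 2, default=1)` represents the case in which REPLICATES is omitted.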
The words XTRANS and YTRANS introduce the number of the transformation to be applied to the independent and dependent variables respectively. There are eight transformations, apart from no transformation, included in subroutine TRAN2 which performs these operations. These are selected by the numbers 0 to 8 as follows:
0 | No transformation | |
1 | Log10(x) | Log |
2 | x½ | Square Root |
3 | ARCSIN(x½) | Angular transformations for proportions |
4 | ARCSIN((x/100)½) | Angular transformations for percentages |
5 | 1/x | Reciprocal |
6 | x⅓ | Cube root |
7 | Sin(x) | Sine |
8 | e^x | Exponential |
Additions are easily made to subroutine TRAN2 to allow further transformations selected by the numbers 9, 10 etc. Comments are produced on the output that a variable has been transformed and these are produced by a subroutine OUTRAN so that if additions are made to subroutine TRAN2 corresponding additions must be made to subroutine OUTRAN. There are tests in both these subroutines that the transformation number does not exceed 8. These tests must be updated if further transformations are added.
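The transformation list can be mirrored in a short Python sketch. This is not subroutine TRAN2 itself; the table `TRANSFORMS` and the function `transform` are illustrative names, and the angular transformations are assumed here to return radians, which the text does not state.

```python
import math

# Transformation number -> function, following the list above.
TRANSFORMS = {
    0: lambda x: x,                                    # no transformation
    1: math.log10,                                     # log to base 10
    2: math.sqrt,                                      # square root
    3: lambda x: math.asin(math.sqrt(x)),              # angular, proportions
    4: lambda x: math.asin(math.sqrt(x / 100.0)),      # angular, percentages
    5: lambda x: 1.0 / x,                              # reciprocal
    6: lambda x: math.copysign(abs(x) ** (1.0 / 3.0), x),  # cube root
    7: math.sin,                                       # sine
    8: math.exp,                                       # exponential
}

def transform(number, values):
    """Apply transformation `number` to every value in the list."""
    if number not in TRANSFORMS:
        # Counterpart of the test that the number does not exceed 8.
        raise ValueError("unknown transformation number: %r" % number)
    return [TRANSFORMS[number](v) for v in values]
```

Adding a transformation numbered 9 would mean adding one entry to the table, which corresponds to the paired changes to TRAN2 and OUTRAN described above.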
If the word COMPACT is included in the specification section the compact form of data presentation is selected otherwise the normal form of data presentation is assumed.
The normal data section consists of several groups of data each introduced by a GROUP card consisting of the word GROUP followed by a number which will be used to identify that group. Each card normally contains the data for one value of the independent variable followed by all replicate (or repeat) values of the dependent variable. It is, however, acceptable to punch on to one physical card the observations for more than one point so that each of the following arrangements is acceptable.
Arrangement 1 X11 Y111 Y112 Y113 Y114 X12 Y121 Y122 Y123 Y124 X13 Y131 Y132 Y133 Y134 X14 Y141 Y142 Y143 Y144 Arrangement 2 X11 Y111 Y112 Y113 Y114 X12 Y121 Y122 Y123 Y124 X13 Y131 Y132 Y133 Y134 X14 Y141 Y142 Y143 Y144 Arrangement 3 X11 Y111 Y112 Y113 Y114 X12 Y121 Y122 Y123 Y124 X13 Y131 Y132 Y133 Y134 X14 Y141 Y142 Y143 Y144
Arrangement 3 above is acceptable but is not advised if the CHECK and IGNORE facilities are to be used. CHECK and IGNORE positions apply to positions on the physical card and position 10 for example is defined on card 1 but not on card 2. The statement IGNORE 10 causes the tenth item to be ignored if there is a tenth item so that if an item near the end of the card is to be ignored care must be exercised to ensure that it is in the same position (tenth) on each card. Checks to be made on the data apply to one block at a time. The statement CHECK 4 ASCENDING specifies that the fourth item on each card within one group is expected to form an ascending sequence and not that the fourth item on each card for all blocks is expected to form an ascending sequence. IGNORE statements refer to all groups of data.
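The way IGNORE and CHECK positions refer to items on the physical card can be sketched in Python. The function names are invented, each card is represented as a list of items, and ASCENDING is assumed to mean strictly increasing; the exact definition belongs to section 1.3 and is not reproduced here.

```python
def apply_ignore(items, ignore_positions):
    """Drop the items at the given 1-based card positions, where they exist."""
    return [item for position, item in enumerate(items, start=1)
            if position not in ignore_positions]

def check_ascending(cards, position):
    """Return the 0-based indices of cards, within one group, whose item at
    the 1-based position fails to continue an ascending sequence."""
    failures = []
    previous = None
    for index, items in enumerate(cards):
        if position > len(items):
            continue  # the position is not defined on this card
        value = float(items[position - 1])
        if previous is not None and value <= previous:
            failures.append(index)
        previous = value
    return failures
```

A card with fewer than ten items is left untouched by `apply_ignore(items, {10})`, which is why an item to be ignored must occupy the same position on every card.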
The uneven replication is specified by the statement REPLICATES UNEVEN included in the specification section. The presentation of unevenly replicated points is similar to the presentation in the equal case but data for only one point may be included on one physical card because the number of replicates for each point is computed from the number of items on each card. Each card thus consists of the value of the independent variable followed by all the corresponding dependent variable values. The following example will illustrate this arrangement.
Arrangement 1
X11 Y111 Y112 Y113 Y114
X12 Y121
X13 Y131 Y132 Y133
X14 Y141 Y142
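A minimal Python sketch of how the number of replicates can be taken from the number of items on each card is given here. The function name `read_uneven_group` and the representation of a card as a string of blank-separated numbers are assumptions for illustration.

```python
def read_uneven_group(cards):
    """Read one group of unevenly replicated points, one point per card.

    Returns a list of (x, [y values]); the replicate count for a point
    is simply the number of items following x on its card.
    """
    points = []
    for card in cards:
        items = card.split()
        if len(items) < 2:
            raise ValueError("a card needs an x value and at least one y value")
        x = float(items[0])
        ys = [float(item) for item in items[1:]]
        points.append((x, ys))
    return points
```

Four cards with 5, 2, 4 and 3 items, as in the arrangement above, would give points with 4, 1, 3 and 2 replicates respectively.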
The CHECK and IGNORE facilities are available for use with this form of data presentation and CHECK and IGNORE positions apply to positions on the physical card. Position 4 for example is defined on card 1 but not on card 2. The CHECK and IGNORE statements cause items to be checked and ignored if they exist on the card so that if an item near the end of the card is to be ignored care must be exercised to ensure that the item is in the same position (e.g. 4th) on each card. Items punched after the data in this form of presentation cannot be ignored since they would occupy different positions on each card. CHECK statements apply to one block of data at a time.
The compact form of presentation is available when there is no replication of the dependent variable and when the values of the independent variable are the same in all groups or when the sets of values of the independent variable differ in only one or two points from group to group. In the compact form there is only one block of cards presented and all groups are included in this block. The group card normally introducing a group of data may be omitted if only one group is to be presented. Each card contains firstly the value of the independent variable followed by the values of the dependent variable one from each group. If no measurement was made of the dependent variable in one of the groups the letter M may be punched on the card in the position that the data value would have occupied. The letter M is taken to denote a missing value. The data presentation is then:
X1 Y111 Y211 Y311
X2 Y121 Y221 Y321
X3 Y131 Y231 Y331
X4 Y141 Y241 Y341
X5 Y151 Y251 Y351
This presentation looks to be the same as the even version of the normal presentation. It is worth pausing here to consider the difference between the two presentations. The first column in both presentations contains the list of values of the independent variable. The remaining columns in the normal presentation contain the replicated values for one group whereas in the compact presentation they contain the single values for each of the several groups.
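The difference can be made concrete with a small Python sketch that reads a compact block, taking each column after the first as a separate group and skipping items punched as M. The name `read_compact` and the card representation are assumptions for the example.

```python
def read_compact(cards, n_groups):
    """Split a compact block into one list of (x, y) pairs per group."""
    groups = [[] for _ in range(n_groups)]
    for card in cards:
        items = card.split()
        if len(items) != n_groups + 1:
            raise ValueError("expected x followed by one item per group")
        x = float(items[0])
        for g, item in enumerate(items[1:]):
            if item.upper() == "M":
                continue  # missing value: this group has no point at x
            groups[g].append((x, float(item)))
    return groups
```

Read in the normal presentation, the same columns would instead be treated as replicates of a single group.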
A straight line is fitted, by the method of least squares, to the data for each group. An analysis of variance on the adequacy of each line is performed and its form depends on the type of data. If the data is replicated the analysis of variance table contains three sources of variation namely:
If source 1 is significantly greater than source 3 the slope is significantly greater than zero. If source 2 is significantly greater than source 3 the straight line is regarded as being an inadequate description of the data.
If the data is not replicated the third source of variation listed above is not available. The second source becomes the residual and we lose the test of adequacy of fit.
A third situation occurs when the data is repeated rather than replicated. A discussion of the difference between repeats and replicates is given in section 1.8. The program recognises this third situation if the word REPEATS is used instead of the word REPLICATES (or REPS) in the specification section. The action taken by the program consists of deleting the third source of variation from the analysis of variance so that this becomes the same as if only one value of the dependent variable had been collected for each value of the independent variable.
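For the replicated case, a three-way split of the sum of squares for a single group can be sketched in Python. The labels used for the three sources are the conventional ones (regression, deviations from the line, between replicates) and are an assumption here, as are the function name and the requirement that each x value appears once; the handling of repeated data is not attempted.

```python
def line_anova(points):
    """Fit a least-squares line to one group and split the sum of squares.

    points is a list of (x, [y replicates]) with distinct x values.
    Returns the slope and a dict: source -> (sum of squares, degrees of freedom).
    """
    xs, ys = [], []
    for x, reps in points:
        for y in reps:
            xs.append(x)
            ys.append(y)
    n, k = len(ys), len(points)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    syy = sum((y - y_mean) ** 2 for y in ys)
    ss_regression = sxy * sxy / sxx
    # Variation of replicates about their own point means.
    ss_replicates = sum(sum((y - sum(reps) / len(reps)) ** 2 for y in reps)
                        for _, reps in points)
    ss_deviations = syy - ss_regression - ss_replicates
    table = {
        "regression": (ss_regression, 1),
        "deviations from line": (ss_deviations, k - 2),
        "between replicates": (ss_replicates, n - k),
    }
    return sxy / sxx, table
```

When there is one observation per point the third source has zero degrees of freedom and the second source serves as the residual, as described above.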
After all groups have been analysed a combined analysis is performed in which the sources of variability are:
The same comments made about replicated, not replicated, and repeated data apply to the combined analysis of variance. The fifth source of variability is deleted if the measurements are repeated so that this source is only present if the measurements are replicated, the fourth source must be used as residual if the fifth source is absent. The following table gives the interpretation to be made of significant sources of variation:
Source | Interpretation |
---|---|
1 | Slope of combined line is not zero. |
2 | Slopes of individual group lines are different. |
3 | Means of groups are different. |
4 | The straight lines are not adequate descriptions of the data. |
The output for each group of data contains:
The output for the combined analysis contains:
It is believed that the output is sufficiently clearly labelled for a detailed description of the output to be unnecessary.
This program performs the usual fixed-effect model analysis of variance of a complete factorial experiment. Several variables may be presented and equation cards may be used to redefine variables or to define new variables. Any number of analyses of subsets of the data, such as an analysis omitting certain factor levels or all but one level of a factor, may be selected. All analyses are univariate even though several variables may be presented. Several variables are allowed so that new variables may be defined as functions of more than one measured variable.
The specification section consists of cards containing the values of parameters necessary to the input of the data such as the number of replicates, the number of variables and the variable names. Each parameter, or set of parameters, is introduced by a word, for example REPLICATES 2 or VARIABLES 3. These words may appear in any order and may be punched in any columns on the card in this section with the only restriction that the value of the parameter, or set of parameters, must immediately follow the introductory word on the same card. The CHECK and IGNORE facilities described in section 1.3 are available in this program and equation cards consistent with the rules given in section 1.7 may be included.
The data section may be prepared in two different arrangements. In the first arrangement the data for all variables is presented together as one block of data. In the second arrangement the data for each variable is presented separately as distinct blocks of data. In both arrangements the data is assumed to be presented in a standard order unless it is specified in the specification section that the factor levels are punched on the data cards with the observations.
The instruction section consists of any number of instructions requesting analyses to be performed on the data or on parts of the data. Words are used to request analyses so that the instructions are given in a language which is very close to normal English. Equation cards are also allowed in the instruction section.
All cards are punched according to the card preparation rules given in section 1.6. Equation cards conform to the additional rules given in section 1.7.
The following example presentation will form the starting point for the detailed description of the presentation rules.
CUSTOMER B. E. COOPER ATLAS COMPUTER LABORATORY
TITLE EXAMPLE 1 COMPACT PRESENTATION IN STANDARD ORDER
DIMENSIONS 6 DOSES 4 CHEMICALS 2 EXPERIMENTS
VARIABLES 3 NAMES A B C
CHECK 4 ASCENDING IGNORE 4
38.2 29.31 147.1 1 |
34.2 28.39 158.9 2 | 48 Data Cards presented in
39.8 21.37 172.1 3 | standard order
...................... |
29.2 21.37 138.2 48 |
D=A/B
ANALYSE EACH VARIABLE EXCEPT B OMITTING CHEMICAL 3
ANALYSE A AND B FOR EACH EXPERIMENT
ANALYSE A FOR CHEMICALS 1 AND 2 AND DOSES 3 AND 4
TITLE EXAMPLE 2 PRESENTATION BY VARIABLES IN RANDOM ORDER
DIMENSIONS 3 DOSES 2 CHEMICALS
REPLICATES 4 VARIABLES 2 NAMES A AND B
INTEGERS 1 AND 2
DATA A
1 1 28.3 28.7 29.4 29.1 | Data for variable A
2 1 26.8 27.3 27.3 28.1 | Factor levels specified but
3 1 24.8 26.1 25.2 25.3 | standard order actually presented.
1 2 30.1 30.8 31.1 29.6 |
2 2 27.4 28.8 28.2 28.7 |
3 2 25.3 27.1 26.2 26.3 |
DATA B
1 2 30.7 31.2 29.8 30.4 |
3 2 25.8 26.1 26.2 27.9 | Data for variable B
1 1 28.8 28.4 29.3 29.0 | Factor levels specified and
3 1 26.0 24.3 25.0 25.5 | a non-standard order
2 2 28.1 27.2 28.3 28.9 | presented
2 1 26.8 28.3 27.5 27.8 |
ANALYSE EACH VARIABLE
FINISH
The parameters that are specified in this section are introduced by words as described in the following list. The use of some of these words is optional and their omission implies a standard value(s) for the parameter(s).
Introductory Word | Information following word | Description of information introduced |
---|---|---|
DIMENSIONS | Alternate integers and words | There must be the same number of integers as words. The number of integers gives the number of factors. The integers give the number of factor levels and the words the factor names. For example the DIMENSIONS statement in example 1 above specifies that there are three factors DOSES, CHEMICALS and EXPERIMENTS and that these have 6, 4 and 2 levels respectively. |
REPLICATES REPS | One integer | Either of these words introduces the number of replicates. If neither of these words is used the standard value of one replicate is assumed. See REPEATS below and section 1.8. |
REPEATS | One integer | This word introduces the number of repeats and is used if appropriate instead of the word REPLICATES. See REPLICATES above and section 1.8. |
NAMES | A list of words | This word introduces the list of variable names. The number of names is taken as the number of variables. An alternative method of naming the variables is described in section 4.4.3. |
VARIABLES | One integer | This word introduces the number of variables and is used if the NAMES statement is not used. See section 4.4.3. |
CHECK | Alternate integers and words | Each integer specifies the position on the data cards of an item to be checked, the following word specifies the check to be made. See section 1.3. |
IGNORE | A list of integers | Each integer specifies the position on the data cards of an item to be ignored. See section 1.3. |
INTEGERS | A list of numbers | The use of this word specifies that the levels of each factor are recorded on each data card and the ith integer in the list of integers specifies the position on the card of the level for the ith factor. There must be, therefore, one integer for each factor. For example, the INTEGERS statement in example 2 above specifies that the levels of the two factors are recorded as the first and second items respectively on each data card. Omission of this word implies that the data is presented in standard order as in example 1. |
COMPACT | No item | The presence of this word selects the compact form of presentation (sections 4.4.1 and 4.4.2) in which all variables are prepared together in the same block of data as in example 1. The absence of this word selects the non-compact form of presentation (sections 4.4.3 and 4.4.4) in which each variable is presented as a separate block of data. |
The following words may be used in any position in the specification section to make the specification more like English. They have no information content for the program and may be used more than once if required.
DATA PRESENTATION AND
The introductory words in the above list may be punched in any columns and in any order on any number of cards. The information introduced by a word must, however, be punched to follow that word on the same card. That is, for example the variable names introduced by the word NAMES must follow the word NAMES on the same card.
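The rule that each introductory word collects the items punched after it, with the filler words skipped, can be pictured with a short sketch. The Python below is an illustration only and not the program's own input routine; the function name `parse_specification_card` and the two word sets are assumptions assembled from the lists above, and the free-text TITLE card is deliberately not handled.

```python
# Hypothetical word sets taken from the specification table and filler list.
INTRODUCTORY = {"DIMENSIONS", "REPLICATES", "REPS", "REPEATS", "NAMES",
                "VARIABLES", "CHECK", "IGNORE", "INTEGERS", "COMPACT"}
FILLERS = {"DATA", "PRESENTATION", "AND"}

def parse_specification_card(card):
    """Group the items on one card under the introductory word they follow."""
    spec, current = {}, None
    for token in card.split():
        if token in FILLERS:
            continue                      # filler words carry no information
        if token in INTRODUCTORY:
            current = token               # start collecting for this word
            spec[current] = []
        elif current is None:
            raise ValueError(f"item {token!r} has no introductory word")
        else:
            spec[current].append(token)
    return spec

print(parse_specification_card("DIMENSIONS 6 DOSES 4 CHEMICALS 2 EXPERIMENTS"))
print(parse_specification_card("CHECK 1 ASCENDING IGNORE 1 AND 2"))
```

Because the items for a word must follow it on the same card, the sketch treats each card independently.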
Equation cards prepared according to the rules given in section 1.7 may appear either before or after, or even mixed with the specification cards. No additional information may appear on an equation card. That is parameters such as VARIABLES 3 must not be punched on the end of an equation card.
Two arrangements of the data are allowed. The presence of the word COMPACT in the specification section selects the compact presentation in which all variables are presented together as one block of data. The absence of the word COMPACT selects the non-compact presentation in which each variable is presented as a separate block of data. If items expected by the program to be on one card cannot be accommodated on one card the continuation character $ may be used to continue items on to the next card (or cards). The character $ is punched at the end of a card to be continued (see section 1.9).
Each card normally contains all observations for all variables for one combination of factor levels. The replicate (or repeat) observations for the first variable are punched first, followed by the observations for the second variable, followed by those for the third variable, and so on. It is possible to include observations for all variables for more than one combination of factor levels and this arrangement is described in the next section. The separate factor level combinations (separate cards) may be presented in random order if integers specifying the factor levels are included on the cards. The INTEGERS statement is used in the specification section to declare the positions on each data card occupied by these factor levels. If the factor levels are not specified on each card the factor level combinations (cards) must be presented in standard order. In standard order the most rapidly changing factor is the first factor. That is, the first factor passes through its levels first, followed by the second, and so on. Standard order for case 1 in the example presentation given above is as follows:
Card | Doses | Chemicals | Experiments |
---|---|---|---|
1 | 1 | 1 | 1 |
2 | 2 | 1 | 1 |
3 | 3 | 1 | 1 |
4 | 4 | 1 | 1 |
5 | 5 | 1 | 1 |
6 | 6 | 1 | 1 |
7 | 1 | 2 | 1 |
8 | 2 | 2 | 1 |
9 | 3 | 2 | 1 |
10 | 4 | 2 | 1 |
11 | 5 | 2 | 1 |
12 | 6 | 2 | 1 |
13 | 1 | 3 | 1 |
... | ... | ... | ... |
24 | 6 | 4 | 1 |
25 | 1 | 1 | 2 |
... | ... | ... | ... |
36 | 6 | 2 | 2 |
37 | 1 | 3 | 2 |
38 | 2 | 3 | 2 |
39 | 3 | 3 | 2 |
40 | 4 | 3 | 2 |
41 | 5 | 3 | 2 |
42 | 6 | 3 | 2 |
43 | 1 | 4 | 2 |
44 | 2 | 4 | 2 |
45 | 3 | 4 | 2 |
46 | 4 | 4 | 2 |
47 | 5 | 4 | 2 |
48 | 6 | 4 | 2 |
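The standard order tabulated above can be generated mechanically. The following Python sketch is an illustration, not part of the program; the function name `standard_order` is an assumption. It lists the combinations with the first factor changing most rapidly.

```python
from itertools import product

def standard_order(levels):
    """Yield factor level combinations with the first factor varying fastest."""
    # itertools.product varies its LAST argument fastest, so the factors are
    # reversed on the way in and each tuple is reversed on the way out.
    ranges = [range(1, n + 1) for n in reversed(levels)]
    for combo in product(*ranges):
        yield tuple(reversed(combo))

# DIMENSIONS 6 DOSES 4 CHEMICALS 2 EXPERIMENTS
cards = list(standard_order([6, 4, 2]))
for number in (1, 7, 25, 48):
    print(number, cards[number - 1])
```

Card 7 is the first card on which the second factor (CHEMICALS) moves to its second level.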
The CHECK and IGNORE facility as described in section 1.3 is available for this section. It is not necessary to include the factor level integer positions in an IGNORE statement. The statement INTEGERS 4 5 6 implies also IGNORE 4 5 6.
It is possible within the COMPACT data section to include observations for all variables for more than one factor level combination on one card. The combinations included on one card must be consecutive in the standard order and be punched in the correct order. The cards may be presented in random order if the factor levels for the first combination on the card are included; otherwise standard order must be used. If the factor levels for the second and subsequent combinations presented on one card are punched, an IGNORE statement must be given to ignore these integers. A possible presentation for the first example given above and including the factor levels would be:

    TITLE EXAMPLE 1 MORE COMPACT PRESENTATION WITH FACTOR LEVELS
    DIMENSIONS 6 DOSES 4 CHEMICALS 2 EXPERIMENTS
    VARIABLES 3 NAMES A B C
    INTEGERS 1 2 3 IGNORE 10
    1 1 1 38.2 29.31 147.1 34.2 28.39 15a.9 1
    3 1 1 39.8 28.27 172.1 41.2 27.88 151.3 3
    5 1 1 40.8 29.37 151.2 37.2 20.97 146.2 5
    1 2 1 41.2 28.49 162.3 30.4 29.13 167.1 7
    - - - - - - - - - - - - - - - - - - - - - - -
    5 4 2 37.3 27.14 163.9 39.2 26.37 158.2 47
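To make the rule about consecutive combinations concrete, the sketch below expands one such card into its separate combinations. It is a hedged illustration in Python, not the program's method: the names `next_combination` and `expand_card` are hypothetical, the factor levels are assumed to occupy the first positions on the card (as with INTEGERS 1 2 3), and ignored positions are passed as 1-based item numbers (as with IGNORE 10).

```python
def next_combination(combo, levels):
    """Return the combination following `combo` in standard order."""
    combo = list(combo)
    for i, n in enumerate(levels):
        if combo[i] < n:
            combo[i] += 1
            return tuple(combo)
        combo[i] = 1                      # carry into the next factor
    raise ValueError("no combination follows the last one")

def expand_card(items, levels, n_vars, ignore=()):
    """Split one card into {combination: [one value per variable]}."""
    n_factors = len(levels)
    combo = tuple(int(x) for x in items[:n_factors])
    values = [float(x) for pos, x in enumerate(items, start=1)
              if pos > n_factors and pos not in ignore]
    out = {}
    for start in range(0, len(values), n_vars):
        out[combo] = values[start:start + n_vars]
        if start + n_vars < len(values):
            combo = next_combination(combo, levels)
    return out

card = "3 1 1 39.8 28.27 172.1 41.2 27.88 151.3 3".split()
print(expand_card(card, levels=[6, 4, 2], n_vars=3, ignore=(10,)))
```

Only the first combination on the card is read from the punched integers; the second is inferred from the standard order.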
Although the factor levels have been included in the above example the cards themselves have been placed in standard order.
The CHECK and IGNORE facility as described in section 1.3 is available for this section. As in the first arrangement it is not necessary to include the factor level integer positions for the first combination on each card in an IGNORE statement.
Each variable is presented as a separate block of data and is preceded by a card containing the word DATA followed by the variable name. The NAMES statement may be omitted from the specification if the data is presented in the non-compact form. The data is prepared with all replicate (or repeat) observations for one factor level combination on one card. It is possible to include the data for more than one factor level combination on one card as described in the next section. The separate factor level combinations (separate cards) may be presented in random order if integers specifying the factor levels are included on the cards. The INTEGERS statement is used in the specification section to declare the positions on each data card occupied by these factor levels. If the factor levels are not specified on each card the factor level combinations (cards) must be presented in standard order. Standard order is described with an example in section 4.4.1.
The CHECK and IGNORE facility as described in section 1.3 is available for this section. It is not necessary to include the factor level integer position in an IGNORE statement. The statement INTEGERS 4 5 6 implies also IGNORE 4 5 6.
It is possible within the non-compact data section to include observations for more than one factor level combination on one card. The combinations included on one card must be consecutive in the standard order and be punched in the correct order. The cards may be presented in random order if the factor levels for the first combination on the card are included; otherwise standard order must be used. If the factor levels for the second and subsequent combinations presented on one card are punched, an IGNORE statement must be given to ignore these integers. A possible presentation for the second example given above, without factor levels, would be:

    TITLE EXAMPLE 2 PRESENTATION BY VARIABLES IN STANDARD ORDER
    DIMENSIONS 3 DOSES 2 CHEMICALS REPLICATES 4
    VARIABLES 2
    CHECK 1 ASCENDING IGNORE 1 AND 2
    DATA A
    1 A 28.3 28.7 29.4 29.1 26.8 27.3 27.3 28.1 24.8 26.1 25.2 25.3
    2 A 30.1 30.8 31.1 29.6 27.4 28.8 28.2 28.7 25.3 27.1 26.2 26.3
    DATA B
    1 B 28.8 28.4 29.3 29.0 26.8 28.3 27.5 27.8 26.0 24.3 25.0 25.5
    2 B 30.7 31.2 29.0 30.4 20.1 27.2 20.3 28.9 25.8 26.1 26.2 26.9
The CHECK and IGNORE facility as described in section 1.3 is available for this section. As in the first arrangement it is not necessary to include the factor level integer positions for the first combination on each card in an IGNORE statement.
Each instruction is punched on a separate card but if one card is not sufficient to contain an instruction it may be continued onto a second (or more) card by the use of the continuation character $. The character $ is punched at the end of the card (or cards) to be continued. Instructions may be made up of introductory words, listed below, and the information that these words introduce. Examples of the use of these words in instructions are given after this list.
Introductory Word | Information following word | Description of information introduced |
---|---|---|
ANALYSE | 1) A list of variable names, or 2) The words EACH VARIABLE. | This word introduces either a list of names of variables to be analysed, or the words EACH VARIABLE, which imply that all variables are to be analysed. |
EXCEPT | A list of variable names. | This word is used with the statement ANALYSE EACH VARIABLE to introduce a list of names of variables that are not to be analysed. |
FOR | 1) A list of factor names and levels. | This word introduces the factor levels for which separate analyses are required. The levels are introduced by the factor name and more than one factor name followed by its levels may be listed. If a factor is not listed in a FOR statement all levels are included. The statement FOR CHEMICALS 1 2 DOSES 3 4 given in example 1 in section 4.2 is an example of this type of FOR statement. See section 4.5.2. |
FOR | 2) The word EACH followed by factor names. | This word together with the word EACH introduces factors for which separate analyses for each combination of levels are required. For example if two factors, having 6 and 4 levels respectively, are listed 24 separate analyses including all other factors are produced. The statement FOR EACH EXPERIMENT given in example 1 in section 4.2 is an example of this type of FOR statement. See section 4.5.2. |
FOR | 3) Mixture of the two lists described above | This word may also introduce a mixture of types 1 and 2 above. For example, the statement FOR EACH EXPERIMENT AND DOSES 1 2 3 is legal. See section 4.5.2. |
OMITTING or OMIT | A list of factor names and levels. | This word introduces the factor levels to be omitted from the analysis or analyses. This statement may be used with any of the FOR statements described above but the joint use must be carefully considered. See section 4.5.2. |
PARTITIONS (Available later) | Alternate factor names and integers. | This word introduces the extent of polynomial partitioning required. The factors listed are those for which partitioning is required and the integers following each factor name specify the highest order partition required for that factor. Factors not listed in a PARTITIONS statement are not partitioned. This facility is not included in the first version of the program but will be included later. See section 4.5.3. |
The following words may be used in instructions in any position to make the instructions more like English. They have no information content for the program and may be used more than once if required.
POLYNOMIAL VARIABLE FACTOR LEVEL LEVELS AND AT
Analyses of complete variables are specified by using the ANALYSE statement, and the EXCEPT statement if required. Example instructions are:

    ANALYSE EACH VARIABLE
    ANALYSE VARIABLES A B AND C
    ANALYSE EACH VARIABLE EXCEPT C
Each of these statements may be qualified as described in the next two sections to specify analyses on subsets of the data for each variable.
The FOR statement described above may be used to specify separate analyses for various combinations of factor levels. If data for a five factor experiment is presented, for example, the FOR statement may be used to specify:
It is possible to produce analyses for each combination of each level of one factor with selected levels of a second factor. This is the third type of FOR statement described above.
The OMIT (or OMITTING) statement is used to delete particular factor levels from analyses. An OMIT statement used without a FOR statement will specify analyses involving the same number of factors as the original data unless all levels except one of a factor, or factors, are listed. The omission of all levels of a factor is clearly nonsense. The OMIT and FOR statements may be used together but factors listed in the FOR statement must not be included in the OMIT statement also. That is, the statement ANALYSE A FOR EACH SEX AND CHEMICAL OMITTING CHEMICAL 3 is not legal. If the factor CHEMICAL has four levels the correct instruction is ANALYSE A FOR EACH SEX AND CHEMICALS 1, 2 AND 4.

We may summarise the function of the first four words by saying that the variables to be analysed are specified by using the statements ANALYSE and EXCEPT, factor levels to be omitted from analyses are specified by using the statement OMIT (or OMITTING), and factor level combinations for which separate analyses are required are specified by using the statement FOR.
Examples of instructions using the first four words are:
    ANALYSE EACH VARIABLE EXCEPT C OMITTING DOSES 3 AND 4
    ANALYSE A B AND C FOR EACH SEX AND CHEMICAL OMIT DOSES 2 3
    ANALYSE EACH VARIABLE FOR EACH SEX AND CHEMICALS 1 2 3 AND 4
    ANALYSE A FOR SEX 1 AND EXPERIMENTS 1 2 AND 3 OMIT CHEMICAL 1
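The way FOR EACH and OMIT together determine a set of separate analyses can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the program's logic: the function name `plan_analyses` and its dictionary representation are hypothetical, and only the FOR EACH form and the OMIT form are modelled. It enforces the rule that a factor may not appear in both the FOR and the OMIT statement.

```python
from itertools import product

def plan_analyses(dims, for_each=(), omit=None):
    """Return one {factor: levels included} mapping per separate analysis.

    dims     -- {factor name: number of levels}
    for_each -- factors named after FOR EACH
    omit     -- {factor name: levels to leave out}
    """
    omit = omit or {}
    clash = set(for_each) & set(omit)
    if clash:
        raise ValueError(f"factors in both FOR and OMIT: {sorted(clash)}")
    kept = {f: [lv for lv in range(1, n + 1) if lv not in omit.get(f, ())]
            for f, n in dims.items()}
    plans = []
    # One analysis per combination of levels of the FOR EACH factors.
    for fixed in product(*(kept[f] for f in for_each)):
        plan = dict(kept)
        for factor, level in zip(for_each, fixed):
            plan[factor] = [level]
        plans.append(plan)
    return plans

dims = {"DOSES": 6, "CHEMICALS": 4, "EXPERIMENTS": 2}
print(len(plan_analyses(dims, for_each=["DOSES", "CHEMICALS"])))
print(plan_analyses(dims, omit={"DOSES": [3, 4]}))
```

With two FOR EACH factors of 6 and 4 levels the sketch yields the 24 separate analyses mentioned in the table of introductory words.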
It is hoped to include facilities for polynomial partitioning in the program in the near future. The facilities planned for inclusion in the next version of the program allow the specification of the factors to be partitioned and the maximum degree of partitioning for each factor. This information will be introduced by the word PARTITION followed by a list of alternate words and numbers; the words being the names of the factors to be partitioned, and the numbers being the maximum degree of partitioning. It will be assumed that the levels of factors to be partitioned are EQUALLY SPACED although it is hoped to relax this condition eventually. If, for example, factors CHEMICALS and DOSES have 6 and 4 levels respectively the statement PARTITIONS CHEMICALS 3 DOSES 1 will cause the linear, the quadratic, and the cubic partitions for CHEMICALS and the linear effect for DOSES to be computed. The remaining partitions for each factor will be included together in sums of squares labelled CHEMICALS REMAINDER and DOSES REMAINDER. The interactions between partitions are computed as well as the partitions themselves so that the sums of squares (as far as CHEMICALS and DOSES are concerned) produced by the above statement would be:

SOURCE | D.F. |
---|---|
CHEMICALS LINEAR | 1 |
CHEMICALS QUADRATIC | 1 |
CHEMICALS CUBIC | 1 |
CHEMICALS REMAINDER | 2 |
DOSES LINEAR | 1 |
DOSES REMAINDER | 2 |
CHEMICALS LINEAR x DOSES LINEAR | 1 |
CHEMICALS QUADRATIC x DOSES LINEAR | 1 |
CHEMICALS CUBIC x DOSES LINEAR | 1 |
CHEMICALS REMAINDER x DOSES LINEAR | 2 |
CHEMICALS LINEAR x DOSES REMAINDER | 2 |
CHEMICALS QUADRATIC x DOSES REMAINDER | 2 |
CHEMICALS CUBIC x DOSES REMAINDER | 2 |
CHEMICALS REMAINDER x DOSES REMAINDER | 4 |
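The degrees of freedom listed above follow from two rules: each named polynomial partition carries one degree of freedom and the remainder takes whatever is left of the factor's (levels minus one), while an interaction carries the product of the degrees of freedom of its two components. The Python sketch below illustrates that arithmetic only; the function names `partition` and `anova_lines` are assumptions, and degrees above cubic are not handled.

```python
DEGREE_NAMES = ["LINEAR", "QUADRATIC", "CUBIC"]

def partition(factor, n_levels, max_degree):
    """Split a factor's (n_levels - 1) d.f. into named partitions plus remainder."""
    if not 0 <= max_degree <= min(len(DEGREE_NAMES), n_levels - 1):
        raise ValueError("unsupported degree of partitioning")
    parts = [(f"{factor} {DEGREE_NAMES[d]}", 1) for d in range(max_degree)]
    remainder = n_levels - 1 - max_degree
    if remainder:
        parts.append((f"{factor} REMAINDER", remainder))
    return parts

def anova_lines(a, b):
    """Main-effect partitions of a and b, then every a x b interaction."""
    lines = a + b
    # Interaction d.f. is the product of the component d.f.
    lines += [(f"{na} x {nb}", da * db) for nb, db in b for na, da in a]
    return lines

chemicals = partition("CHEMICALS", 6, 3)   # PARTITIONS CHEMICALS 3
doses = partition("DOSES", 4, 1)           # DOSES 1
table = anova_lines(chemicals, doses)
for source, df in table:
    print(f"{source:40s}{df}")
```

The interaction lines sum to 15, which is the (6 - 1) x (4 - 1) degrees of freedom of the unpartitioned CHEMICALS x DOSES interaction.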
A PARTITION statement may be added to any of the statements described so far, for example:
    ANALYSE EACH VARIABLE EXCEPT C PARTITION DOSES 3
    ANALYSE A B AND C FOR EACH SEX AND CHEMICAL OMIT DOSES 2 3 PARTITION CHEMICALS 3
The use of the PARTITION statement with an OMIT statement needs careful consideration because of the assumption that the levels of factors to be partitioned are equally spaced. If levels are to be omitted from a factor for which partitioning is required the program makes the assumption that the remaining levels are equally spaced. Thus if the original levels are equally spaced, the levels remaining after an OMIT statement may no longer be equally spaced.
It is possible of course, to partition sums of squares for factors with unequally spaced levels and it is hoped that the polynomial partitioning facility will be extended in this program to include unequally spaced levels. The necessary information for this to be done is the actual factor levels and this could be given in the specification section as a statement of the form FACTOR LEVELS DOSES 1.0 2.5 5.0 10.0. Details of polynomial partitioning will be published as new facilities become available.
Each word used in instructions is truncated, if necessary, by the program to eight characters (six characters on the 7090) and factor names are correctly identified by the program if the first eight (six on the 7090) letters are correct. The English nature of the instructions may suggest in some contexts that the plural form of the factor name should be used whilst in other contexts the singular form may seem appropriate. It is important, however, for factors with names, in the singular form, of less than eight characters that the same form of the factor name is used in the specification section and in the instruction section. It is suggested that this form should be the plural form as naturally used in the DIMENSIONS statement in the specification section. This will lead to the unnatural English in instructions when one level is referred to or when the word EACH is used. The correct reference to one sex in the examples used above is either
    DIMENSIONS 2 SEXES - - - -

followed by

    ANALYSE SEXES 1 - - - -

or, alternatively,

    DIMENSIONS 2 SEX - - - -

followed by

    ANALYSE SEX 1

Factors with names, in the singular form, of more than eight letters are not affected and the plural and singular forms may be used interchangeably. The factor CHEMICALS used above is an example of such a factor.
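The effect of truncation on name matching can be checked with a few lines of code. The sketch below is illustrative Python, not the program's identification routine; the function name `same_factor` is an assumption, and the `width` argument stands for the eight characters (six on the 7090) described above.

```python
def same_factor(a, b, width=8):
    """Two names identify the same factor if they agree after truncation."""
    return a[:width] == b[:width]

print(same_factor("CHEMICALS", "CHEMICAL"))   # both truncate to CHEMICAL
print(same_factor("SEXES", "SEX"))            # too short to be truncated
print(same_factor("EXPERIMENTS", "EXPERIMENT", width=6))
```

Short names such as SEX are compared in full, which is why the singular and plural forms cannot be mixed for them.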
It is hoped that the identification process in the program will be extended to make correct identification of factor names irrespective of the form of the name used. The simple addition of an S (or possibly also ES) to the singular form can be allowed eventually without much difficulty. Other plural forms, of course, exist and are more difficult to allow. The extension of the identification process, when programmed, will be confined to the S and ES plural forms.
The output contains, initially, all main effect and interaction means for each variable. Each set of means is clearly labelled with the name of the variable and the name, or names, of the factors involved. Analyses of variance with F-ratios and probabilities are then output for each analysis selected in the instruction section. The sums of squares are labelled with the appropriate factor names and each analysis is identified by clear headings.
This program performs the analyses of diallel table data developed by Hayman (3) and Jinks (4). Several variables may be presented and equation cards may be used to redefine variables or to define fresh variables. The program is capable of fitting a straight line between any two variables and setting-up the residuals as a fresh variable which may be referred to in subsequent equation cards or analysed as a normal variable. Two forms of replication are allowed and these are referred to as normal replicates or genetic replicates. Normal replicates simply supply a residual sum of squares to the Hayman analysis of variance whereas genetic replicates interact with the other effects in the analysis.
The specification section consists of cards containing the values of parameters necessary to the input of the data such as the number of parents, the number of variables and the names of the variables. Each parameter, or set of parameters, is introduced by a word, for example, PARENTS 8, VARIABLES 3, and NAMES A B C. These words may appear in any order and may be punched in any columns on the cards in this section with the only restriction that the value of the parameter or set of parameters must immediately follow the introductory word on the same card. The CHECK and IGNORE facilities described in section 1.3 are available in this program.
Only one arrangement of the data is available in this program although other arrangements may be added if found necessary. The data is presented in blocks each consisting of the values for all variables for that block. The order of presentation within a block may be a standard order or if the parental number codes are included on the data cards the presentation order may be random.
The instruction section consists of any number of instructions requesting analyses to be performed on the data or on parts of the data. Words are used to request analyses so that the instructions are given in a language which is very close to normal English. Equation cards are also allowed in the instruction section. If the instruction section is omitted altogether Hayman and Jinks analyses are performed on each variable.
All cards are punched according to the card preparation rules given in section 1.6. Equation cards conform to the additional rules given in section 1.7.
The following example presentation will form the starting point for the detailed description of the presentation rules.
    CUSTOMER B. E. COOPER ATLAS COMPUTER LABORATORY
    TITLE EXAMPLE 1
    PARENTS 4 NORMAL REPLICATES 2 GENETIC REPLICATES 2 BLOCKS 3
    VARIABLES 2 NAMES CONTROL IRRAD
    INTEGERS 2 3 IGNORE 1 CHECK 1 IDENTICAL
    BLOCK 1
    1 1 1 C11111 C11112 C11121 C11122 I11111 I11112 I11121 I11122
    1 1 2 C11211 C11212 C11221 C11222 I11211 I11212 I11221 I11222
    - - - - - - - - - - - - (16 cards) - - - - - - - - - - - -
    1 4 4 C14411 C14412 C14421 C14422 I14411 I14412 I14421 I14422
    BLOCK 2
    2 3 2 C23211 C23212 C23221 C23222 I23211 I23212 I23221 I23222
    2 1 2 C21211 C21212 C21221 C21222 I21211 I21212 I21221 I21222
    - - - - - - - - - - - - (16 cards) - - - - - - - - - - - -
    2 4 3 C24311 C24312 C24321 C24322 I24311 I24312 I24321 I24322
    BLOCK 3
    3 1 1 C31111 C31112 C31121 C31122 I31111 I31112 I31121 I31122
    - - - - - - - - - - - - (16 cards) - - - - - - - - - - - -
    3 4 1 C34111 C34112 C34121 C34122 I34111 I34112 I34121 I34122
    REGRESSION OF IRRAD ON CONTROL NAME RESIDUALS IRRRES
    ANALYSE VARIABLES IRRRES AND IRRAD
    FINISH

Data values for the variable CONTROL are represented by C in the above example and values for the variable IRRAD are represented by I. The five subscripts used above to label a data value represent, in order, the block, the female parent, the male parent, the genetic replicate and the normal replicate.
The data layout is explained in greater detail in section 5.4.
The parameters that are specified in this section are introduced by words as described in the following list. The use of some of these words is optional and their omission implies a standard value(s) for the parameter(s).
Introductory Word | Information following word | Description of information introduced |
---|---|---|
PARENTS | One integer | This word introduces the number of parents. |
NORMAL | One integer | Optional word introducing the number of normal replicates. Standard value 1 is assumed if not used. |
GENETIC | One integer | Optional word introducing the number of genetic replicates. Standard value 1 assumed if not used. |
BLOCKS | One integer | Optional word introducing the number of blocks. Standard value 1 assumed if not used. |
VARIABLES | One integer | Optional word introducing the number of variables. Standard value 1 assumed if not used. |
NAMES | List of words | Optional word introducing the variable names. A blank name is assumed if not used. Use of this word is strongly advised if more than one variable is presented. |
HALF | No information following | The presence of this word specifies that the data for half diallels is to be presented. |
INTEGERS | Two integers | Optional word introducing the positions on the data cards occupied by the parent identification numbers. The first position is that of the female identification number. If this word is not used it is assumed that no identification integers are present and that all data is presented in standard order. This heading must be used for the introduction of half-diallel data. See section 5.4. |
CHECK | Alternate numbers and words | A number specifies the position on the data cards of an item to be checked, the following word specifies the check to be made. See section 1.3. |
IGNORE | A list of integers | Each integer specifies the position on the data cards of an item to be ignored. See section 1.3. |
HAYMAN | No information following | The presence of this word selects Hayman's analysis of the data. |
JINKS | No information following | The presence of this word selects Jinks' analysis of the data. The words HAYMAN and JINKS are used in the specification section when no instructions are given in the instruction section. The absence of both words implies that both analyses are to be performed. |
SELFS | One number for each variable | This word introduces the variance of observations from like parent matings. |
CROSSES | One number for each variable | This word introduces the variance of observations from matings between unlike parents. |
REPEATS | No information following | This heading is used with the heading NORMAL (e.g. NORMAL REPEATS 2 instead of NORMAL REPLICATES 2) if the several observations for each parental combination are repeats rather than replicates. A discussion of the difference between repeats and replicates is given in section 1.8. |
The following words may be used in any position in the specification section to make the specification more like English. They have no information content for the program and may be used more than once if required.
AND WITH DIALLEL TABLE REPLICATES FOR ANALYSIS ANALYSES VARIANCE
The introductory words in the above list may be punched in any columns and in any order on any number of cards. The information introduced by a word must, however, be punched to follow that word on the same card. That is, for example, the variable names introduced by the word NAMES must follow the word NAMES on the same card.
Equation cards punched according to the rules given in section 1.7 may be included with this section to redefine variables or to define new variables.
The data section consists of a number of separate blocks of data each introduced by a block card consisting of the word BLOCK followed by an integer identifying the block of data. Each card of each block normally contains all values for all variables for one parental combination. Two integers specifying the parental combination may be punched on each card and the positions on a data card occupied by these two integers are declared in the specification section by using the heading INTEGERS. If these integers are included the cards within a block may be presented in random order, but if these integers are omitted the cards are assumed to be in standard order. Standard order is illustrated in the following example of the presentation of a 4 × 4 table.
Card | Female | Male |
---|---|---|
1 | 1 | 1 |
2 | 2 | 1 |
3 | 3 | 1 |
4 | 4 | 1 |
5 | 1 | 2 |
6 | 2 | 2 |
7 | 3 | 2 |
8 | 4 | 2 |
9 | 1 | 3 |
10 | 2 | 3 |
11 | 3 | 3 |
12 | 4 | 3 |
13 | 1 | 4 |
14 | 2 | 4 |
15 | 3 | 4 |
16 | 4 | 4 |
The data for one parental combination may be classified according to normal replicates, genetic replicates and variables. Normal replicate values for the same genetic replicate and variable are punched consecutively. The groups of normal replicates for the successive genetic replicates of one variable then follow one another, so that all values for variable 1 are followed by all values for variable 2, and so on. The order is illustrated below for an example consisting of 2 normal replicates, 2 genetic replicates and 2 variables:

    27.4 27.9 30.4 29.1 29.1 28.8 29.9 29.7

Values | Content | Variable |
---|---|---|
27.4 27.9 | 2 normal replicates for genetic replicate 1 | Variable 1 |
30.4 29.1 | 2 normal replicates for genetic replicate 2 | Variable 1 |
29.1 28.8 | 2 normal replicates for genetic replicate 1 | Variable 2 |
29.9 29.7 | 2 normal replicates for genetic replicate 2 | Variable 2 |
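The nesting of normal replicates within genetic replicates within variables amounts to three nested loops. The Python sketch below is illustrative only; the function name `label_values` is an assumption. It attaches a (variable, genetic replicate, normal replicate) label to each value on a card.

```python
def label_values(values, n_normal, n_genetic, n_vars):
    """Map (variable, genetic replicate, normal replicate) -> value."""
    if len(values) != n_normal * n_genetic * n_vars:
        raise ValueError("wrong number of values for one parental combination")
    labelled, it = {}, iter(values)
    for variable in range(1, n_vars + 1):          # slowest changing
        for genetic in range(1, n_genetic + 1):
            for normal in range(1, n_normal + 1):  # fastest changing
                labelled[(variable, genetic, normal)] = next(it)
    return labelled

card = [27.4, 27.9, 30.4, 29.1, 29.1, 28.8, 29.9, 29.7]
labelled = label_values(card, n_normal=2, n_genetic=2, n_vars=2)
print(labelled[(1, 2, 1)], labelled[(2, 1, 2)])
```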
The CHECK and IGNORE facility as described in section 1.3 is available for this section and checks are applied to each block of data separately. It is not necessary to include the parental combination integer positions in an IGNORE statement. The statement INTEGERS 1 AND 2 implies also IGNORE 1 AND 2.
It is possible to present all the data for two or more parental combinations on one card provided that the parental combination order on the card is consistent with the standard order. For example if the data for two parental combinations are to be included on each card in a 4 × 4 presentation the eight cards would contain the following parental combinations.
Card | Combination 1 Female | Combination 1 Male | Combination 2 Female | Combination 2 Male |
---|---|---|---|---|
1 | 1 | 1 | 2 | 1 |
2 | 3 | 1 | 4 | 1 |
3 | 1 | 2 | 2 | 2 |
4 | 3 | 2 | 4 | 2 |
5 | 1 | 3 | 2 | 3 |
6 | 3 | 3 | 4 | 3 |
7 | 1 | 4 | 2 | 4 |
8 | 3 | 4 | 4 | 4 |
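The card layouts for the diallel table, with one or several parental combinations per card, can be generated from the standard order in which the female parent changes most rapidly. The Python sketch below is an illustration; the function name `diallel_cards` is an assumption.

```python
def diallel_cards(n_parents, per_card=1):
    """List, card by card, the (female, male) combinations in standard order."""
    # The female parent changes most rapidly, so it is the inner loop.
    combos = [(female, male)
              for male in range(1, n_parents + 1)
              for female in range(1, n_parents + 1)]
    return [combos[i:i + per_card] for i in range(0, len(combos), per_card)]

for number, card in enumerate(diallel_cards(4, per_card=2), start=1):
    print(number, card)
```

With `per_card=1` the same function gives the sixteen-card standard order of the 4 × 4 table.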
If integers specifying the combination are to be included on the card only those relevant to the first combination are necessary. If integers for the remaining combinations are punched these positions must be included in an IGNORE statement.
If there is no replication present and only one variable the data may, therefore, be presented with one row per card so that the print out of such data would be in the normal diallel table form. If more than one variable is present one row of each variable could be presented on each card and the print out would show several tables side by side.
If no instructions are given in this section analyses (HAYMAN or JINKS) as specified in the specification section are performed on all variables.
Each instruction is punched on a separate card but if one card is not sufficient the instruction may be continued onto a second (or more) card by use of the continuation character $. The character $ is punched onto the end of the card (or cards) to be continued and not on the continuation card unless this is also to be continued. Two types of instruction, specifying regression analyses and diallel table analyses respectively, are allowed. Instructions are made up of introductory words, listed below, and the information that these words introduce. Examples of the use of these words in instructions are given after this list.
Introductory Word | Information following word | Description of information introduced |
---|---|---|
REGRESSION | No item following | The presence of this word selects a regression analysis. |
OF | One word | This word introduces the name of the dependent variable. |
ON | One word | This word introduces the name of the independent variable. |
CALL or NAME | One word | Either of these words introduces the variable name to be given to the residuals produced by the regression analysis. |
GRADUATE | No item following | The presence of this word selects a graduation of the fitted regression line including confidence limits. |
SIGNIFICANCE | One number | This optional word introduces the significance level to be used in tests of significance. The value 0.05 is taken if this word is not used. |
ANALYSE | 1) A list of words, 2) ALL VARIABLES, or 3) EACH VARIABLE | This word introduces either a list of names of the variables to be analysed, or a specification, using the words ALL VARIABLES or EACH VARIABLE, that all variables are to be analysed. |
EXCEPT | A list of words | This optional word introduces a list of names of variables that are not to be analysed. This word is intended for use with the words ANALYSE ALL VARIABLES EXCEPT - - - |
HAYMAN | No item following | The presence of this word selects Hayman's analysis of the data. |
JINKS | No item following | The presence of this word selects Jinks' analysis of the data. The absence of both words implies that both analyses are to be performed. |
OMITTING or OMIT | The word BLOCKS followed by a list of integers and/or the words GENETIC REPLICATES followed by a list of integers | Either of these words introduces a list of blocks and/or a list of genetic replicates to be omitted from Hayman or Jinks analyses. This facility is not included in the program in its early form but will be made available later. |
The following words may be used in instructions in any position to make the instruction more like English. They have no information content for the program and they may be used more than once if required.
LEVEL RESIDUAL PROBABILITY LIMITS AND REPLICATES ANALYSIS ANALYSES
Instructions selecting regression analyses are made up of the first six words in the list above. Examples of these instructions are:
    REGRESSION ANALYSIS OF Y ON X CALL RESIDUALS YRES GRADUATE
    REGRESSION OF A ON B NAME RESIDUALS ARES SIGNIFICANCE LEVEL 0.01
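What a regression instruction of this kind produces, a fitted straight line and a fresh variable of residuals, can be sketched with ordinary least squares. The Python below is an assumption about the arithmetic and not a transcription of the program; the function name `regression_residuals` is hypothetical, and the graduation and confidence limits selected by GRADUATE are not covered.

```python
def regression_residuals(y, x):
    """Fit y = intercept + slope * x by least squares; return the residuals too."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return slope, intercept, residuals

# REGRESSION OF Y ON X NAME RESIDUALS YRES, with made-up data
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
slope, intercept, yres = regression_residuals(y, x)
print(slope, intercept, yres)
```

The list `yres` plays the part of the new variable, which may then be named in later ANALYSE instructions or equation cards.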
Instructions selecting analyses other than regression analyses are made up of the last five words in the list above. Examples of these instructions are:
    ANALYSE A AND B HAYMAN ANALYSIS
    ANALYSE ALL VARIABLES EXCEPT ARES JINKS ANALYSIS
When the OMIT (or OMITTING) facility is included in the program the following instructions will be valid:
    ANALYSE A OMITTING BLOCK 1 AND GENETIC REPLICATES 1 AND 4
    ANALYSE ARES OMIT GENETIC REPLICATE 3 HAYMAN ANALYSIS
The output for the two types of analyses may be considered separately:
The output for regression analyses consists of:
The output for non-regression type analyses consists of Hayman and Jinks analyses, as requested, for each of the following:
The data analysed at each stage is recorded in the output with the analyses. All output is clearly labelled and the variable names are used to identify quantities associated with variables. It is believed, therefore, that detailed description of the output is unnecessary.