Solving a Combinatorial Problem encountered in Archaeology
James Doran and Susan Powell
The archaeological background
Much of an archaeologist's work involves establishing the chronological relationships between those objects made by prehistoric man which have survived until the present day. Deciding whether a particular type of pottery was current before, after, or at the same time as a particular type of bronze dagger in some region of prehistoric Europe might be an important, if a small, step on the way to the archaeologist's overall goal of an accurate reconstruction and explanation of man's past.
A particular version of this task of chronological reconstruction occurs when a number of similar groups of objects have been discovered, and where the task is to establish the chronological sequence of the groups solely by comparing their contents. Typical examples are collections of stone tools found in caves or rock shelters once occupied by palaeolithic man, and the weapons and jewellery deposited with each of perhaps hundreds of burials in an Iron Age cemetery. In the latter case, for example, we can compare what kinds and styles of objects accompany each burial and, making assumptions about technological evolution and fashion, try to obtain some idea of the order in which the burials were made.
Formulating the archaeological problem in mathematical terms
One way of expressing this archaeological problem mathematically is as follows (for a variety of alternatives see Hodson, Kendall and Tautu, 1971). Construct a (0,1)-matrix each of whose rows corresponds to a group of objects (collection of stone tools, burial goods) and each of whose columns corresponds to a particular kind of object (a flint end-scraper or a particular type of bronze brooch). Each entry of the matrix is to be a 1 if the corresponding kind of object occurred in the corresponding group of objects, 0 if not. We now ask for the permutation of the rows which yields the actual chronological sequence in which the groups of objects were established (that is, the caves occupied, or the burials made).
In order for our task to be well-defined, we make a crucial assumption based on the archaeological view that different kinds of object, by and large, tend to replace one another rather than be in use simultaneously. This belief is certainly not wholly true, but there is enough truth in it for procedures based on the following assumption to have substantial value. We assume that the chronologically best permutation of the rows of the matrix is that which minimises

Σj rj

where the summation is over the columns j of the matrix, and where rj is the difference between the row-numbers of the first and last non-zero entries in column j. Thus we wish to minimise the total sum of the column spans of the non-zero entries. Note that we can discard any column which contains fewer than two non-zero entries.
An important property of this criterion is that any permutation has the same score as its inverse. Archaeologically, this means that we are content to arrange the rows of the matrix in chronological sequence without trying to decide which end of the sequence is early and which is late.
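The criterion is easily stated in code. The sketch below is our own illustration (the function name and the example matrix are invented for the purpose): it scores a row permutation of a 0-1 matrix by summing the column spans, and shows that a permutation and its reverse receive the same score.

```python
def criterion_score(matrix, perm):
    """Sum of column spans after permuting the rows of a 0-1 matrix.
    perm[k] is the index (into matrix) of the row placed at position k."""
    score = 0
    for j in range(len(matrix[0])):
        # positions, in the permuted order, at which column j has a 1
        pos = [k for k, i in enumerate(perm) if matrix[i][j] == 1]
        if len(pos) >= 2:
            score += pos[-1] - pos[0]    # the span r_j of column j
    return score

# a small invented incidence matrix: rows are groups of objects,
# columns are kinds of object
mat = [[1, 1, 0, 0],
       [0, 1, 1, 0],
       [1, 0, 1, 1],
       [0, 0, 0, 1]]
identity = [0, 1, 2, 3]
print(criterion_score(mat, identity))          # → 5
print(criterion_score(mat, identity[::-1]))    # → 5: same score as inverse
```

Columns with fewer than two entries contribute nothing to the score, which is why they may be discarded at the outset.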
Since the matrix may well have tens or even hundreds of rows, the task of finding the permutation (or its inverse) with the lowest score is far from trivial. There turn out to be a number of different approaches to it, and in the remainder of this note we shall sketch some of them and indicate what success we have had with each.
Is there a straightforward method of constructing the permutation for which the criterion score is least, an algorithm which reduces the task to a few mechanical operations? Such an algorithm has been found for the very special case in which there exists a permutation of the matrix rows such that the non-zero entries in each column are fully contiguous (Fulkerson and Gross, 1965). But in the general case, which is the one of archaeological interest, no such algorithm has been discovered and it seems unlikely that one exists.
We are therefore obliged to turn to algorithms which search for the optimal permutation by examining many possibilities. Since we obviously cannot examine each and every possible permutation of the rows for a matrix of any size, something more sophisticated is called for.
One option is to employ a direct search method (Hooke and Jeeves, 1961). For our problem this means that we proceed as follows:
1. pick a permutation at random;
2. examine each of a range of possible small modifications to the permutation; if one of these modifications reduces the criterion score, go to (3), otherwise go to (4);
3. implement the beneficial modification to obtain a new permutation and return to (2);
4. record the current permutation and go to (1).
Thus we carry out a number of distinct searches, each starting from a different randomly selected permutation, and each yielding a permutation which is locally optimal. The answer is then the best of these locally optimal permutations.
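The four steps above can be sketched as follows. This is a minimal illustration, not the program used in the experiments reported below; in particular the "small modifications" are here taken to be all pairwise row swaps, a choice the text does not specify.

```python
import random

def score(matrix, perm):
    """Sum of column spans of the non-zero entries under row order perm."""
    total = 0
    for j in range(len(matrix[0])):
        pos = [k for k, i in enumerate(perm) if matrix[i][j]]
        if len(pos) >= 2:
            total += pos[-1] - pos[0]
    return total

def direct_search(matrix, restarts=20, seed=0):
    """Repeated local descent from random starting permutations."""
    rng = random.Random(seed)
    m = len(matrix)
    best_perm, best_score = None, float('inf')
    for _ in range(restarts):
        perm = list(range(m))
        rng.shuffle(perm)                          # step 1: random start
        improved = True
        while improved:                            # steps 2-3: descend
            improved = False
            current = score(matrix, perm)
            for a in range(m):
                for b in range(a + 1, m):
                    perm[a], perm[b] = perm[b], perm[a]
                    if score(matrix, perm) < current:
                        improved = True            # keep the better swap
                        break
                    perm[a], perm[b] = perm[b], perm[a]   # undo the swap
                if improved:
                    break
        s = score(matrix, perm)                    # step 4: local optimum
        if s < best_score:
            best_perm, best_score = list(perm), s
    return best_perm, best_score
```

Each restart terminates at a permutation which no single swap can improve; the best of these local optima is returned.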
There is no guarantee that our answer will be the overall optimal permutation. Indeed, it probably will not be. However, if we make a good choice of the range of permutation modifications and if we carry out a sufficient number of distinct searches, then we shall end up with a permutation which if not optimal will be nearly so. How many searches are sufficient depends very much on the size and structure of the problem matrix. The larger the matrix, the more searches will be needed, but only general guidelines can be given.
Using direct search it has proved possible to solve, in an archaeologically acceptable sense, matrices with more than fifty rows and columns (Doran, 1971). About twenty minutes of computer time on Atlas are needed.
What can we do if we are dissatisfied with solutions that are good but probably not the best? As we have already seen, to consider every possible permutation is out of the question. An alternative approach is provided by branch-and-bound methods (Lawler and Wood, 1966; Mitten, 1970). These guarantee to find an optimal solution in situations of this kind. They do so by using the mathematical structure of the problem to rule out large sets of possible solutions.
Given our problem we can approach it as follows. Let us call a partial permutation a permutation which has been only partially specified. Then, for example, the set of all permutations of a 5 row matrix which begin with row 3 and end with rows 1 and 4 can be represented by the partial permutation (3 . . 1 4). The rows 2 and 5 we shall call free.
We can calculate a lower bound for the criterion score over the set of complete permutations represented by a partial permutation. A simple way is to assume that for each column the non-zero entries in the free rows can be independently positioned to give the lowest score. This simplifying assumption makes the computation easy but yields only a weak bound.
We say that branching occurs when we replace a partial permutation by a list of partial permutations each of which differs from the original by the addition of one free row, at one end or the other.
Now we can write a recursive program which combines branching and bounding as follows. Throughout the process the program keeps a running record of the best score it has so far found, starting with an arbitrary high value. It proceeds by branching as if to generate the tree of all partial permutations going down each branch before proceeding to the next. However, as each partial permutation is generated the associated lower bound is calculated. If this value is greater than or equal to the current best score then there is clearly no need to continue branching. A complete permutation can only be reached if its score is less than the current best score. When this happens a new best score is obtained. After all branches of the tree have been explored we are left with the optimal solution.
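A minimal sketch of such a program follows. It is our own illustration, not the program actually tested: the lower bound used here is even simpler (and weaker) than the one described above, taking for each column the larger of its number of non-zero entries less one and the span already forced by the rows placed at the two ends.

```python
from itertools import permutations

def score(matrix, perm):
    """Sum of column spans of the non-zero entries under row order perm."""
    total = 0
    for j in range(len(matrix[0])):
        pos = [k for k, i in enumerate(perm) if matrix[i][j]]
        if len(pos) >= 2:
            total += pos[-1] - pos[0]
    return total

def branch_and_bound(matrix):
    m, n = len(matrix), len(matrix[0])
    best = {'perm': None, 'score': float('inf')}

    def bound(front, back):
        """Lower bound on the score of any completion of the partial
        permutation (front ... back): rows in front and back already have
        fixed positions; the free rows will fill the middle."""
        fixed = {row: k for k, row in enumerate(front)}
        fixed.update({row: m - len(back) + k for k, row in enumerate(back)})
        total = 0
        for j in range(n):
            rows = [i for i in range(m) if matrix[i][j]]
            if len(rows) < 2:
                continue
            placed = [fixed[i] for i in rows if i in fixed]
            col = len(rows) - 1              # v_j distinct positions at least
            if len(placed) >= 2:             # span forced by the fixed rows
                col = max(col, max(placed) - min(placed))
            total += col
        return total

    def search(front, back, free):
        if bound(front, back) >= best['score']:
            return                           # prune: no completion can win
        if not free:
            perm = front + back              # bound is exact here, so < best
            best['perm'], best['score'] = perm, score(matrix, perm)
            return
        for row in free:
            rest = [r for r in free if r != row]
            search(front + [row], back, rest)    # add row at the front end
            search(front, [row] + back, rest)    # ... or at the back end

    search([], [], list(range(m)))
    return best['perm'], best['score']
```

Because a bound is exact once no rows remain free, a complete permutation is recorded only when its score beats the current best, exactly as described above.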
With the help of S. D. Carlisle, of the University of Bath, and recently of the Atlas Computer Laboratory, we have tested the effectiveness of such a program. The results were disappointing. In marked contrast to the direct search approach, it could only cope with 12 row matrices within five minutes of Atlas computer time.
However, it might well be possible to find a more effective formulation of this problem in branch-and-bound terms. There are three ways in which substantial improvements might be made:
- by computing tighter bounds for a partial permutation,
- by finding a better branching rule, that is, by partitioning the set of all possible permutations in a different and more effective way, and
- by varying the search strategy, so that promising branches are explored sooner.
Another approach which, if successful, guarantees the optimal solution is to formulate the problem as an integer linear program. Integer linear programming optimises a function subject to linear constraints and subject to the constraints that the variables must be integer.
Let xi, i = 1, ..., m, be the location in the permuted matrix of row i of the original matrix. Let sj and tj, j = 1, ..., n, be the row-numbers of the first and last non-zero entries in column j of the permuted matrix, and let rj be the column span. These variables are subject to the following constraints.
The xi must be within the required range:

0 ≤ xi ≤ m - 1    (1), (2)

By definition:

sj ≥ 0 and tj ≥ 0    (3)

For each non-zero entry of the original matrix, in row i and column j:

sj ≤ xi ≤ tj    (4)

Let vj be the total number of non-zero elements in column j; then:

rj ≥ vj - 1    (5)

The column span and the first and last non-zero entries in a column are related by:

tj - sj = rj    (6)

A permutation is characterised by:

|xi - xI| ≥ 1    i ≠ I; i = 1, ..., m; I = 1, ..., m    (7)
A solution that satisfies constraints (2) and (7) will be integer; hence it is unnecessary to state explicitly the constraint that the xi be integer. In addition, because of the structure of the constraints (3) to (6) there is no need to constrain the tj, sj, and rj to be integer.
The objective is to minimise Σj rj subject to constraints (1) to (7).
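The formulation can be made concrete with a short checking routine. The sketch below is our own illustration: given an assignment of the xi it derives sj, tj and rj, verifies constraints (1) to (7), and returns the objective value.

```python
def check_formulation(matrix, x):
    """Given positions x[i] (row i of the original matrix goes to row x[i]
    of the permuted matrix), derive s_j, t_j, r_j, verify constraints
    (1) to (7), and return the objective value, the sum of the r_j."""
    m, n = len(matrix), len(matrix[0])
    assert all(0 <= x[i] <= m - 1 for i in range(m))            # (1), (2)
    assert all(abs(x[i] - x[I]) >= 1                            # (7)
               for i in range(m) for I in range(m) if i != I)
    objective = 0
    for j in range(n):
        pos = [x[i] for i in range(m) if matrix[i][j]]
        if len(pos) < 2:
            continue                      # such columns can be discarded
        s_j, t_j = min(pos), max(pos)     # first and last non-zero entries
        r_j = t_j - s_j                   # (6)
        assert s_j >= 0 and t_j >= 0      # (3)
        assert all(s_j <= p <= t_j for p in pos)                # (4)
        assert r_j >= len(pos) - 1        # (5)
        objective += r_j
    return objective

mat = [[1, 1, 0, 0],
       [0, 1, 1, 0],
       [1, 0, 1, 1],
       [0, 0, 0, 1]]
print(check_formulation(mat, [0, 1, 2, 3]))   # → 5
print(check_formulation(mat, [3, 2, 1, 0]))   # → 5 (the inverse ordering)
```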
It is not possible to express constraint (7) as a linear expression without the introduction of an integer variable. This difficulty can be overcome by the use of a branch-and-bound approach. Constraint (7) specifies that either xi - xI ≤ -1 or xi - xI ≥ 1 and it is possible to branch upon this dichotomy.
The problem of minimising Σj rj subject to constraints (1) to (6) can be solved as a linear program. If the solution obtained does not satisfy constraint (7), say xi - xI = 0, then we branch and create two new problems. Each is composed of the original problem plus either the constraint xi - xI ≥ 1 or the constraint xI - xi ≥ 1. Such a problem can itself be solved as a linear program and the value of the objective function obtained is a lower bound on the value of the objective function for the overall problem. We continue branching until either a solution is a permutation, or it is infeasible, or the value of the objective function is greater than the known best. In any of these situations we abandon the current branch and explore another. The exploration of all branches ensures that the optimal permutation is found.
The solutions to the successive linear programs in the above method may or may not be integer. But for a permutation the variables must be integer. This suggests that, after branching, instead of solving a linear program it would be possible to solve the integer program: minimise Σj rj subject to constraints (1) to (6), to any branching constraints present, and to the constraint
xi is integer i = 1,...,m (8)
The explicit imposition of the integrality constraint (8) has the effect of tightening the lower bound on the objective function and hence of limiting the development of the tree. However, this is at the expense of spending more time solving the subordinate integer linear programs. A few experiments have indicated that if the integer linear program is solved by a cutting plane method the increased computation is not justified by the improvement in the lower bounds.
Without counting the branching constraints or any cutting planes generated, the number of constraints that have to be explicitly stored in the computer is
The lower and upper bound constraints are handled implicitly. The computer program we have available to solve a linear program (see Powell, this volume) makes no use of backing store and this obviously restricts the size of the problem that can be handled. Larger problems could be treated by rewriting the computer program to enable it to deduce the constraints from the matrix as they are needed. But of greater importance than the storage requirement is the machine time needed to solve the successive linear or integer programs. Even allowing for inefficiencies in the computer program, experiments have demonstrated that this integer programming formulation requires more time to solve a specific problem than does the straightforward branch-and-bound method described in the previous section.
The problem may also be formulated with a non-linear objective function subject only to linear constraints. The variables xi, tj, sj and rj are defined as above and are subject to constraints (1) to (6). The following additional constraints are imposed:
xi + xj ≥ 1    i ≠ j
xi + xj + xk ≥ 3    i ≠ j ≠ k    (9)
...

and so on, where each subscript ranges from 1 up to m. These constraints state that the sum of every combination of s distinct variables must be greater than or equal to s(s-1)/2, where s = 2, 3, ..., m-1.
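For small m these subset-sum constraints can be verified exhaustively. The sketch below is our own illustration: it confirms that every permutation of the values 0, ..., m-1 satisfies them, the bound s(s-1)/2 being attained exactly by the s smallest values.

```python
from itertools import combinations, permutations

def satisfies_subset_constraints(x):
    """True if every combination of s distinct components of x sums to at
    least s(s-1)/2, for s = 2, ..., len(x)-1 (constraint (9))."""
    m = len(x)
    for s in range(2, m):
        need = s * (s - 1) // 2
        for subset in combinations(x, s):
            if sum(subset) < need:
                return False
    return True

# every permutation of 0..m-1 satisfies the constraints ...
assert all(satisfies_subset_constraints(p)
           for p in permutations(range(4)))
# ... and the bound is tight: the s smallest values, 0 + 1 + ... + (s-1),
# sum to exactly s(s-1)/2
assert sum(range(3)) == 3 * 2 // 2
```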
The sum of the column spans is fixed at a preselected value z:

Σj rj = z    (10)
It is possible to prove that the objective function, subject to constraints (1) to (6), (9) and (10), will be maximised at a permutation, if one exists. Thus it is possible, by solving successive problems with decreasing values of z, to find the permutation which minimises Σj rj.
We have exchanged the integrality and permutation constraints (8) and (7) of the integer linear programming formulation for a non-linear objective function, a formidable number of linear constraints, and a different solution strategy. Interesting though this exchange is, it is not obvious that it yields any practical benefit. We have conducted no experiments.
Limited though our experiments have been, they do illustrate certain general and important properties of problem-solving methods based upon iterative or recursive search.
First, it is clear that if we insist that the search locates the optimal rather than merely a good solution, then we are likely to pay a very heavy price in extra computation.
Second, if we are content with a sub-optimal solution, then the more work we do, the better solution we may expect to obtain. The precise form of this relationship can be of great importance in operational research contexts when solution costs must be balanced against savings resulting from slightly better solutions.
Third, almost any problem can be formulated and then attacked, in a variety of different ways. We have exhibited four different ways of setting about our problem. We could easily suggest many more including, for example, sub-optimal versions of the branch-and-bound and integer programming approaches. Some formulations will be economical and will throw into prominence the crucial features of the problem. Others will confuse by obscuring the essentials with irrelevancies. Some methods of attack will be efficient, others not.
Finally, any problem-solving procedure is likely to be effective on particular problems in inverse relation to the range of problems to which it can be applied. This is because general procedures are rarely capable of recognising and exploiting particular features of specific problems. Thus integer linear programming is a very general methodology. But it is inefficient when applied to our problem because it fails to exploit the simple combinatorial structure which the problem possesses.
Simple though these points are, they are not always brought out in the literature. To bring them clearly into focus one needs both a simply stated problem which can be tackled in a variety of ways, and an interdisciplinary and stimulating environment. We are fortunate in having had both.
1. Doran, J. (1971). Computer analysis of data from the La Tène cemetery at Münsingen-Rain, Mathematics in the Archaeological and Historical Sciences, ed. Hodson, F. R., Kendall, D. G., and Tautu, P., Edinburgh: Edinburgh University Press.
2. Fulkerson, D. R., and Gross, O. A. (1965). Incidence matrices and interval graphs, Pacific Journal of Mathematics, 15(3), 835-855.
3. Hodson, F. R., Kendall, D. G., and Tautu, P. (eds) (1971). Mathematics in the Archaeological and Historical Sciences, Edinburgh: Edinburgh University Press.
4. Hooke, R., and Jeeves, T. A. (1961). Direct search solution of numerical and statistical problems, J. ACM, 8(2), 212-229.
5. Lawler, E. L., and Wood, D. E. (1966). Branch-and-bound methods: a survey, Operations Research, 14, 699-719.
6. Mitten, L. G. (1970). Branch-and-bound methods: general formulation and properties, Operations Research, 18, 24-34.