Tower of Hanoi


Page: 1 2 3 4 of 4	Yellow Pig's Index

The Tower of Hanoi Puzzle

Some of us have come to learn that recursion is slow and impractical. In the days when the computer had 64K or 256K of memory, recursive programs would run of out memory because they repeatedly push arguments to the runtime stack. Memory itself was slow. These machines did not have any processor caches. So if performance was a required, a program would have to be tuned to minimize memory access and do all its bookkeeping in registers. That was just impossible with recursive programs.

Our recursive programs ran out of memory quickly. Not having faith in our hardware engineers, we did whatever we could to come up with non-recursive solutions. Although the recursive program for the Tower of Hanoi puzzle was not very memory intensive, people still wanted non-recursive solutions. Here is a very short one:

void tower(unsigned int n) {
  unsigned int c, i, zeros, m = 1 << n;
  for(i=1; i < m; i++) {
    for(c=i, zeros=1; !(c&1); c>>=1, zeros++); 
    printf("%d: move disc %d from %d to %d\n", i, zeros,
           (i&i-1)%3, ((i|i-1)+1)%3);
  }
}

We are not going to explain much of this program because we would have to type up some proofs by induction. But we can still analyze its performance. The program uses four unsigned ints. It can be optimized to do all the processing in registers, except the call to printf. We ignore the cost of printf in our analysis and focus on the actual computation. In the following paragraphs, an operation refers to a mathematical concept while an instruction refers to a machine instruction. Unless explicitly noted, we assume the an operation can be implemented with one instruction, and that one instruction completes in one clock cycle. The former is practically true; but the latter is a simplification of reality.

First, we analyze the mysterious computation of zeros, which is actually one plus the number of ending zeros of the binary representation of i. If i=36, then zeros=3, because 36 is 10100₂, which ends in 2 zeros. How many times do we have to compute !(c&1), c>>=1 and zeros++? Half of the time, i is odd. So we compute only !(c&1), requiring two operations. If c is even, then we have to keep right-shifting until it becomes odd. Each iteration of this loop involves !(c&1), c>>=1 and zeros++, requiring four operations. Therefore, the total number of operations is: 2 · (2ⁿ - 1) + 4 · (1 · 2^n-2 + 2 · 2^n-3 + 3 · 2^n-4 + ... + (n-1) · 2^n-n), which equals 6 · 2ⁿ - 4 · n - 14. If we count more rigorously, when n is even, we need 2ⁿ - n - 3 jump instructions to complete the inner loops. And for all n, we need to jump into or out of the inner loop, requiring 2ⁿ-1 jump instructions.

There are 7 arithmetic operations in the computation of the arguments to printf. Adding the i++ and the i < m, we have a total of 9 operations for each of the 2ⁿ-1 steps. We also need one jump instruction for each iteration of the main loop. We make a final approximation of 15 · 2ⁿ arithmetic operations and 3 · 2ⁿ jump instructions for the entire procedure. (We count jump instructions separately because a "jump" in the program counter may screw up the instruction cache and flush the ALU pipeline in RISC processors. It is an instruction that may take more than one clock cycle).

Continue