The Reduced Instruction Set of all chips in the ARM family - from the ARM2 to the StrongARM - includes weird and wonderful instructions like MLA (Multiply with Accumulate: multiply two registers and add the contents of a third to the result) and ASL (Arithmetic Shift Left: absolutely identical to the Logical Shift Left instruction). Later chips contain a few additional instructions. However, so far as I am aware, room has never been found in the instruction set for something that would have been very useful - a DIV instruction.
12 - 4 = 8 8 - 4 = 4 4 - 4 = 0
When I was very little, I was allowed to play with an old-fashioned mechanical calculating machine. The front of the machine had an array of vertical dials, like the ones in a combination lock, on which you set up the numbers you wanted to calculate, and there was a handle on one side which was wound away from you in order to add the current number to the total on the display or towards you in order to subtract it. In order to carry out division, it was necessary to set up the first number and then to subtract the second number from it repeatedly (counting how many times you turned the handle!), just as we did in the sum above. Obviously however, this could be extremely slow if you were doing a sum like 128÷4 since the answer is 32, this is the number of times you would have to turn the handle!
It would be quite possible to carry out division in ARM code using this simple technique, by constructing a loop like this:
MOV R1,#128 ;divide R1 MOV R2,#4 ;by R2 MOV R0,#0 ;initialise counter .subtract SUBS R1,R1,R2 ;subtract R2 from ;R1 and store ;result back in ;R1 setting flags ADD R0,R0,#1 ;add 1 to counter, ;NOT setting flags BHI subtract ;branch to start of ;loop on condition ;Higher, i.e. R1 is ;still greater than ;R2. Answer now in R0
Since even the slowest ARM processor is far faster than a child turning a handle attached to a stiff set of gears, this might even be an acceptable solution. But there is a trick you normally use to save work on a calculating machine - and you can use it on a computer, too.
Using our example of 128÷4, you would end up doing this:
We can no longer subtract 4 (our original number), so we have found the last digit of the answer to be 2. In other words, the answer is 32, the result we got before - but we have had to wind the handle round only five times in order to obtain it, not thirty-two times!
You will have noticed that this is almost exactly the method one uses when doing division using pen and paper... "four into one won't go... four into twelve goes three times... four into eight goes twice... answer is thirty-two." The difference is that we know, through experience, that 12 is 4 × 3 and 8 is 4 × 2. Machines, even electronic ones like computers, don't have this advantage - they don't contain a reference set of multiplication tables, so they have to do it the hard way.
Every time we shift our number one place to the right to obtain the next digit of the answer, we know that we will be able to subtract it either exactly once or not at all. We only have to carry out one subtraction per shift; the disadvantage is that since binary numbers have many more digits than their decimal equivalents we have to carry out many more shifts.
MOV R1,#128 MOV R0,R1,LSR#2 ;shift R1 2 places ;to the right & ;store in R0 ;answer now in R0
Since 4 is 2 × 2, all we have to do to divide by 4 in binary is to shift the register two places to the right, just as all we have to do to divide by 100 (10 × 10) in decimal is to shift two places to the right - e.g. from 600 pence we get 6 pounds.
To divide 50 (%110010) by 10 (%1010) in binary:
Our '10' has now been shifted back two places to the right, returning it to its original value, which is our signal to stop and count up the digits in our answer - %101 in binary or '5' in decimal, w ich is of course the correct answer.
CMP R2, #0 BEQ divide_end ;check for divide by zero!
MOV R0,#0 ;clear R0 to accumulate result MOV R3,#1 ;set bit 0 in R3, which will be ;shifted left then right .start CMP R2,R1 MOVLS R2,R2,LSL#1 MOVLS R3,R3,LSL#1 BLS start ;shift R2 left until it is about to ;be bigger than R1 ;shift R3 left in parallel in order ;to flag how far we have to go
R0 will be used to hold the result. The role of R3 is more complicated.
In effect, we are using R3 to mark where the right-hand end of R2 has got to - if we shift R2 three places left, this will be indicated by a value of %1000 in R3. However, we also add it to R0 every time we manage a successful subtraction, since it marks the position of the digit currently being calculated in the answer. In the binary example (50 ÷ 10) above, we shifted the '10' two places left, so at the time of the first subtraction, R3 would have been %100, at the time of the second (which failed) it would have been %10, and at the time of the third %1. Adding it to R0 after each successful subtraction would have given us, once again, the answer of %101!
ARM code doesn't have the specific shift and rotate instructions present in non-RISC instruction sets. Instead it has the concept of the 'barrel shifter' which can be used to modify the value specified by the right-hand register for almost any instruction, without altering that register itself. For example, the instruction ADD R0, R1, R2, LSL#2 will add together R1 and (R2<<2) and load the result into R0, without affecting the value of R2 in any way. This can be very useful, but it means that if we actually want to change the value of R2 by shifting it, as we do here, we have to resort to moving it into itself via the shifter: MOV R2,R2,LSL#1.
.next CMP R1,R2 ;carry set if R1>R2 (don't ask why) SUBCS R1,R1,R2 ;subtract R2 from R1 if this would ;give a positive answer ADDCS R0,R0,R3 ;and add the current bit in R3 to ;the accumulating answer in R0
In ARM code subtraction (a CMP instruction simulates a subtraction in order to set the flags), if R1 - R2 gives a positive answer and no 'borrow' is required, the carry flag is set. This is required in order to make SBC (Subtract with Carry) work properly when used to carry out a 64-bit subtraction, but it is confusing!
In this case, we are turning it to our advantage. The carry flag is set to indicate that a successful subtraction is possible, i.e. one that doesn't generate a negative result, and the two following instructions are carried out only when the condition Carry Set applies. Note that the 'S' on the end of these instructions is part of the 'CS' condition code and does not mean that they set the flags!
MOVS R3,R3,LSR#1 ;Shift R3 right into carry flag MOVCC R2,R2,LSR#1 ;and if bit 0 of R3 was zero, also ;shift R2 right BCC next ;If carry not clear, R3 has shifted ;back to where it started, and we ;can end .divide_end MOV R25, R24 ;exit routine
The next two instructions shift right R3, the 'counter' register, and R2, which holds the number we are dividing by. We specifically set the flags by using the 'S' suffix when shifting R3 since we want to know when the bit held in this register reaches the right-hand side. During a shift to the right, bit 0 is transferred to the carry flag while the rest of the bits move along. Since only one bit of R3 is set (it was originally loaded with %1 before being shifted left and then right), when the carry flag is set it indicates that, before the shift, the value of R3 was %1, i.e. we have shifted back to where we started and R0 should now hold the right answer.
As with the results of integer division in Basic, the value in R0 will always be rounded to the next lowest whole number rather than to the nearest number. For example, 1156 ÷ 19 gives a result of '60 remainder 16' which is actually closer to 61 than 60.
Source: | Archive Magazine 14.2 |
Publication: | Archive Magazine |
Contributor: | Harriet Bazley |