My processor, a small 16-bit microcontroller with no FPU and integer math only has 16/16 division and 32/16 division which both take 18 cycles. At the moment I'm using a very slow software routine (~7,500 cycles) to do 64/32 division. Is there any way to use these division engines to calculate a 64/32 division? Similar to how I'm already using the 16x16 multiplier and adder to calculate 32x32 multiplies? I'm using C but can work with any general explanation on how it can be done... I'm hoping to target <200 cycles (if it's at all possible.)
While a 16-bit processor can simulate 32-bit arithmetic using double-precision operands, 32-bit processors are much more efficient. While 16-bit processors can use segment registers to access more than 64K elements of memory, this technique becomes awkward and slow if it must be used frequently.
16-bit Windows applications were designed to run under Windows 3.0 and 3.1, while 32-bit Windows applications were designed for Windows 95, 98, NT, and 2000. They are written to two different Application Program Interfaces (APIs) called "Win16" and "Win32".
See "Hacker's Delight", multiword division (pages 140-145).
The basic concept (going back to Knuth) is to think of your problem in base-65536 terms. Then you have a 4 digit by 2 digit division problem, with 2/1 digit division as a primitive.
The C code is here: https://github.com/hcs0/Hackers-Delight/blob/master/divmnu.c.txt
My copy of Knuth (The Art of Computer Programming) is at work, so I can't check it until Monday, but that would be my first source. It has a whole section on arithmetic.
edit: your post about "16/16 division and 32/16 division which both take 18 cycles." -- dsPICs have a conditional subtract operation in assembly. Consider using this as your computational primitive.
Also note that if X = XH * 232 + XL and D = DH * 216 + DL, then if you are looking for
(Q,R) = X/D where X = Q * D + R
where Q = QH * 216 + QL, R = RH * 216 + RL, then
XH * 232 + XL = DH * QH * 232 + (DL * QH + DH * QL) * 216 + (DL * QL) + RH * 216 + RL
This suggests (by looking at terms that are the high 32 bits) to use the following procedure, akin to long division:
Your 32-bit quotient is the pair (QH,QL), and 32-bit remainder is R3.
(This assumes that the quotient is not larger than 32-bit, which you need to know ahead of time, and can easily check ahead of time.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With