In C++, say that:
uint64_t i; uint64_t j;
then i * j
will yield an uint64_t
that has as value the lower part of the multiplication between i
and j
, i.e., (i * j) mod 2^64
. Now, what if I wanted the higher part of the multiplication? I know that there exists an assembly instruction do to something like that when using 32 bit integers, but I am not familiar at all with assembly, so I was hoping for help.
What is the most efficient way to make something like:
uint64_t k = mulhi(i, j);
A 64-bit signed integer. It has a minimum value of -9,223,372,036,854,775,808 and a maximum value of 9,223,372,036,854,775,807 (inclusive).
There is no standard way for having data type greater than 64 bits. You should check the documentation of your systems, some of them define 128 bits integers. However, to really have flexible size integers, you should use an other representation, using an array for instance.
A 64-bit register can hold any of 264 (over 18 quintillion or 1.8×1019) different values. The range of integer values that can be stored in 64 bits depends on the integer representation used.
If you're using gcc and the version you have supports 128 bit numbers (try using __uint128_t) then performing the 128 multiply and extracting the upper 64 bits is likely to be the most efficient way of getting the result.
If your compiler doesn't support 128 bit numbers, then Yakk's answer is correct. However, it may be too brief for general consumption. In particular, an actual implementation has to be careful of overflowing 64 bit integers.
The simple and portable solution he proposes is to break each of a and b into 2 32-bit numbers and then multiply those 32 bit numbers using the 64 bit multiply operation. If we write:
uint64_t a_lo = (uint32_t)a; uint64_t a_hi = a >> 32; uint64_t b_lo = (uint32_t)b; uint64_t b_hi = b >> 32;
then it is obvious that:
a = (a_hi << 32) + a_lo; b = (b_hi << 32) + b_lo;
and:
a * b = ((a_hi << 32) + a_lo) * ((b_hi << 32) + b_lo) = ((a_hi * b_hi) << 64) + ((a_hi * b_lo) << 32) + ((b_hi * a_lo) << 32) + a_lo * b_lo
provided the calculation is performed using 128 bit (or greater) arithmetic.
But this problem requires that we perform all the calculcations using 64 bit arithmetic, so we have to worry about overflow.
Since a_hi, a_lo, b_hi, and b_lo are all unsigned 32 bit numbers, their product will fit in an unsigned 64 bit number without overflow. However, the intermediate results of the above calculation will not.
The following code will implement mulhi(a, b) when the mathemetics must be performed modulo 2^64:
uint64_t a_lo = (uint32_t)a; uint64_t a_hi = a >> 32; uint64_t b_lo = (uint32_t)b; uint64_t b_hi = b >> 32; uint64_t a_x_b_hi = a_hi * b_hi; uint64_t a_x_b_mid = a_hi * b_lo; uint64_t b_x_a_mid = b_hi * a_lo; uint64_t a_x_b_lo = a_lo * b_lo; uint64_t carry_bit = ((uint64_t)(uint32_t)a_x_b_mid + (uint64_t)(uint32_t)b_x_a_mid + (a_x_b_lo >> 32) ) >> 32; uint64_t multhi = a_x_b_hi + (a_x_b_mid >> 32) + (b_x_a_mid >> 32) + carry_bit; return multhi;
As Yakk points out, if you don't mind being off by +1 in the upper 64 bits, you can omit the calculation of the carry bit.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With