
Why do BigInteger implementations use sign-magnitude instead of two's complement?

Arbitrary-precision signed integers are almost always implemented using a sign-magnitude representation:

  • (Java) BigInteger in OpenJDK
  • (Python) Bigint implementation of the Python built-in int type in CPython
  • (C) mpz_t in GMP, the GNU multiple precision arithmetic library
  • (C++) BigInteger in a bigint library by Matt McCutchen
  • (Rust) BigInt in the num-bigint library

The clear preference for sign-magnitude stands in contrast to the near-universal preference for two's complement in fixed-width signed integer types. The question is, why is sign-magnitude so clearly preferred for BigIntegers? (I welcome counterexamples if you disagree with the premise.)

Note that BigInteger APIs typically specify "as-if two's complement" semantics (e.g. Java, Python), for bitwise operations where it matters. This provides consistency with the usual meaning of those operations. This does not dictate the actual internal representation (merely an implementation detail), but it should be a point in favor of using two's complement internally, if all else were equal.
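Python shows this "as-if two's complement" behavior directly: bitwise operators on negative ints act as though each operand had an infinitely sign-extended two's-complement bit pattern, even though CPython stores ints as sign-magnitude internally.

```python
# Python ints are stored sign-magnitude internally, but the bitwise
# operators behave as if operands were infinite two's-complement.

# -1 behaves like ...11111111: masking off the low byte gives 255.
print(-1 & 0xFF)   # 255

# ~x is defined as -(x + 1), exactly the two's-complement complement.
print(~5)          # -6

# Right-shifting a negative number preserves the (conceptual) sign bits.
print(-8 >> 1)     # -4
```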

Floating-point numbers use sign-magnitude, unlike ints which use two's complement. Floating point is not really a guiding precedent here, though, since the behavior and algorithms for floating-point arithmetic are significantly different from integer arithmetic. Bignums are much more like ints than floats.
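That floats are sign-magnitude can be checked from Python with the struct module: negating an IEEE 754 double flips only the top bit, and sign-magnitude also gives a distinct negative zero.

```python
import math
import struct

# IEEE 754 doubles are sign-magnitude: negation flips only the sign bit.
bits_pos = struct.unpack('>Q', struct.pack('>d', 2.5))[0]
bits_neg = struct.unpack('>Q', struct.pack('>d', -2.5))[0]
print(bits_pos ^ bits_neg == 1 << 63)   # True: only bit 63 differs

# A hallmark of sign-magnitude: negative zero is a distinct bit pattern.
print(math.copysign(1.0, -0.0))         # -1.0
```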

We know the "textbook" reasons for why two's complement works mathematically and why it has advantages. It seems to me that those reasons apply equally validly to both ints and BigIntegers. To what extent is that really true?

There is, of course, a vast difference between the design constraints of hardware fixed-precision integers and software arbitrary-precision integers. In this sense, it is not at all surprising to see that designers made different tradeoffs in these different domains. What, then, are the tradeoffs between sign-magnitude and two's complement, as applied to arbitrary-precision integers? For example, this could be in terms of the performance or simplicity of certain important algorithms.

I hope your answer will illuminate the design considerations that go into BigInteger arithmetic, as well as help me re-examine what I know about two's complement from a new perspective.

(To be clear: When I say two's complement for arbitrary-precision integers, I mean a representation using an array of words whose bit pattern, when put together, is the two's complement representation of the desired number - perhaps with the additional requirement that there be no "unnecessary leading 0's" (for nonnegative numbers) or "unnecessary leading 1's" (for negative numbers).)
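Concretely, that representation (minus the word-array packing) is what Python's int.to_bytes produces with signed=True; a minimal sketch of the "no unnecessary leading bytes" rule, using a length formula I've chosen for illustration:

```python
def to_twos_complement(x: int) -> bytes:
    """Minimal big-endian two's-complement encoding of x: no redundant
    leading 0x00 bytes (nonnegative) or 0xFF bytes (negative)."""
    # Bits needed, including the sign bit; ~x maps negatives to nonnegatives.
    bits = (x if x >= 0 else ~x).bit_length() + 1
    return x.to_bytes((bits + 7) // 8, 'big', signed=True)

def from_twos_complement(b: bytes) -> int:
    return int.from_bytes(b, 'big', signed=True)

print(to_twos_complement(255).hex())   # 00ff (the leading 00 carries the sign)
print(to_twos_complement(-1).hex())    # ff
print(to_twos_complement(-128).hex())  # 80
print(from_twos_complement(bytes.fromhex('ff7f')))  # -129
```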

Asked Aug 28 '18 by echinodermata


1 Answer

Two's complement makes add and subtract a bit simpler for equal-length numbers, but makes multiply and divide more complicated. For hardware implementations there may be a time penalty, but not always: looking at an x86 "Ivy Bridge" instruction table, the only case where two's complement costs more time is a 128-bit signed dividend divided by a 64-bit signed divisor. So this is mostly an issue for software-based math.
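A toy sign-magnitude sketch (representing each number as a sign flag plus a nonnegative magnitude, with magnitudes held in plain Python ints for brevity) shows the trade: addition must branch on signs and compare magnitudes, which two's complement avoids, while multiplication gets trivial sign handling in return.

```python
def sm_add(sa: int, ma: int, sb: int, mb: int) -> tuple[int, int]:
    """Add two sign-magnitude numbers (sign: 0 = nonnegative, 1 = negative).
    Unlike two's complement, addition must branch on the signs and compare
    magnitudes to decide between an add and a subtract."""
    if sa == sb:                 # same sign: add magnitudes, keep the sign
        return sa, ma + mb
    if ma >= mb:                 # differing signs: subtract the smaller
        return sa, ma - mb       # magnitude, keep the larger one's sign
    return sb, mb - ma

def sm_mul(sa: int, ma: int, sb: int, mb: int) -> tuple[int, int]:
    """Multiplication is the easy case: multiply magnitudes, XOR the signs."""
    return sa ^ sb, ma * mb

print(sm_add(0, 5, 1, 8))   # (1, 3)   i.e. 5 + (-8) = -3
print(sm_mul(1, 5, 1, 8))   # (0, 40)  i.e. (-5) * (-8) = 40
```

(A real implementation would also normalize the sign of a zero result; the case analysis above is the point of the sketch.)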

Big integer libraries may use more complex but faster representations for large numbers. Here are some links to example articles:

https://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic

https://cp-algorithms.com/algebra/big-integer.html

http://www.apfloat.org/ntt.html

The more complex methods mostly pay off for fairly large numbers; for medium-sized numbers, simpler implementations will be faster.
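One of those "more complex but faster" methods is Karatsuba multiplication, which CPython itself uses for large ints. A sketch on nonnegative Python ints (the cutoff value here is illustrative, not tuned) shows the structure: three recursive multiplies instead of four.

```python
def karatsuba(x: int, y: int, cutoff_bits: int = 64) -> int:
    """Karatsuba multiplication sketch for nonnegative ints."""
    if x < (1 << cutoff_bits) or y < (1 << cutoff_bits):
        return x * y                       # schoolbook below the cutoff
    n = max(x.bit_length(), y.bit_length()) // 2
    xh, xl = x >> n, x & ((1 << n) - 1)    # split x = xh * 2^n + xl
    yh, yl = y >> n, y & ((1 << n) - 1)
    a = karatsuba(xh, yh)                  # high * high
    b = karatsuba(xl, yl)                  # low * low
    c = karatsuba(xh + xl, yh + yl)        # one multiply recovers the cross
    return (a << 2 * n) + ((c - a - b) << n) + b  # terms: c - a - b

x, y = 3 ** 2000, 7 ** 1500
print(karatsuba(x, y) == x * y)   # True
```

Note how naturally this works on magnitudes: the sign of the product is just the XOR of the operand signs, whereas a two's-complement representation would need sign extension through every split and recombination.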

Answered Sep 23 '22 by rcgldr