Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

new BigInteger(String) performance / complexity

I'm wondering about the performance/complexity of constructing BigInteger objects with the new BigInteger(String) constructor.

Consider the following method:

  public static void testBigIntegerConstruction()
  {
    for (int exp = 1; exp < 10; exp++)
    {
      StringBuffer bigNumber = new StringBuffer((int) Math.pow(10.0, exp));
      for (int i = 0; i < Math.pow(10.0, exp - 1); i++)
      {
        bigNumber.append("1234567890");
      }

      String val = bigNumber.toString();
      long time = System.currentTimeMillis();
      BigInteger bigOne = new BigInteger(val);
      System.out.println("time for constructing a 10^" + exp
          + " digits BigInteger : " + ((System.currentTimeMillis() - time))
          + " ms");
    }
  }

This method creates BigInteger objects of Strings with 10^x digits, where x=1 at the beginning, and it's increased with every iteration. It measures and outputs the time required for constructing the corresponding BigInteger object.

On my machine (Intel Core i5 660, JDK 6 Update 25 32 bit) the output is:

time for constructing a 10^1 digits BigInteger : 0 ms
time for constructing a 10^2 digits BigInteger : 0 ms
time for constructing a 10^3 digits BigInteger : 0 ms
time for constructing a 10^4 digits BigInteger : 16 ms
time for constructing a 10^5 digits BigInteger : 656 ms
time for constructing a 10^6 digits BigInteger : 59936 ms
time for constructing a 10^7 digits BigInteger : 6227975 ms

While ignoring the lines up to 10^5 (because of possible distortions introduced by (processor) caching effects, JIT-compilation etc), we can clearly see a complexity of O(n^2) here. Keeping in mind that every operation on a BigInteger creates a new one due to immutability, this is a major performance penality for huge numbers.

Questions:

  • Did I miss something?

  • Why is this the case?

  • Is this fixed in more recent JDKs?

  • Are there any alternatives?

UPDATE:

I did further measurements and I can confirm the statement from some of the answers:
It seems that BigInteger is optimized for subsequent numerical operations with the expense of higher construction costs for huge numbers which seems reasonable for me.

like image 289
MRalwasser Avatar asked Feb 07 '13 17:02

MRalwasser


2 Answers

Simplifying from the source somewhat, it's the case because in the "traditional" String parsing loop

for each digit y from left to right:
  x = 10 * x + y

you have the issue that 10 * x takes time linear in the length of x, unavoidably, and that length grows by more-or-less a constant factor for each digit, also unavoidably.

(The actual implementation is somewhat smarter than this -- it tries to parse an int's worth of binary digits at a time, and so the actual multiplier in the loop is more likely 1 or 2 billion -- but yeah, it's still quadratic overall.)

That said, a number with 10^6 digits is at least a googol, and that's bigger than any number I've heard of being used even for cryptographic purposes. You're parsing a string that takes two megabytes of memory. Yes, it'll take a while, but I suspect the JDK authors didn't see the point of optimizing for such a rare use case.

like image 178
Louis Wasserman Avatar answered Oct 09 '22 09:10

Louis Wasserman


The O(n^2) effort is caused by the decimal to binary conversion if the BigInteger is specified as decimal digits.

Also, 10^7 digits is a really huge number. For typical cryptographic algorithms like RSA you would deal with 10^3 to 10^4 digits. Most of the BigInteger operations are not optimized for such a large number of digits.

like image 27
Henry Avatar answered Oct 09 '22 08:10

Henry