Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fastest way to calculate the decimal length of an integer? (.NET)

I have some code that does a lot of comparisons of 64-bit integers, however it must take into account the length of the number, as if it was formatted as a string. I can't change the calling code, only the function.

The easiest way (besides .ToString().Length) is:

(int)Math.Truncate(Math.Log10(x)) + 1;

However that performs rather poorly. Since my application only sends positive values, and the lengths are rather evenly distributed between 2 and 9 (with some bias towards 9), I precomputed the values and have if statements:

static int getLen(long x) {
    if (x < 1000000) {
        if (x < 100) return 2;
        if (x < 1000) return 3;
        if (x < 10000) return 4;
        if (x < 100000) return 5;
        return 6;
    } else {
        if (x < 10000000) return 7;
        if (x < 100000000) return 8;
        if (x < 1000000000) return 9; 
        return (int)Math.Truncate(Math.Log10(x)) + 1; // Very uncommon
    }
}

This lets the length be computed with an average of 4 compares.

So, are there any other tricks I can use to make this function faster?

Edit: This will be running as 32-bit code (Silverlight).

Update:

I took Norman's suggestion and changed the ifs around a bit to result in an average of only 3 compares. As per Sean's comment, I removed the Math.Truncate. Together, this boosted things about 10%. Thanks!

like image 482
MichaelGG Avatar asked Mar 24 '09 23:03

MichaelGG


People also ask

How do you find the length of an integer number?

Perhaps the easiest way of getting the number of digits in an Integer is by converting it to String, and calling the length() method. This will return the length of the String representation of our number: int length = String. valueOf(number).

How do you get precision in C#?

You need to use NumberFormatInfo. NumberDecimalSeparator to get the correct decimal separator for the locale.

Which is the best datatype to store a number with maximum possible precision in C#?

The decimal type is the best for fixed point math. From the C# Spec Section 4.1.


2 Answers

Two suggestions:

  1. Profile and put the common cases first.
  2. Do a binary search to minimize the number of comparions in the worst case. You can decide among 8 alternatives using exactly three comparisons.

This combination probably doesn't buy you much unless the distribution is very skew.

like image 114
Norman Ramsey Avatar answered Oct 13 '22 11:10

Norman Ramsey


From Sean Anderson's Bit Twiddling Hacks:

Find integer log base 10 of an integer

unsigned int v; // non-zero 32-bit integer value to compute the log base 10 of 
int r;          // result goes here
int t;          // temporary

static unsigned int const PowersOf10[] = 
    {1, 10, 100, 1000, 10000, 100000,
     1000000, 10000000, 100000000, 1000000000};

t = (IntegerLogBase2(v) + 1) * 1233 >> 12; // (use a lg2 method from above)
r = t - (v < PowersOf10[t]);

The integer log base 10 is computed by first using one of the techniques above for finding the log base 2. By the relationship log10(v) = log2(v) / log2(10), we need to multiply it by 1/log2(10), which is approximately 1233/4096, or 1233 followed by a right shift of 12. Adding one is needed because the IntegerLogBase2 rounds down. Finally, since the value t is only an approximation that may be off by one, the exact value is found by subtracting the result of v < PowersOf10[t].

This method takes 6 more operations than IntegerLogBase2. It may be sped up (on machines with fast memory access) by modifying the log base 2 table-lookup method above so that the entries hold what is computed for t (that is, pre-add, -mulitply, and -shift). Doing so would require a total of only 9 operations to find the log base 10, assuming 4 tables were used (one for each byte of v).

As far as computing IntegerLogBase2, there are several alternatives presented on that page. It's a great reference for all sorts of highly optimized integer operations.

A variant of your version is also there, except it assumes the values (rather than the log base 10 of the values) are uniformly distributed, and therefore does an exponentially ordered search:

Find integer log base 10 of an integer the obvious way

unsigned int v; // non-zero 32-bit integer value to compute the log base 10 of 
int r;          // result goes here

r = (v >= 1000000000) ? 9 : (v >= 100000000) ? 8 : (v >= 10000000) ? 7 : 
    (v >= 1000000) ? 6 : (v >= 100000) ? 5 : (v >= 10000) ? 4 : 
    (v >= 1000) ? 3 : (v >= 100) ? 2 : (v >= 10) ? 1 : 0;

This method works well when the input is uniformly distributed over 32-bit values because 76% of the inputs are caught by the first compare, 21% are caught by the second compare, 2% are caught by the third, and so on (chopping the remaining down by 90% with each comparision). As a result, less than 2.6 operations are needed on average.

like image 34
Trevor Robinson Avatar answered Oct 13 '22 12:10

Trevor Robinson