An integer measures 4 bytes. In my example I have numbers measuring 1 MB. How can I convert them to a human readable decimal number fast? The number is present in <code>uint[]</code> Array containing <code>Size</code> items.

I've been thinking about your problem. I don't have a solution coded up, but here is an approach: First off, let's assume without loss of generality that you have a collection of 2n bits. (If you have fewer than exactly 2n bits, pad out the bit array with leading zeros until you do. Obviously doing so never more than doubles the size of the array.) In your case you say you have a million uints, so that's 225 bits. Let's also assume that every collection of 2k + 1 bits can be split evenly into two collections of bits, the left and right collections, each with 2k bits. Thus every bit collection, or sub-collection, has a "size" which is an exact power of two. The smallest possible collection contains a single bit and cannot be further subdivided. Second, let's assume that you similarly have an immutable representation of a number in decimal form, and that again, without loss of generality, there are 2d decimal digits in the string. If there are fewer than exactly 2d, again, pad with leading zeros. Again, each decimal collection of size greater than one can be split into two collections each half the size. Now we sketch out a recursive algorithm: <pre class="prettyprint"><code>DigitCollection Convert(BitCollection bits) { if (bits.Size == 1) return bits[0] ? DigitCollection.SingleOne : DigitCollection.SingleZero; DigitCollection left = Convert(bits.Left); DigitCollection right = Convert(bits.Right); DigitCollection power = MakePowerOfTwo(bits.Right.Size); return Add(Multiply(left, power), right); } </code></pre> We now need methods Add, Multiply and MakePowerOfTwo. As we'll see in a moment, we'll also need Subtract, and two Shift operators, for rapidly multiplying by powers of ten. Addition and subtraction are easy. Clearly if the longer collection contains n digits then addition and subtraction methods can be implemented to take O(n) time. FullShift and HalfShift operators make new digit collections out of old ones to facilitate rapid multiplication by powers of ten. If a digit collection of size 2d+1 consists of sub-collections (X1, X2) each of size 2d then the "half-shifted" collection contains 2d+2 items and consists of ( (2d leading zeros, X1), (X2, 2d trailing zeros)). The full-shift collection consists of ((X1, X2), (2d+1 trailing zeros)). These are obviously very cheap to construct. Multiplication is where we run into big problems. Suppose without loss of generality that we are multiplying together two DigitCollections each with exactly 2d digits. We can use Karatsuba's Algorithm: <pre class="prettyprint"><code>DigitCollection Multiply(DigitCollection x, DigitCollection y) { // Assume x and y are the same digit size. if (x and y are both single digits) return the trivial solution; // Assume that z2, z1 and z0 are all the same digit size as x and y. DigitCollection z2 = Multiply(x.Left, y.Left); DigitCollection z0 = Multiply(x.Right, y.Right); DigitCollection z1 = Subtract( Multiply(Add(x.Left, x.Right), Add(y.Left, y.Right)), Add(z2, z0)); return Add(Add(FullShift(z2), HalfShift(z1)), z0); } </code></pre> What is the order of this algorithm? Suppose there are n = 2d digits. What is O(Multiply(n))? We recurse three times, each with a problem with half as many digits. The rest of the add, subtract and shift operations are each no more than O(n). So we have a recurrance: <pre class="prettyprint"><code>T(n) = 3 T(n/2) + O(n) </code></pre> Which has an easy solution via the Master Theorem: this algorithm is O(n1/lg 3) which is about O(n1.58). What about MakePowerOfTwo? That's easy given what we already have. We use the identity: <blockquote> 22n + 1 = 2n x 2n + 2n x 2n </blockquote> and write the algorithm: <pre class="prettyprint"><code>DigitCollection MakePowerOfTwo(int p) { if (p == 0) return DigitCollection.SingleOne; DigitCollection power = MakePowerOfTwo(p / 2); power = Multiply(power, power); if (p % 2 != 0) power = Add(power, power); return power; } </code></pre> It is dominated by the computation of the multiplication, and so is O(n1.58) as well. And now we can see that the original call to Convert is also dominated by the multiplication. So with this algorithm if you have 2d binary digits to convert, you can expect it will take about O(21.58 d) steps to do so. In your case you have 225 bits to convert, so that should take about 777 billion calculations. The key fact here is that this algorithm is utterly dominated by the cost of the multiplication. If you can reduce the cost of the multiplication to less than O(n1.58) then you get huge wins. If I were you I would be studying improvements to decimal multiplication algorithms over Karatsuba.

I don't know if this is any faster, but here is an example in delphi I wrote a long time ago to handle big ints as strings (VERY quick and dirty) - this was for 128bit uint but you could extend it indefinitely <pre class="prettyprint"><code>Function HexToBinShort(hex:integer):string; begin case hex of 0: result:='0000'; //convert each hex digit to binary string 1: result:='0001'; //could do this with high-nybble and low nybble 2: result:='0010'; //of each sequential byte in the array (mask and bit shift) 3: result:='0011'; //ie: binstring:=binstring + HexToBinShort(nybble[i]) 4: result:='0100'; //but must count DOWN for i (start with MSB!) 5: result:='0101'; 6: result:='0110'; 7: result:='0111'; 8: result:='1000'; 9: result:='1001'; 10: result:='1010'; 11: result:='1011'; 12: result:='1100'; 13: result:='1101'; 14: result:='1110'; 15: result:='1111'; end; end; </code></pre> Then take the concatenated binary string and add powers of two each time you see a '1' <pre class="prettyprint"><code>Function BinToIntString(binstring:string):string; var i, j : integer; var calcHold, calc2 :string; begin calc2:=binstring[Length(binstring)]; // first bit is easy 0 or 1 for i := (Length(binstring) - 1) downto 1 do begin if binstring[i] = '1' then begin calcHold:=generateCard(Length(binstring)-i); calc2 := AddDecimalStrings(calcHold, calc2); end; end; result:=calc2; end; </code></pre> generateCard is used to create a decimal string representation of 2^i (for i>0) <pre class="prettyprint"><code>Function generateCard(i:integer):string; var j : integer; var outVal : string; begin outVal := '2'; if i > 1 then begin for j := 2 to i do begin outVal:= MulByTwo(outVal); end; end; result := outVal; end; </code></pre> and MulByTwo multiplies a decimal string by two <pre class="prettyprint"><code>Function MulByTwo(val:string):string; var i : integer; var carry, hold : integer; var outHold : string; var outString :string; var outString2 : string; begin outString:= StringOfChar('0', Length(val) + 1); outString2:= StringOfChar('0', Length(val)); carry :=0; for i := Length(val) downto 1 do begin hold := StrToInt(val[i]) * 2 + carry; if hold >= 10 then begin carry := 1; hold := hold - 10; end else begin carry := 0; end; outHold := IntToStr(hold); outString[i+1] := outHold[1]; end; if carry = 1 then begin outString[1] := '1'; result := outString; end else begin for i := 1 to length(outString2) do outString2[i]:=outString[i+1]; result := outString2; end; end; </code></pre> And finally - AddDecimalStrings...well, it adds two decimal strings : <pre class="prettyprint"><code>Function AddDecimalStrings(val1, val2:string):string; var i,j :integer; var carry, hold, largest: integer; var outString, outString2, bigVal, smVal, outHold:string; begin if Length(val1) > Length(val2) then begin largest:= Length(val1); bigVal := val1; smVal := StringOfChar('0', largest); j:=1; for i := (largest - length(val2) +1) to largest do begin smVal[i] := val2[j]; j:=j+1; end; end else begin if length(val2) > Length(val1) then begin largest:=Length(val2); bigVal:=val2; smVal := StringOfChar('0', largest); j:=1; for i := (largest - length(val1) +1) to largest do begin smVal[i] := val1[j]; j:=j+1; end; end else begin largest:=length(val1); bigVal:=val1; smVal:=val2; end; end; carry:=0; outString:=StringOfChar('0', largest +1); outString2:=StringOfChar('0', largest); for i := largest downto 1 do begin hold := StrToInt(bigVal[i]) + StrToInt(smVal[i]) + carry; if hold >=10 then begin carry:=1; hold := hold - 10; end else begin carry:=0; end; outHold:= IntToStr(hold); outString[i+1]:=outHold[1]; end; if carry = 1 then begin outString[1] := '1'; result := outString; end else begin for i := 1 to length(outString2) do outString2[i]:=outString[i+1]; result := outString2; end; end; </code></pre> These functions allow you to perform basic arithmetic on almost arbitrarily large integers as strings. You hit another wall when the number of digits is too big to index an array with, of course. Here's a divide by two, btw (useful for going the other way...). I don't handle odd numbers here. <pre class="prettyprint"><code>Function DivByTwo(val:string):string; var i : integer; var hold : integer; var outHold : string; var outString, outString2 :string; begin outString:=StringOfChar('0',Length(val)); for i := Length(val) downto 1 do begin if StrToInt(val[i]) mod 2 = 0 then begin hold:= Math.Floor(StrToInt(val[i]) / 2); outHold:= IntToStr(hold); outString[i]:=outHold[1]; end else begin hold:= Math.Floor((StrToInt(val[i]) - 1) / 2); outHold:=IntToStr(hold); outString[i]:= outHold[1]; if i <> Length(val) then begin hold:= StrToInt(outString[i+1]) + 5; outHold:= IntToStr(hold); outString[i+1] := outHold[1]; end; end; end; outString2:=StringOfChar('0',Length(val)-1); if (outString[1] = '0') and (length(outString) > 1) then begin for i := 1 to length(outString2) do outString2[i]:=outString[i+1]; result:=outString2; end else begin result:=outString; end; end; </code></pre> EDIT: I just tried this with a 9million bit long binary string and it is ludicrously slow! Not surprising, really. This is completely unoptimized code which has a lot of low hanging fruit to pick at for speeding things up. Still, I can't help but feel that this is the kind (or scale) of problem that you would probably want to write, at least partially, in fully optimized assembly. The individual operations are small but they must be done many times - that spells begging for assembly. Multithreading could certainly be leveraged here too.

Convert very huge number to base 10 string

3 Answers

I've been thinking about your problem. I don't have a solution coded up, but here is an approach:

First off, let's assume without loss of generality that you have a collection of 2ⁿ bits. (If you have fewer than exactly 2ⁿ bits, pad out the bit array with leading zeros until you do. Obviously doing so never more than doubles the size of the array.) In your case you say you have a million uints, so that's 2²⁵ bits.

Let's also assume that every collection of 2^{k + 1} bits can be split evenly into two collections of bits, the left and right collections, each with 2^k bits.

Thus every bit collection, or sub-collection, has a "size" which is an exact power of two. The smallest possible collection contains a single bit and cannot be further subdivided.

Second, let's assume that you similarly have an immutable representation of a number in decimal form, and that again, without loss of generality, there are 2^d decimal digits in the string. If there are fewer than exactly 2^d, again, pad with leading zeros. Again, each decimal collection of size greater than one can be split into two collections each half the size.

Now we sketch out a recursive algorithm:

DigitCollection Convert(BitCollection bits)
{
    if (bits.Size == 1)
        return bits[0] ? DigitCollection.SingleOne : DigitCollection.SingleZero;
    DigitCollection left = Convert(bits.Left);        
    DigitCollection right = Convert(bits.Right);
    DigitCollection power = MakePowerOfTwo(bits.Right.Size);
    return Add(Multiply(left, power), right);
}

We now need methods Add, Multiply and MakePowerOfTwo. As we'll see in a moment, we'll also need Subtract, and two Shift operators, for rapidly multiplying by powers of ten.

Addition and subtraction are easy. Clearly if the longer collection contains n digits then addition and subtraction methods can be implemented to take O(n) time.

FullShift and HalfShift operators make new digit collections out of old ones to facilitate rapid multiplication by powers of ten. If a digit collection of size 2^d+1 consists of sub-collections (X1, X2) each of size 2^d then the "half-shifted" collection contains 2^d+2 items and consists of ( (2^d leading zeros, X1), (X2, 2^d trailing zeros)). The full-shift collection consists of ((X1, X2), (2^d+1 trailing zeros)).

These are obviously very cheap to construct.

Multiplication is where we run into big problems. Suppose without loss of generality that we are multiplying together two DigitCollections each with exactly 2^d digits. We can use Karatsuba's Algorithm:

DigitCollection Multiply(DigitCollection x, DigitCollection y)
{
    // Assume x and y are the same digit size.
    if (x and y are both single digits)
        return the trivial solution;
    // Assume that z2, z1 and z0 are all the same digit size as x and y.
    DigitCollection z2 = Multiply(x.Left, y.Left);
    DigitCollection z0 = Multiply(x.Right, y.Right);
    DigitCollection z1 = Subtract(
        Multiply(Add(x.Left, x.Right), Add(y.Left, y.Right)),
        Add(z2, z0));
    return Add(Add(FullShift(z2), HalfShift(z1)), z0);
}

What is the order of this algorithm? Suppose there are n = 2^d digits. What is O(Multiply(n))? We recurse three times, each with a problem with half as many digits. The rest of the add, subtract and shift operations are each no more than O(n). So we have a recurrance:

T(n) = 3 T(n/2) + O(n)

Which has an easy solution via the Master Theorem: this algorithm is O(n^{1/lg 3}) which is about O(n^1.58).

What about MakePowerOfTwo? That's easy given what we already have. We use the identity:

2^{2n + 1} = 2ⁿ x 2ⁿ + 2ⁿ x 2ⁿ

and write the algorithm:

DigitCollection MakePowerOfTwo(int p)
{
    if (p == 0) return DigitCollection.SingleOne;
    DigitCollection power = MakePowerOfTwo(p / 2);
    power = Multiply(power, power);
    if (p % 2 != 0)
        power = Add(power, power);
    return power;
}

It is dominated by the computation of the multiplication, and so is O(n^1.58) as well.

And now we can see that the original call to Convert is also dominated by the multiplication.

So with this algorithm if you have 2^d binary digits to convert, you can expect it will take about O(2^{1.58 d}) steps to do so. In your case you have 2²⁵ bits to convert, so that should take about 777 billion calculations.

The key fact here is that this algorithm is utterly dominated by the cost of the multiplication. If you can reduce the cost of the multiplication to less than O(n^1.58) then you get huge wins. If I were you I would be studying improvements to decimal multiplication algorithms over Karatsuba.

129

answered Sep 18 '22 23:09

Eric Lippert

I don't know if this is any faster, but here is an example in delphi I wrote a long time ago to handle big ints as strings (VERY quick and dirty) - this was for 128bit uint but you could extend it indefinitely

Function HexToBinShort(hex:integer):string;
begin
  case hex of
    0:  result:='0000';  //convert each hex digit to binary string
    1:  result:='0001';  //could do this with high-nybble and low nybble
    2:  result:='0010';  //of each sequential byte in the array (mask and bit shift)
    3:  result:='0011';  //ie: binstring:=binstring + HexToBinShort(nybble[i])
    4:  result:='0100';  //but must count DOWN for i (start with MSB!)
    5:  result:='0101';
    6:  result:='0110';
    7:  result:='0111';
    8:  result:='1000';
    9:  result:='1001';
    10: result:='1010';
    11: result:='1011';
    12: result:='1100';
    13: result:='1101';
    14: result:='1110';
    15: result:='1111';
  end;
end;

Then take the concatenated binary string and add powers of two each time you see a '1'

Function BinToIntString(binstring:string):string;
var i, j : integer;
var calcHold, calc2 :string;
begin
  calc2:=binstring[Length(binstring)];   // first bit is easy 0 or 1
  for i := (Length(binstring) - 1) downto 1 do begin       
    if binstring[i] = '1' then begin   
       calcHold:=generateCard(Length(binstring)-i);
       calc2 := AddDecimalStrings(calcHold, calc2);
    end;
  end;
  result:=calc2;
end;

generateCard is used to create a decimal string representation of 2^i (for i>0)

Function generateCard(i:integer):string;
var j : integer;
var outVal : string;
begin
  outVal := '2';
  if i > 1 then begin
    for j := 2 to i do begin
      outVal:= MulByTwo(outVal);
    end;
  end;
  result := outVal;
end;

and MulByTwo multiplies a decimal string by two

Function MulByTwo(val:string):string;
var i : integer;
var carry, hold : integer;
var outHold : string;
var outString :string;
var outString2 : string;
begin
  outString:= StringOfChar('0', Length(val) + 1);
  outString2:= StringOfChar('0', Length(val));
  carry :=0;
  for i := Length(val) downto 1 do begin
    hold := StrToInt(val[i]) * 2 + carry;
    if hold >= 10 then begin
      carry := 1;
      hold := hold - 10;
    end else begin
      carry := 0;
    end;
    outHold := IntToStr(hold);
    outString[i+1] := outHold[1];
  end;
  if carry = 1 then begin
    outString[1] := '1';
    result := outString;
  end else begin
    for i := 1 to length(outString2) do outString2[i]:=outString[i+1];
    result := outString2;
  end;
end;

And finally - AddDecimalStrings...well, it adds two decimal strings :

Function AddDecimalStrings(val1, val2:string):string;
var i,j :integer;
var carry, hold, largest: integer;
var outString, outString2, bigVal, smVal, outHold:string;
begin
  if Length(val1) > Length(val2) then begin
    largest:= Length(val1);
    bigVal := val1;
    smVal := StringOfChar('0', largest);
    j:=1;
    for i := (largest - length(val2) +1) to largest do begin
      smVal[i] := val2[j];
      j:=j+1;
    end;
  end else begin
    if length(val2) > Length(val1) then begin
      largest:=Length(val2);
      bigVal:=val2;
      smVal := StringOfChar('0', largest);
      j:=1;
      for i := (largest - length(val1) +1) to largest do begin
        smVal[i] := val1[j];
        j:=j+1;
      end;
    end else begin
      largest:=length(val1);
      bigVal:=val1;
      smVal:=val2;
    end;
  end;
  carry:=0;
  outString:=StringOfChar('0', largest +1);
  outString2:=StringOfChar('0', largest);

  for i := largest downto 1 do begin
    hold := StrToInt(bigVal[i]) + StrToInt(smVal[i]) + carry;
    if hold >=10 then begin
      carry:=1;
      hold := hold - 10;
    end else begin
      carry:=0;
    end;
    outHold:= IntToStr(hold);
    outString[i+1]:=outHold[1];
  end;

  if carry = 1 then begin
    outString[1] := '1';
    result := outString;
  end else begin
    for i := 1 to length(outString2) do outString2[i]:=outString[i+1];
    result := outString2;
  end;  
end;

These functions allow you to perform basic arithmetic on almost arbitrarily large integers as strings. You hit another wall when the number of digits is too big to index an array with, of course.

Here's a divide by two, btw (useful for going the other way...). I don't handle odd numbers here.

Function DivByTwo(val:string):string;
var i : integer;
var hold : integer;
var outHold : string;
var outString, outString2 :string;
begin
  outString:=StringOfChar('0',Length(val));

  for i := Length(val) downto 1 do begin
    if StrToInt(val[i]) mod 2 = 0 then begin
      hold:= Math.Floor(StrToInt(val[i]) / 2);
      outHold:= IntToStr(hold);
      outString[i]:=outHold[1];
    end else begin
      hold:= Math.Floor((StrToInt(val[i]) - 1) / 2);
      outHold:=IntToStr(hold);
      outString[i]:= outHold[1];
      if i <> Length(val) then begin
        hold:= StrToInt(outString[i+1]) + 5;
        outHold:= IntToStr(hold);
        outString[i+1] := outHold[1];
      end;
    end;
  end;
  outString2:=StringOfChar('0',Length(val)-1);
  if (outString[1] = '0') and (length(outString) > 1) then begin
    for i := 1 to length(outString2) do outString2[i]:=outString[i+1];
    result:=outString2;
  end else begin
    result:=outString;
  end;
end;

EDIT: I just tried this with a 9million bit long binary string and it is ludicrously slow! Not surprising, really. This is completely unoptimized code which has a lot of low hanging fruit to pick at for speeding things up. Still, I can't help but feel that this is the kind (or scale) of problem that you would probably want to write, at least partially, in fully optimized assembly. The individual operations are small but they must be done many times - that spells begging for assembly. Multithreading could certainly be leveraged here too.

answered Sep 18 '22 23:09

J...

You might be able to save some time by doing more than one digit at a time. if you do it, say, 100,000 at a time, it'll likely go at least a little faster than 10 at a time.

Mind you, it's still likely to be pretty painfully slow, but it'll save you some time.

It's conceivable that you could make it recursive, and speed it up that much more - get the rough square root of the number, rounded down to the nearest exponent of 10. div and mod by that number, and send the results back to the same function. Mind you, I'm not sure how you'd go about efficiently running a div or mod of that size, but if you can figure it out (and don't run out of memory) it's still bound to be more time-efficient than dividing it out a digit at a time.

Alternate version: give up on decimals - since the number's going to be way too large to make sense to any actual human readers anyway - and render the thing in hex. Still technically human-readable, but you can render it a byte at a time and save yourself a whole lot of heartache.

answered Sep 17 '22 23:09

Ben Barden

Related questions
                            
                                How can I throw an exception and add in my own message containing a key and a value?
                            
                                why we need EndInvoke() to return value of asynchronous call in delegate?
                            
                                MonoTouch.Dialog: Back item in NavigationBar
                            
                                LINQ: Searching a collection within a collection for a single object
                            
                                Open and read txt file in ASP
                            
                                Intern string literals misunderstanding?
                            
                                How to get index of current records in C# ASP.NET ListView in Render Time
                            
                                Check if an object's type is from a particular namespace
                            
                                Entity Framework, soft-deleting and queries
                            
                                How to run Run CellEndEdit only when Cell ValueChanged in DataGridView
                            
                                Return from a function using a breakpoint
                            
                                Formatting string to 10 characters
                            
                                entity framework code first - Union of the two fields into one collection
                            
                                Constrained generics with hierarchy in type parameter
                            
                                where not in () statement with lambda
                            
                                Representing Numbers in Exponential form
                            
                                How to get value from applicationSettings?
                            
                                How to check if Unicode character has diacritics in .Net?
                            
                                How to read the contents of an entire disk bit by bit
                            
                                CRM Retrieve entity - Error: Unable to cast object of type 'Microsoft.Xrm.Sdk.Entity' to type 'CRMEntities.List'

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Convert very huge number to base 10 string

Tags:

c#

math

bytecode77

People also ask

3 Answers

Eric Lippert

J...

Ben Barden

Recent Activity

Donate For Us