Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to convert int to char[] without generating garbage in C#

Tags:

c#

.net

Doubtless this seems like a strange request, given the availability of ToString() and Convert.ToString(), but I need to convert an unsigned integer (i.e. UInt32) to its string representation, but I need to store the answer into a char[].

The reason is that I am working with character arrays for efficiency, and as the target char[] is initialised as a member to char[10] (to hold the string representation of UInt32.MaxValue) on object creation, it should be theoretically possible to do the conversion without generating any garbage (by which I mean without generating any temporary objects in the managed heap.)

Can anyone see a neat way to achieve this?

(I'm working in Framework 3.5SP1 in case that is any way relevant.)

like image 343
SteveWilkinson Avatar asked Mar 06 '11 14:03

SteveWilkinson


2 Answers

Further to my comment above, I wondered if log10 was too slow, so I wrote a version that doesn't use it.

For four digit numbers this version is about 35% quicker, falling to about 16% quicker for ten digit numbers.

One disadvantage is that it requires space for the full ten digits in the buffer.

I don't swear it doesn't have any bugs!

public static int ToCharArray2(uint value, char[] buffer, int bufferIndex)
{
    const int maxLength = 10;

    if (value == 0)
    {
        buffer[bufferIndex] = '0';
        return 1;
    }

    int startIndex = bufferIndex + maxLength - 1;
    int index = startIndex;
    do
    {
        buffer[index] = (char)('0' + value % 10);
        value /= 10;
        --index;
    }
    while (value != 0);

    int length = startIndex - index;

    if (bufferIndex != index + 1)
    {
        while (index != startIndex)
        {
            ++index;
            buffer[bufferIndex] = buffer[index];
            ++bufferIndex;
        }
    }

    return length;
}

Update

I should add, I'm using a Pentium 4. More recent processors may calculate transcendental functions faster.

Conclusion

I realised yesterday that I'd made a schoolboy error and run the benchmarks on a debug build. So I ran them again but it didn't actually make much difference. The first column shows the number of digits in the number being converted. The remaining columns show the times in milliseconds to convert 500,000 numbers.

Results for uint:

    luc1   arx henk1  luc3 henk2  luc2
 1   715   217   966   242   837   244
 2   877   420  1056   541   996   447
 3  1059   608  1169   835  1040   610
 4  1184   795  1282  1116  1162   801
 5  1403   969  1405  1396  1279   978
 6  1572  1149  1519  1674  1399  1170
 7  1740  1335  1648  1952  1518  1352
 8  1922  1675  1868  2233  1750  1545
 9  2087  1791  2005  2511  1893  1720
10  2263  2103  2139  2797  2012  1985

Results for ulong:

    luc1   arx henk1  luc3 henk2  luc2
 1   802   280   998   390   856   317
 2   912   516  1102   729   954   574
 3  1066   746  1243  1060  1056   818
 4  1300  1141  1362  1425  1170  1210
 5  1557  1363  1503  1742  1306  1436
 6  1801  1603  1612  2233  1413  1672
 7  2269  1814  1723  2526  1530  1861
 8  2208  2142  1920  2886  1634  2149
 9  2360  2376  2063  3211  1775  2339
10  2615  2622  2213  3639  2011  2697
11  3048  2996  2513  4199  2244  3011
12  3413  3607  2507  4853  2326  3666
13  3848  3988  2663  5618  2478  4005
14  4298  4525  2748  6302  2558  4637
15  4813  5008  2974  7005  2712  5065
16  5161  5654  3350  7986  2994  5864
17  5997  6155  3241  8329  2999  5968
18  6490  6280  3296  8847  3127  6372
19  6440  6720  3557  9514  3386  6788
20  7045  6616  3790 10135  3703  7268

luc1: Lucero's first function

arx: my function

henk1: Henk's function

luc3 Lucero's third function

henk2: Henk's function without the copy to the char array; i.e. just test the performance of ToString().

luc2: Lucero's second function

The peculiar order is the order they were created in.

I also ran the test without henk1 and henk2 so there would be no garbage collection. The times for the other three functions were nearly identical. Once the benchmark had gone past three digits the memory use was stable: so GC was happening during Henk's functions and didn't have a detrimental effect on the other functions.

Conclusion: just call ToString()

like image 67
arx Avatar answered Sep 29 '22 11:09

arx


The following code does it, with the following caveat: it does not respect the culture settings, but always outputs normal decimal digits.

public static int ToCharArray(uint value, char[] buffer, int bufferIndex) {
    if (value == 0) {
        buffer[bufferIndex] = '0';
        return 1;
    }
    int len = (int)Math.Ceiling(Math.Log10(value));
    for (int i = len-1; i>= 0; i--) {
        buffer[bufferIndex+i] = (char)('0'+(value%10));
        value /= 10;
    }
    return len;
}

The returned value is how much of the char[] has been used.

Edit (for arx): the following version avoids the floating-point math and swaps the buffer in-place:

public static int ToCharArray(uint value, char[] buffer, int bufferIndex) {
    if (value == 0) {
        buffer[bufferIndex] = '0';
        return 1;
    }
    int bufferEndIndex = bufferIndex;
    while (value > 0) {
        buffer[bufferEndIndex++] = (char)('0'+(value%10));
        value /= 10;
    }
    int len = bufferEndIndex-bufferIndex;
    while (--bufferEndIndex > bufferIndex) {
        char ch = buffer[bufferEndIndex];
        buffer[bufferEndIndex] = buffer[bufferIndex];
        buffer[bufferIndex++] = ch;
    }
    return len;
}

And here yet another variation which computes the number of digits in a small loop:

public static int ToCharArray(uint value, char[] buffer, int bufferIndex) {
    if (value == 0) {
        buffer[bufferIndex] = '0';
        return 1;
    }
    int len = 1;
    for (uint rem = value/10; rem > 0; rem /= 10) {
        len++;
    }
    for (int i = len-1; i>= 0; i--) {
        buffer[bufferIndex+i] = (char)('0'+(value%10));
        value /= 10;
    }
    return len;
}

I leave the benchmarking to whoever wants to do it... ;)

like image 37
Lucero Avatar answered Sep 29 '22 11:09

Lucero