Doubtless this seems like a strange request, given the availability of ToString() and Convert.ToString(), but I need to convert an unsigned integer (i.e. UInt32) to its string representation, and I need to store the answer in a char[].
The reason is that I am working with character arrays for efficiency, and since the target char[] is initialised as a member to char[10] (to hold the string representation of UInt32.MaxValue) on object creation, it should theoretically be possible to do the conversion without generating any garbage (by which I mean without creating any temporary objects on the managed heap).
Can anyone see a neat way to achieve this?
(I'm working in .NET Framework 3.5 SP1, in case that is in any way relevant.)
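To make the setup concrete, here is a minimal sketch of what I mean (the class and member names are purely illustrative, not my actual code):

// Illustrative sketch only: all names here are hypothetical.
class MessageWriter
{
    // Allocated once on object creation: 10 chars is enough for UInt32.MaxValue ("4294967295").
    private readonly char[] _numberBuffer = new char[10];

    public void Append(uint value)
    {
        // This is the routine I'm after: fill the buffer with the digits of value,
        // starting at index 0, and return how many characters were written,
        // all without allocating anything on the managed heap.
        int length = ConvertToChars(value, _numberBuffer, 0);
        // ... consume _numberBuffer[0 .. length) ...
    }

    // Hypothetical signature of the conversion being asked about.
    private static int ConvertToChars(uint value, char[] buffer, int bufferIndex)
    {
        throw new System.NotImplementedException();
    }
}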
Further to my comment above, I wondered if log10 was too slow, so I wrote a version that doesn't use it.
For four-digit numbers this version is about 35% quicker, falling to about 16% quicker for ten-digit numbers.
One disadvantage is that it requires space for the full ten digits in the buffer.
I don't swear it doesn't have any bugs!
public static int ToCharArray2(uint value, char[] buffer, int bufferIndex)
{
    const int maxLength = 10; // UInt32.MaxValue has 10 digits

    if (value == 0)
    {
        buffer[bufferIndex] = '0';
        return 1;
    }

    // Write the digits backwards, starting at the end of the ten-digit slot.
    int startIndex = bufferIndex + maxLength - 1;
    int index = startIndex;
    do
    {
        buffer[index] = (char)('0' + value % 10);
        value /= 10;
        --index;
    }
    while (value != 0);

    int length = startIndex - index;

    // If the number has fewer than ten digits, shift them to the start of the slot.
    if (bufferIndex != index + 1)
    {
        while (index != startIndex)
        {
            ++index;
            buffer[bufferIndex] = buffer[index];
            ++bufferIndex;
        }
    }
    return length;
}
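For what it's worth, a typical call looks like this (the string construction at the end is only to display the result and would of course allocate):

// Hypothetical usage: the buffer must have room for all ten digits.
char[] buffer = new char[10];
int length = ToCharArray2(1234567890u, buffer, 0);
Console.WriteLine(new string(buffer, 0, length)); // prints "1234567890"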
Update
I should add, I'm using a Pentium 4. More recent processors may calculate transcendental functions faster.
Conclusion
I realised yesterday that I'd made a schoolboy error and run the benchmarks on a debug build. So I ran them again but it didn't actually make much difference. The first column shows the number of digits in the number being converted. The remaining columns show the times in milliseconds to convert 500,000 numbers.
Results for uint:
digits luc1 arx henk1 luc3 henk2 luc2
1 715 217 966 242 837 244
2 877 420 1056 541 996 447
3 1059 608 1169 835 1040 610
4 1184 795 1282 1116 1162 801
5 1403 969 1405 1396 1279 978
6 1572 1149 1519 1674 1399 1170
7 1740 1335 1648 1952 1518 1352
8 1922 1675 1868 2233 1750 1545
9 2087 1791 2005 2511 1893 1720
10 2263 2103 2139 2797 2012 1985
Results for ulong:
digits luc1 arx henk1 luc3 henk2 luc2
1 802 280 998 390 856 317
2 912 516 1102 729 954 574
3 1066 746 1243 1060 1056 818
4 1300 1141 1362 1425 1170 1210
5 1557 1363 1503 1742 1306 1436
6 1801 1603 1612 2233 1413 1672
7 2269 1814 1723 2526 1530 1861
8 2208 2142 1920 2886 1634 2149
9 2360 2376 2063 3211 1775 2339
10 2615 2622 2213 3639 2011 2697
11 3048 2996 2513 4199 2244 3011
12 3413 3607 2507 4853 2326 3666
13 3848 3988 2663 5618 2478 4005
14 4298 4525 2748 6302 2558 4637
15 4813 5008 2974 7005 2712 5065
16 5161 5654 3350 7986 2994 5864
17 5997 6155 3241 8329 2999 5968
18 6490 6280 3296 8847 3127 6372
19 6440 6720 3557 9514 3386 6788
20 7045 6616 3790 10135 3703 7268
luc1: Lucero's first function
arx: my function
henk1: Henk's function
luc3: Lucero's third function
henk2: Henk's function without the copy to the char array, i.e. just testing the performance of ToString()
luc2: Lucero's second function
The peculiar order is the order they were created in.
I also ran the test without henk1 and henk2 so there would be no garbage collection. The times for the other three functions were nearly identical. Once the benchmark had gone past three digits the memory use was stable: so GC was happening during Henk's functions and didn't have a detrimental effect on the other functions.
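For reference, a timing loop of roughly this shape reproduces the measurements above (a hedged reconstruction with made-up details such as the random seed, not the exact harness I used):

// Sketch of a per-digit-count benchmark: 500,000 conversions per row, times in milliseconds.
char[] buffer = new char[10];
var random = new Random(12345);
for (int digits = 1; digits <= 10; digits++)
{
    uint min = digits == 1 ? 0u : (uint)Math.Pow(10, digits - 1);
    uint max = digits == 10 ? uint.MaxValue : (uint)Math.Pow(10, digits) - 1;
    var stopwatch = System.Diagnostics.Stopwatch.StartNew();
    for (int i = 0; i < 500000; i++)
    {
        uint value = min + (uint)(random.NextDouble() * (max - min));
        ToCharArray2(value, buffer, 0);
    }
    stopwatch.Stop();
    Console.WriteLine("{0,2} digits: {1} ms", digits, stopwatch.ElapsedMilliseconds);
}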
Conclusion: just call ToString()
The following code does it, with one caveat: it does not respect the culture settings, but always outputs plain decimal digits.
public static int ToCharArray(uint value, char[] buffer, int bufferIndex) {
    if (value == 0) {
        buffer[bufferIndex] = '0';
        return 1;
    }
    // Number of digits is Floor(Log10(value)) + 1; using Ceiling alone is off by one
    // for exact powers of ten (and for 1).
    int len = (int)Math.Floor(Math.Log10(value)) + 1;
    for (int i = len - 1; i >= 0; i--) {
        buffer[bufferIndex + i] = (char)('0' + (value % 10));
        value /= 10;
    }
    return len;
}
The returned value is how much of the char[] has been used.
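A quick way to sanity-check the output is to compare it against ToString() with the invariant culture, for example:

// Compare the buffer contents with the invariant-culture ToString() result.
char[] buffer = new char[10];
uint value = uint.MaxValue;
int len = ToCharArray(value, buffer, 0);
string converted = new string(buffer, 0, len);
Console.WriteLine(converted == value.ToString(System.Globalization.CultureInfo.InvariantCulture)); // True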
Edit (for arx): the following version avoids the floating-point math and reverses the digits in the buffer in place:
public static int ToCharArray(uint value, char[] buffer, int bufferIndex) {
    if (value == 0) {
        buffer[bufferIndex] = '0';
        return 1;
    }
    // Write the digits least-significant first...
    int bufferEndIndex = bufferIndex;
    while (value > 0) {
        buffer[bufferEndIndex++] = (char)('0' + (value % 10));
        value /= 10;
    }
    int len = bufferEndIndex - bufferIndex;
    // ...then reverse them in place.
    while (--bufferEndIndex > bufferIndex) {
        char ch = buffer[bufferEndIndex];
        buffer[bufferEndIndex] = buffer[bufferIndex];
        buffer[bufferIndex++] = ch;
    }
    return len;
}
And here yet another variation which computes the number of digits in a small loop:
public static int ToCharArray(uint value, char[] buffer, int bufferIndex) {
    if (value == 0) {
        buffer[bufferIndex] = '0';
        return 1;
    }
    // Count the digits first...
    int len = 1;
    for (uint rem = value / 10; rem > 0; rem /= 10) {
        len++;
    }
    // ...then fill the buffer from the last digit backwards.
    for (int i = len - 1; i >= 0; i--) {
        buffer[bufferIndex + i] = (char)('0' + (value % 10));
        value /= 10;
    }
    return len;
}
I leave the benchmarking to whoever wants to do it... ;)