
Efficiently convert between Hex, Binary, and Decimal in C/C++

I have 3 base representations for positive integer numbers:

  1. Decimal, in unsigned long variable (e.g. unsigned long int NumDec = 200).
  2. Hex, in string variable (e.g. string NumHex = "C8")
  3. Binary, in string variable (e.g. string NumBin = "11001000")

I want to be able to convert between numbers in all 3 representations in the most efficient way. I.e. to implement the following 6 functions:

unsigned long int Binary2Dec(const string & Bin) {}
unsigned long int Hex2Dec(const string & Hex) {}
string Dec2Hex(unsigned long int Dec) {}
string Binary2Hex(const string & Bin) {}
string Dec2Binary(unsigned long int Dec) {}
string Hex2Binary(const string & Hex) {}

What is the most efficient approach for each of them? I can use C and C++, but not boost.

Edit: By "efficiency" I mean time efficiency: Shortest execution time.

Igor asked May 04 '09 09:05


5 Answers

As others have pointed out, I would start with sscanf(), printf() and/or strtoul(). They are fast enough for most applications, and they are less likely to have bugs. I will say, however, that these functions are more generic than you might expect, as they have to deal with non-ASCII character sets, with numbers represented in any base and so forth. For some domains it is possible to beat the library functions.

So, measure first, and if the performance of these conversions is really an issue, then:

1) In some applications or domains, certain numbers appear very often. For example, zero, 100, 200, or 19.95 may be so common that it makes sense to handle them with a bunch of if() statements and then fall back to the generic library functions.

2) Use a table lookup for the 100 most common numbers, and then fall back on a library function. Remember that large tables may not fit in your cache and may require multiple indirections for shared libraries, so measure these things carefully to make sure you are not decreasing performance.
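The fast-path idea above can be sketched as follows. This is a minimal illustration, not a tuned implementation: the names Dec2HexLibrary, SmallHexTable, and Dec2HexCached are hypothetical, and the assumption that values below 100 are the "common" ones is made up for the example. Whether it beats the plain library call has to be measured on the real workload.

```cpp
#include <array>
#include <cstdio>
#include <string>

// Hypothetical slow path: format through the C library.
static std::string Dec2HexLibrary(unsigned long value) {
    char buf[2 * sizeof(unsigned long) + 1];  // two hex digits per byte + NUL
    std::snprintf(buf, sizeof buf, "%lX", value);
    return buf;
}

// Assumption: values 0..99 are the frequent ones in this workload.
// The table is filled once, on first use, by the library path itself.
static const std::array<std::string, 100>& SmallHexTable() {
    static const std::array<std::string, 100> table = [] {
        std::array<std::string, 100> t;
        for (unsigned long i = 0; i < t.size(); ++i) {
            t[i] = Dec2HexLibrary(i);
        }
        return t;
    }();
    return table;
}

// Fast path for common values, generic library fallback for the rest.
std::string Dec2HexCached(unsigned long value) {
    if (value < 100) {
        return SmallHexTable()[value];
    }
    return Dec2HexLibrary(value);
}
```

Because the table is generated by the fallback function, both paths produce identical text, which keeps the unit tests simple.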

You may also want to look at the boost lexical_cast functions, though in my experience they are relatively slow compared to the good old C functions.

Though many have said it, it is worth repeating over and over: do not optimize these conversions until you have evidence that they are a problem. If you do optimize, measure your new implementation to make sure it is faster, and make sure you have a ton of unit tests for your own version, because you will introduce bugs :-(

coryan answered Nov 19 '22 09:11


I would suggest just using sprintf and sscanf.
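As a rough sketch of that suggestion for the hex functions (the names Hex2DecScanf and Dec2HexPrintf are hypothetical, snprintf is used instead of sprintf to bound the write, and returning 0 on a failed parse is an assumption of this example):

```cpp
#include <cstdio>
#include <string>

// Parse hex text with sscanf. "%lx" reads an unsigned long in base 16.
unsigned long Hex2DecScanf(const std::string& hex) {
    unsigned long value = 0;
    // sscanf returns the number of items assigned; anything but 1 is a failure.
    if (std::sscanf(hex.c_str(), "%lx", &value) != 1) {
        return 0;  // assumption: report 0 when nothing could be parsed
    }
    return value;
}

// Format an unsigned long as uppercase hex text with snprintf.
std::string Dec2HexPrintf(unsigned long value) {
    char buf[2 * sizeof(unsigned long) + 1];  // two hex digits per byte + NUL
    std::snprintf(buf, sizeof buf, "%lX", value);
    return std::string(buf);
}
```

The printf and scanf families have no long-standing portable conversion specifier for base 2, so the binary functions need a different approach.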

Also, if you're interested in how it's implemented you can take a look at the source code for glibc, the GNU C Library.

Robert S. Barnes answered Nov 19 '22 10:11


Why do these routines have to be so time-efficient? That sort of claim always makes me wonder. Are you sure the obvious conversion methods like strtol() are too slow, or that you can do better? System functions are usually pretty efficient. They are sometimes slower to support generality and error-checking, but you need to consider what to do with errors. If a bin argument has characters other than '0' and '1', what then? Abort? Propagate massive errors?

Why are you using "Dec" to represent the internal representation? Dec, Hex, and Bin should be used to refer to the string representations. There's nothing decimal about an unsigned long. Are you dealing with strings showing the number in decimal? If not, you're confusing people here and are going to confuse many more.

The transformation between binary and hex text formats can be done quickly and efficiently, with lookup tables, but anything involving decimal text format will be more complicated.
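A minimal sketch of the binary/hex text conversion, working one hex digit (four bits) at a time. The function names are hypothetical, and returning an empty string for invalid input is an assumption of this example, not something the question specifies.

```cpp
#include <cstddef>
#include <string>

// Each hex digit maps to exactly four bits, so a 16-entry table suffices.
std::string Hex2BinaryTable(const std::string& hex) {
    static const char* const kNibbles[16] = {
        "0000", "0001", "0010", "0011", "0100", "0101", "0110", "0111",
        "1000", "1001", "1010", "1011", "1100", "1101", "1110", "1111"};
    std::string out;
    out.reserve(hex.size() * 4);
    for (char c : hex) {
        int v;
        if (c >= '0' && c <= '9') v = c - '0';
        else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
        else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
        else return std::string();  // assumption: empty string means bad input
        out += kNibbles[v];
    }
    return out;
}

// Group bits in fours from the right; the leftmost group may be shorter.
std::string Binary2HexTable(const std::string& bin) {
    static const char kDigits[] = "0123456789ABCDEF";
    std::string out;
    out.reserve((bin.size() + 3) / 4);
    std::size_t group = bin.size() % 4;  // width of the leftmost group
    if (group == 0) group = 4;
    std::size_t i = 0;
    while (i < bin.size()) {
        unsigned v = 0;
        for (std::size_t k = 0; k < group; ++k, ++i) {
            if (bin[i] != '0' && bin[i] != '1') return std::string();
            v = (v << 1) | static_cast<unsigned>(bin[i] - '0');
        }
        out += kDigits[v];
        group = 4;  // every later group is a full nibble
    }
    return out;
}
```

Neither function goes through an integer, so the strings can be longer than an unsigned long could hold.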

David Thornley answered Nov 19 '22 08:11


That depends on what you're optimizing for. What do you mean by "efficient"? Is it important that the conversions be fast, use little memory, take little programmer time, draw fewer WTFs from other programmers reading the code, or what?

For readability and ease of implementation, you should at least implement both Hex2Dec() and Binary2Dec() by just calling strtoul(), which parses a string in a given base into an unsigned long. That makes them into one-liners, which is very efficient for at least some of the above interpretations of the word.
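Since strtoul() parses text into an unsigned long, it covers the string-to-integer direction. A sketch of the two one-liners, with hypothetical names so they do not clash with the question's prototypes:

```cpp
#include <cstdlib>
#include <string>

// Base 16: parses hex digits in either case.
unsigned long Hex2DecStrtoul(const std::string& hex) {
    return std::strtoul(hex.c_str(), nullptr, 16);
}

// Base 2: parses a run of '0' and '1' characters.
unsigned long Binary2DecStrtoul(const std::string& bin) {
    return std::strtoul(bin.c_str(), nullptr, 2);
}
```

Note that strtoul() stops at the first character that is not a valid digit in the given base and returns 0 when no digits were parsed at all. Pass a real end pointer instead of nullptr if you need to detect that case.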

unwind answered Nov 19 '22 09:11


Sounds very much like a homework problem, but what the heck...

The short answer: to convert from long int to your strings, use two lookup tables. Each table should have 256 entries. One maps a byte to a hex string: 0 -> "00", 1 -> "01", etc. The other maps a byte to a bit string: 0 -> "00000000", 1 -> "00000001", etc.

Then for each byte in your long int you just have to look up the correct string, and concatenate them.
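A sketch of the two 256-entry tables and the per-byte concatenation. All names here are hypothetical, the code assumes 8-bit bytes, and the output is fixed width, so leading zeros are kept (for a 64-bit unsigned long, 200 becomes sixteen hex characters ending in "C8").

```cpp
#include <array>
#include <string>

// Entry b holds the two-character hex text of byte value b.
static std::array<std::string, 256> BuildHexTable() {
    static const char kDigits[] = "0123456789ABCDEF";
    std::array<std::string, 256> t;
    for (unsigned b = 0; b < 256; ++b) {
        t[b] = std::string(1, kDigits[b >> 4]) + kDigits[b & 0xF];
    }
    return t;
}

// Entry b holds the eight-character bit text of byte value b.
static std::array<std::string, 256> BuildBinTable() {
    std::array<std::string, 256> t;
    for (unsigned b = 0; b < 256; ++b) {
        std::string s(8, '0');
        for (int bit = 0; bit < 8; ++bit) {
            if (b & (1u << bit)) s[7 - bit] = '1';
        }
        t[b] = s;
    }
    return t;
}

// Walk the bytes from most to least significant and append each entry.
static std::string ConcatBytes(unsigned long value,
                               const std::array<std::string, 256>& table) {
    std::string out;
    for (int shift = static_cast<int>(sizeof(unsigned long) - 1) * 8;
         shift >= 0; shift -= 8) {
        out += table[(value >> shift) & 0xFF];
    }
    return out;
}

std::string Dec2HexBytes(unsigned long value) {
    static const std::array<std::string, 256> table = BuildHexTable();
    return ConcatBytes(value, table);
}

std::string Dec2BinaryBytes(unsigned long value) {
    static const std::array<std::string, 256> table = BuildBinTable();
    return ConcatBytes(value, table);
}
```

If the callers expect "C8" rather than a zero-padded string, the leading zeros have to be stripped afterwards, which eats into whatever the tables saved.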

To convert from strings back to long you can simply convert the hex string and the bit string back to a decimal number by multiplying the numeric value of each character by the appropriate power of 16 or 2, and summing up the results.
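The weighted sum described above does not need explicit powers: accumulating left to right and multiplying the running total by the base at each step gives the same result. A minimal sketch with hypothetical names, which stops at the first invalid character and does no overflow checking (both are assumptions of this example):

```cpp
#include <string>

// value = (((d0 * 16) + d1) * 16 + d2) ... for hex digits d0 d1 d2 ...
unsigned long Hex2DecManual(const std::string& hex) {
    unsigned long value = 0;
    for (char c : hex) {
        unsigned long digit;
        if (c >= '0' && c <= '9') digit = static_cast<unsigned long>(c - '0');
        else if (c >= 'A' && c <= 'F') digit = static_cast<unsigned long>(c - 'A' + 10);
        else if (c >= 'a' && c <= 'f') digit = static_cast<unsigned long>(c - 'a' + 10);
        else break;  // assumption: stop quietly at the first invalid character
        value = value * 16 + digit;
    }
    return value;
}

// Same scheme with base 2.
unsigned long Binary2DecManual(const std::string& bin) {
    unsigned long value = 0;
    for (char c : bin) {
        if (c != '0' && c != '1') break;  // assumption: stop at invalid input
        value = value * 2 + static_cast<unsigned long>(c - '0');
    }
    return value;
}
```

A compiler will normally turn the multiplications by 16 and 2 into shifts, so there is little to gain from writing the shifts by hand.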

EDIT: You can also use the same lookup tables for the backwards conversion by doing a binary search to find the right string. This would take log2(256) = 8 string comparisons per byte. Unfortunately I don't have time to analyze whether comparing strings would be any faster than multiplying and adding integers.

Dima answered Nov 19 '22 09:11