I always use unsigned int for values that should never be negative. But today I noticed this situation in my code:
void CreateRequestHeader( unsigned bitsAvailable, unsigned mandatoryDataSize, unsigned optionalDataSize ) { If ( bitsAvailable – mandatoryDataSize >= optionalDataSize ) { // Optional data fits, so add it to the header. } // BUG! The above includes the optional part even if // mandatoryDataSize > bitsAvailable. }
Should I start using int instead of unsigned int for numbers, even if they can't be negative?
C and C++ are unusual amongst languages nowadays in making a distinction between signed and unsigned integers. An int is signed by default, meaning it can represent both positive and negative values.
Signed integers are stored in a computer using 2's complement. It consist both negative and positive values but in different formats like (-1 to -128) or (0 to +127) . An unsigned integer can hold a larger positive value, and no negative value like (0 to 255) . Unlike C++ there is no unsigned integer in Java.
An n-bit unsigned variable has a range of 0 to (2n)-1. When no negative numbers are required, unsigned integers are well-suited for networking and systems with little memory, because unsigned integers can store more positive numbers without taking up extra memory.
You simply cannot assign a negative value to an object of an unsigned type. Any such value will be converted to the unsigned type before it's assigned, and the result will always be >= 0.
One thing that hasn't been mentioned is that interchanging signed/unsigned numbers can lead to security bugs. This is a big issue, since many of the functions in the standard C-library take/return unsigned numbers (fread, memcpy, malloc etc. all take size_t
parameters)
For instance, take the following innocuous example (from real code):
//Copy a user-defined structure into a buffer and process it char* processNext(char* data, short length) { char buffer[512]; if (length <= 512) { memcpy(buffer, data, length); process(buffer); return data + length; } else { return -1; } }
Looks harmless, right? The problem is that length
is signed, but is converted to unsigned when passed to memcpy
. Thus setting length to SHRT_MIN
will validate the <= 512
test, but cause memcpy
to copy more than 512 bytes to the buffer - this allows an attacker to overwrite the function return address on the stack and (after a bit of work) take over your computer!
You may naively be saying, "It's so obvious that length needs to be size_t
or checked to be >= 0
, I could never make that mistake". Except, I guarantee that if you've ever written anything non-trivial, you have. So have the authors of Windows, Linux, BSD, Solaris, Firefox, OpenSSL, Safari, MS Paint, Internet Explorer, Google Picasa, Opera, Flash, Open Office, Subversion, Apache, Python, PHP, Pidgin, Gimp, ... on and on and on ... - and these are all bright people whose job is knowing security.
In short, always use size_t
for sizes.
Man, programming is hard.
Should I always ...
The answer to "Should I always ..." is almost certainly 'no', there are a lot of factors that dictate whether you should use a datatype- consistency is important.
But, this is a highly subjective question, it's really easy to mess up unsigneds:
for (unsigned int i = 10; i >= 0; i--);
results in an infinite loop.
This is why some style guides including Google's C++ Style Guide discourage unsigned
data types.
In my personal opinion, I haven't run into many bugs caused by these problems with unsigned data types — I'd say use assertions to check your code and use them judiciously (and less when you're performing arithmetic).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With