 

Signed vs. unsigned integers for lengths/counts

People also ask

Can signed integers represent more values than unsigned integers?

With the same number of bits (say 8), both can store 256 different values, but signed integers use half of their range for negative numbers, whereas unsigned integers can store positive numbers that are roughly twice as large. An n-bit unsigned variable has a range of 0 to 2^n - 1.

What is the difference between a signed and an unsigned integer?

A signed integer is a 32-bit datum that encodes an integer in the range [-2147483648 to 2147483647]. An unsigned integer is a 32-bit datum that encodes a nonnegative integer in the range [0 to 4294967295].
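A minimal C++ sketch (using the fixed-width types from <cstdint>) that prints exactly those two ranges:

#include <cstdint>
#include <iostream>
#include <limits>

int main()
{
    // Signed 32-bit: half of the 2^32 values are reserved for negatives.
    std::cout << "int32_t:  "
              << std::numeric_limits<std::int32_t>::min() << " to "
              << std::numeric_limits<std::int32_t>::max() << '\n';

    // Unsigned 32-bit: all 2^32 values are non-negative.
    std::cout << "uint32_t: "
              << std::numeric_limits<std::uint32_t>::min() << " to "
              << std::numeric_limits<std::uint32_t>::max() << '\n';
}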

What is special about an unsigned integer?

Unsigned integers (often called "uints") are just like integers (whole numbers) but have the property that they don't have a + or - sign associated with them. Thus they are always non-negative (zero or positive). We use uints when we know the value we are counting will always be non-negative.

Is unsigned int better to use?

The Google C++ style guide recommends avoiding unsigned integers except in situations that definitely require it (for example: file formats often store sizes in uint32_t or uint64_t -- no point in wasting a signedness bit that will never be used).


C++ uses unsigned values because it needs the full range. On a 32-bit system, the language should make it possible to have a 4 GB vector, not just a 2 GB one. (The OS might not let you use all 4 GB, but the language itself shouldn't get in your way.)

In .NET, unsigned integers aren't CLS-compliant. You can use them (in some .NET languages), but it limits portability and compatibility. So for the base class library, they only use signed integers.

However, these are both edge cases. For most purposes, a signed int is big enough. So as long as both offer the range you need, you can use either.

One advantage that signed integers sometimes have is that they make it easier to detect underflow. Suppose you're computing an array index, and because of some bad input, or perhaps a logic error in your program, you end up trying to access index -1.

With a signed integer, that is easy to detect. With unsigned, it would wrap around and become UINT_MAX. That makes it much harder to detect the error, because you expected a positive number, and you got a positive number.
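A minimal sketch of that difference; compute_index() here is a hypothetical helper that returns -1 because of a bug:

#include <iostream>

// Hypothetical helper: imagine bad input or a logic error makes it return -1.
int compute_index() { return -1; }

int main()
{
    int index = compute_index();
    if (index < 0)                          // signed: the bad value is easy to spot
        std::cout << "bad index: " << index << '\n';

    unsigned int uindex = compute_index();  // the same -1 wraps to UINT_MAX (4294967295)
    if (uindex < 0)                         // always false: an unsigned value is never negative,
        std::cout << "never reached\n";     // so this check silently does nothing
    std::cout << "uindex = " << uindex << '\n';  // a positive number, just not the one you wanted
}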

So really, it depends. C++ uses unsigned because it needs the range. .NET uses signed because it needs to work with languages which don't have unsigned.

In most cases, both will work, and sometimes, signed may enable your code to detect errors more robustly.


It's natural to use unsigned types for counts and sizes unless we're in some context where they can be negative and yet be meaningful. My guess is that C++ follows this same logic of its elder brother C, in which strlen() returns size_t and malloc() takes size_t.
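For reference, those are the standard C prototypes (from <string.h> and <stdlib.h>):

size_t strlen(const char *s);   /* a length can never be negative */
void *malloc(size_t size);      /* an allocation size can never be negative */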

The problem in C++ (and C) with signed and unsigned integers is that you must know how they are converted to one another when you're mixing the two kinds. Some advocate using signed ints for everything integral to avoid this issue of programmers' ignorance and inattention. But I think programmers must know how to use the tools of their trade (programming languages, compilers, etc.). Sooner or later they'll be bitten by the conversions, if not in what they have written, then in what someone else has. It's unavoidable.
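A classic example of the conversion rules biting (a minimal sketch; the values are arbitrary):

#include <iostream>

int main()
{
    int      i = -1;
    unsigned u = 1;

    // The usual arithmetic conversions turn i into a huge unsigned value
    // before the comparison, so the "surprise" branch is the one that runs.
    if (i < u)
        std::cout << "expected\n";
    else
        std::cout << "surprise\n";
}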

So, know your tools, choose what makes sense in your situation.


There are a few aspects here:

1) Max values: typically the maximum value of a signed number is half that of the corresponding unsigned maximum. For example in C, the max signed short value is 32767 whereas the max unsigned short value is 65535 (because half of the range isn't needed for the -ve numbers). So if you're expecting lengths or counts that are going to be large, an unsigned representation makes more sense.

2) Security: You can browse the net for integer overflow errors, but imagine code such as:

if (length <= 100)
{
  // do something with file
}

... then if 'length' is a signed value, you run the risk of 'length' being a -ve number (through malicious intent, a stray cast, etc.) and the code not performing as you expected. I've seen this on a previous project where a sequence was incremented for each transaction, but when the signed integer we used reached the max signed int value (2147483647) it suddenly went -ve on the next increment and our code couldn't handle it.
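To make the first failure mode concrete, here is a small sketch of how a negative length slips past the check above (process_file() is hypothetical):

#include <iostream>

void process_file(int length)
{
    if (length <= 100)
    {
        // A length of -1 reaches this point, because -1 <= 100 is true.
        // Whatever "do something with file" does next may then misbehave,
        // e.g. -1 handed to a size_t parameter becomes a huge unsigned size.
        std::cout << "processing " << length << " bytes\n";
    }
}

int main()
{
    process_file(-1);   // bad input, a stray cast, or wrap-around past INT_MAX
}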

Just some things to think about, regardless of the underlying language/API considerations.