
Type for array indices: signed/unsigned integer advantages

Tags: c++, c

In C++, the default type for array indices is size_t, which is a 64-bit unsigned integer on most x86-64 platforms. I am in the process of building my own std::vector-like class for my High Performance Computing library (one of the main reasons is that I want this class to be able to take ownership of a pointer, something std::vector does not offer). For the type of the array index, I am thinking of using either:

  • size_t
  • my own index_t that would be a signed int or a long signed int depending on my program

The advantages of using a signed integer over an unsigned one are numerous, such as

for (index_t i = 0; i < v.size() - 1; ++i)

works like it is supposed to. (With an unsigned integer, v.size() - 1 wraps around to a huge value when v is of size 0, so the loop runs far out of bounds.)

for (index_t i = v.size() - 1; i >= 0; --i)

works like it is supposed to (with an unsigned integer, i >= 0 is always true and the loop never terminates), and there are many other advantages. In terms of performance, signed even seems to be a little bit better, as

a + 1 < b + 1

can be reduced to a < b with signed integers (signed overflow is undefined behavior), but not with unsigned integers, where a + 1 may wrap around to 0. The only performance advantage of unsigned integers seems to be that a /= 2 can be reduced to a single shift operation, which is not the case for signed integers.

I am wondering why the C++ committee decided to use an unsigned integer for size_t, as it seems to introduce a lot of pain and only a few advantages.

InsideLoop asked Oct 30 '14 10:10



2 Answers

The motivation for using an unsigned type as index or size in the standard is based on constraints only relevant to 16-bit machines. The natural type for integral values in C++ is int, and that's what should probably be used; as you've noticed, trying to use unsigned types as numerical values in C++ is fraught with problems. If you're worried about the sizes being so big that they don't fit into an int, ptrdiff_t would be appropriate; this is, after all, the type of the result of subtracting pointers or iterators. (The fact that v.size() has a different type than v.end() - v.begin() is really a design flaw in the standard library.)

James Kanze answered Nov 09 '22 23:11


For me, unsigned sizes always make the most sense: since you can't have -32 elements in an array, it is very scary to treat the size/length as a signed quantity all the time.

The corner cases you mention can be coded around. For the first case, for example, you can skip the loop entirely if v is empty (a case which doesn't look all that common to begin with: iterating over all elements except the last?).

unwind answered Nov 09 '22 23:11