
What is the correct type for array indexes in C?

What type should be used for array indexes in C99? It has to work on LP32, ILP32, ILP64, LP64, LLP64 and more. It doesn't have to be a C89 type.

I have found 5 candidates:

  • size_t
  • ptrdiff_t
  • intptr_t / uintptr_t
  • int_fast*_t / uint_fast*_t
  • int_least*_t / uint_least*_t

Here is some simple code to better illustrate the problem. What is the best type for i and j in these two particular loops? If there is a good reason, two different types are fine too.

    for (i=0; i<imax; i++) {
            do_something(a[i]);
    }
    /* jmin can be less than 0 */
    for (j=jmin; j<jmax; j++) {
            do_something(a[j]);
    }

P.S. In the first version of the question I had forgotten about negative indexes.

P.P.S. I am not going to write a C99 compiler. However, any answer from a compiler programmer would be very valuable to me.

Similar question:

  • size_t vs. uintptr_t
    The context of this question is different, though.

Michas asked Jul 04 '10 13:07




2 Answers

I think you should use ptrdiff_t, for the following reasons:

  • Indices can be negative. Therefore for a general statement, all unsigned types, including size_t, are unsuitable.
  • The type of p2 - p1 is ptrdiff_t. If i == p2 - p1, then you should be able to get p2 back by p2 == p1 + i. Notice that *(p + i) is equivalent to p[i].
  • As another indication for this "general index type", the type of the index that's used by overload resolution in C++ when the builtin operator[] (for example, on a pointer) competes against a user-provided operator[] (for example, vector's) is exactly that (http://eel.is/c++draft/over.built#16):

    For every cv-qualified or cv-unqualified object type T there exist candidate operator functions of the form

    T*      operator+(T*, std::ptrdiff_t);
    T&      operator[](T*, std::ptrdiff_t);
    T*      operator-(T*, std::ptrdiff_t);
    T*      operator+(std::ptrdiff_t, T*);
    T&      operator[](std::ptrdiff_t, T*);
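To make the first two points concrete, here is a minimal C sketch. It is only an illustration: the function names index_of and at_offset are made up here and do not come from any standard. It shows that a pointer difference has type ptrdiff_t, that adding it back recovers the original pointer, and that an index can legitimately be negative when the base pointer points into the middle of an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns i such that p1 + i == p2.
 * Both pointers must point into (or one past the end of) the same array. */
ptrdiff_t index_of(const int *p1, const int *p2)
{
    ptrdiff_t i = p2 - p1;   /* a pointer difference has type ptrdiff_t */
    assert(p1 + i == p2);    /* adding the difference back recovers p2 */
    return i;
}

/* Hypothetical helper: p[i] is *(p + i), so i may be negative as long as
 * p + i still points at an element of the same array. */
int at_offset(const int *p, ptrdiff_t i)
{
    return p[i];
}
```

With int a[5] and mid = a + 2, index_of(mid, a) is -2 and at_offset(mid, -2) reads a[0].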

EDIT: If you have a really big array or a pointer to a really big memory portion, then my "general index type" doesn't cut it, as it then isn't guaranteed that you can subtract the first element's address from the last element's address. In that case, @Ciro's answer should be used: https://stackoverflow.com/a/31090426/34509. Personally I try to avoid using unsigned types because of their inability to represent negative edge cases (loop end-values when iterating backwards, for example), but this is a kind of religious debate (I'm not alone in that camp, though). In cases where using an unsigned type is required, I must put my religion aside, of course.
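The backwards-iteration edge case mentioned above can be sketched like this. The two function names are hypothetical, and the "not found" conventions (-1 for the signed version, n for the unsigned one) are just assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Signed index: the loop can run down to -1, which also serves as the
 * "not found" result. */
ptrdiff_t find_last_signed(const int *a, ptrdiff_t n, int value)
{
    ptrdiff_t i;
    assert(n >= 0);              /* a length is expected to be non-negative */
    for (i = n - 1; i >= 0; i--)
        if (a[i] == value)
            break;
    return i;                    /* -1 when the value is absent */
}

/* Unsigned index: `i >= 0` is always true for size_t, so the test has to
 * happen before the decrement. Returns n when the value is absent. */
size_t find_last_unsigned(const int *a, size_t n, int value)
{
    for (size_t i = n; i-- > 0; )
        if (a[i] == value)
            return i;
    return n;
}
```

Both versions visit the same elements in the same order; the difference is only in how the end condition and the "not found" result have to be written.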

Johannes Schaub - litb answered Oct 02 '22 15:10


I almost always use size_t for array indices/loop counters. Sure, there are some special instances where you may want signed offsets, but in general using a signed type has a lot of problems:

The biggest risk is that if you're passed a huge size/offset by a caller treating things as unsigned (or if you read it from a wrongly-trusted file), you may interpret it as a negative number and fail to catch that it's out of bounds. For instance, with a signed offset that has become negative, if (offset<size) array[offset]=foo; else error(); will write somewhere it shouldn't.
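A small sketch of that bounds check, under the assumption that the only test performed is offset < size. The function names are made up, and the functions only evaluate the check; they never perform the write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Signed check: a negative offset passes `offset < size`, so an
 * out-of-bounds write would not be caught. */
bool in_bounds_signed(ptrdiff_t offset, ptrdiff_t size)
{
    assert(size >= 0);           /* assumed: sizes are never negative */
    return offset < size;
}

/* Unsigned check: a bit pattern that would be negative as a signed value
 * is a huge unsigned value, so the same comparison rejects it. */
bool in_bounds_unsigned(size_t offset, size_t size)
{
    return offset < size;
}
```

For a buffer of size 10, in_bounds_signed(-1, 10) is true, while in_bounds_unsigned((size_t)-1, 10) is false. A signed version can of course be made safe by also testing offset >= 0; the point is that the single comparison is enough only in the unsigned case.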

Another problem is the possibility of undefined behavior with signed integer overflow. Whether you use unsigned or signed arithmetic, there are overflow issues to be aware of and check for, but personally I find the unsigned behavior a lot easier to deal with.
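As a rough illustration of the difference, here is one way overflow checks might be written for each kind of type. The helper names add_size and add_int are hypothetical; the checks themselves rely only on guaranteed behavior (unsigned wraparound is well defined, signed overflow is not).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Unsigned: wraparound is well defined, so overflow can be detected
 * after the addition has been done. */
bool add_size(size_t a, size_t b, size_t *out)
{
    size_t sum = a + b;          /* wraps around on overflow */
    assert(out != NULL);
    if (sum < a)
        return false;            /* the addition wrapped */
    *out = sum;
    return true;
}

/* Signed: overflow is undefined behavior, so the check has to come
 * before the addition. */
bool add_int(int a, int b, int *out)
{
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;            /* a + b would overflow */
    *out = a + b;
    return true;
}
```

Both helpers return false instead of producing a wrong result, but the unsigned check is a single comparison, which is roughly what "easier to deal with" means here.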

Yet another reason to use unsigned arithmetic (in general): sometimes I'm using indices as offsets into a bit array and I want to use %8 and /8 or %32 and /32. With signed types, these have to round toward zero for negative values, so the compiler cannot reduce them to a plain mask and shift; it must emit a real division or extra fix-up code. With unsigned types, the expected bitwise-and/bitshift operations can be generated.
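A minimal bit-array sketch along those lines, assuming 8 bits are stored per unsigned char element. The names bit_set and bit_test are made up for this example, and whether a given compiler actually emits a shift and a mask for i / 8 and i % 8 depends on the compiler and its settings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sets bit i of the bit array, storing 8 bits per element. */
void bit_set(unsigned char *bits, size_t i)
{
    assert(bits != NULL);
    /* With an unsigned i, i / 8 and i % 8 are equivalent to i >> 3 and i & 7. */
    bits[i / 8] |= (unsigned char)(1u << (i % 8));
}

/* Returns whether bit i of the bit array is set. */
bool bit_test(const unsigned char *bits, size_t i)
{
    assert(bits != NULL);
    return ((bits[i / 8] >> (i % 8)) & 1u) != 0;
}
```

For example, bit_set(bits, 10) touches bits[1] and sets its bit 2.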

R.. GitHub STOP HELPING ICE answered Oct 02 '22 15:10