Why is int rather than unsigned int used for C and C++ for loops?

Tags:

c

for-loop

int

unsigned

This is a rather silly question but why is int commonly used instead of unsigned int when defining a for loop for an array in C or C++?

for(int i=0;i<arraySize;i++){}
for(unsigned int i=0;i<arraySize;i++){}

I recognize the benefits of using int when doing something other than array indexing and the benefits of an iterator when using C++ containers. Is it just because it does not matter when looping through an array? Or should I avoid it altogether and use a different type such as size_t?

870

asked Sep 20 '11 16:09

Elpezmuerto

4 Answers

Using int is more correct from a logical point of view for indexing an array.

The semantics of unsigned in C and C++ don't really mean "not negative"; they are more like "bitmask" or "modulo integer".

To understand why unsigned is not a good type for a "non-negative" number, please consider these totally absurd statements:

Adding a possibly negative integer to a non-negative integer you get a non-negative integer
The difference of two non-negative integers is always a non-negative integer
Multiplying a non-negative integer by a negative integer you get a non-negative result

Obviously none of the above statements makes any sense... but this is indeed how unsigned semantics work in C and C++.
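As a rough illustration of the three statements above, here is a minimal sketch. The function names are made up for this example; the results follow from the standard rule that unsigned arithmetic is performed modulo 2^N, with the int operand converted to unsigned first.

```cpp
#include <limits>

// Hypothetical helpers, one per statement in the list above.

// "non-negative + possibly negative": b is converted to unsigned before adding.
constexpr unsigned add_signed(unsigned a, int b) { return a + b; }

// "difference of two non-negative integers"
constexpr unsigned difference(unsigned a, unsigned b) { return a - b; }

// "non-negative * negative": b is converted to unsigned before multiplying.
constexpr unsigned times_signed(unsigned a, int b) { return a * b; }

constexpr unsigned kMax = std::numeric_limits<unsigned>::max();

// All three results are "non-negative", but hardly what arithmetic would suggest.
static_assert(add_signed(2u, -3) == kMax, "2 + (-3) wraps to the largest unsigned value");
static_assert(difference(2u, 3u) == kMax, "2 - 3 wraps to the largest unsigned value");
static_assert(times_signed(2u, -1) == kMax - 1, "2 * (-1) wraps to the second largest value");
```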

Actually, using an unsigned type for the size of containers is a design mistake of C++, and unfortunately we're now doomed to live with this wrong choice forever (for backward compatibility). You may like the name "unsigned" because it's similar to "non-negative", but the name is irrelevant and what counts is the semantics... and unsigned is very far from "non-negative".

For this reason when coding most loops on vectors my personally preferred form is:

for (int i=0,n=v.size(); i<n; i++) {
    ...
}

(Of course, this assumes that the size of the vector does not change during the iteration and that I actually need the index in the body; otherwise for (auto& x : v)... is better.)

Running away from unsigned as soon as possible and using plain integers has the advantage of avoiding the traps that follow from the design mistake of making size_t unsigned. For example, consider:
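A minimal sketch of that loop form in a complete function might look like the following. The helper name index_weighted_sum is hypothetical, and the explicit static_cast plus the assert are additions for this example: they make the size_t-to-int conversion visible and guard the assumption that the size fits in an int.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: sums i * v[i], so the index is genuinely needed in the body.
long long index_weighted_sum(const std::vector<int>& v) {
    // The conversion to int below is only safe while the size fits in an int.
    assert(v.size() <= static_cast<std::size_t>(std::numeric_limits<int>::max()));

    long long total = 0;
    // Convert the size once, then work with plain signed integers.
    for (int i = 0, n = static_cast<int>(v.size()); i < n; i++) {
        total += static_cast<long long>(i) * v[i];
    }
    return total;
}
```

For an empty vector n is 0 and the body never runs, so no special case is needed.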

// draw lines connecting the dots
for (size_t i=0; i<pts.size()-1; i++) {
    drawLine(pts[i], pts[i+1]);
}

The code above will have problems if the pts vector is empty, because pts.size()-1 wraps around to a huge nonsense number in that case and the loop then reads past the end of the vector. Dealing with expressions where a < b-1 is not the same as a+1 < b even for commonly used values is like dancing in a minefield.
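To make the difference between the two bounds concrete, here is a small sketch. Point, naive_bound and count_segments are names invented for this example, and the counter stands in for the drawLine call from the snippet above.

```cpp
#include <cstddef>
#include <vector>

struct Point { double x, y; };  // hypothetical point type for this sketch

// The loop bound from the snippet above, evaluated on its own.
std::size_t naive_bound(const std::vector<Point>& pts) {
    return pts.size() - 1;  // wraps to the largest size_t value when pts is empty
}

// Counts the segments that would be drawn, using a bound that cannot wrap.
std::size_t count_segments(const std::vector<Point>& pts) {
    std::size_t segments = 0;
    for (std::size_t i = 0; i + 1 < pts.size(); i++) {
        ++segments;  // stands in for drawLine(pts[i], pts[i + 1])
    }
    return segments;
}
```

Writing the condition as i + 1 < pts.size() keeps the arithmetic on the side that cannot go below zero, so an empty vector simply yields zero iterations.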

Historically, the justification for making size_t unsigned was being able to use the extra bit for the values, e.g. being able to have 65535 elements in arrays instead of just 32767 on 16-bit platforms. In my opinion, even at that time the extra cost of this wrong semantic choice was not worth the gain (and if 32767 elements are not enough now, then 65535 won't be enough for long anyway).

Unsigned values are great and very useful, but NOT for representing container sizes or indexes; for sizes and indexes, regular signed integers work much better because the semantics are what you would expect.

Unsigned values are the ideal type when you need the modulo arithmetic property or when you want to work at the bit level.

140

answered Nov 04 '22 15:11

6502

This is a more general phenomenon: often people don't use the correct types for their integers. Modern C has semantic typedefs that are much preferable to the primitive integer types. E.g., everything that is a "size" should just be typed as size_t. If you use the semantic types systematically for your application variables, loop variables come much more easily with these types, too.

And I have seen several bugs that were difficult to detect that came from using int or similar. Code that all of a sudden crashed on large matrices and stuff like that. Just coding correctly with correct types avoids that.
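A sketch of the kind of situation this may refer to, assuming a 32-bit int (common today, but not guaranteed by the standard): the flat index into a 50000 x 50000 matrix no longer fits in an int, while the same computation in size_t is fine. The names flat_index and exceeds_int are hypothetical.

```cpp
#include <cstddef>
#include <limits>

// Row-major offset of element (row, col) in a matrix with `cols` columns.
constexpr std::size_t flat_index(std::size_t row, std::size_t col, std::size_t cols) {
    return row * cols + col;
}

// True when a count or offset cannot be represented in an int on this platform.
constexpr bool exceeds_int(std::size_t value) {
    return value > static_cast<std::size_t>(std::numeric_limits<int>::max());
}
```

With an int index the multiplication row * cols would overflow for the last rows of such a matrix, which is undefined behavior for signed types and may only show up once the input gets large.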

answered Nov 04 '22 16:11

Jens Gustedt

Not much difference. One benefit of int is it being signed. Thus int i < 0 makes sense, while unsigned i < 0 doesn't make much sense.

If indexes are calculated, that may be beneficial (for example, you might get cases where you will never enter a loop if some result is negative).

And yes, it is less to write :-)

answered Nov 04 '22 15:11

littleadv

It's purely laziness and ignorance. You should always use the right types for indices, and unless you have further information that restricts the range of possible indices, size_t is the right type.

Of course if the dimension was read from a single-byte field in a file, then you know it's in the range 0-255, and int would be a perfectly reasonable index type. Likewise, int would be okay if you're looping a fixed number of times, like 0 to 99. But there's still another reason not to use int: if you use i%2 in your loop body to treat even/odd indices differently, i%2 can be more expensive when i is signed than when i is unsigned, because for a negative odd i the result must be -1, so the compiler cannot reduce it to a simple bit test unless it can prove that i is non-negative...
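The semantic difference behind the i%2 remark can be sketched as follows. The function names are invented for this example, and the sketch only shows the differing results; how much the cost differs in practice depends on the compiler and target and is not measured here.

```cpp
// Signed % truncates toward zero (C++11 and later), so odd negatives give -1.
static_assert(-3 % 2 == -1, "remainder takes the sign of the dividend");

// A common mistake with a signed index: this misses negative odd values.
constexpr bool is_odd_naive(int i) { return i % 2 == 1; }

// Correct for every int, whatever its sign.
constexpr bool is_odd_signed(int i) { return i % 2 != 0; }

// For unsigned values, i % 2 is exactly the lowest bit.
constexpr bool is_odd_unsigned(unsigned i) { return (i & 1u) != 0; }

static_assert(!is_odd_naive(-3), "the naive test reports -3 as even");
static_assert(is_odd_signed(-3), "the != 0 form handles negative values");
static_assert(is_odd_unsigned(3u) && !is_odd_unsigned(4u), "bit test on unsigned");
```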

answered Nov 04 '22 16:11

R.. GitHub STOP HELPING ICE
