
Difference in results when using int and size_t

I was reading an article on usage of size_t and ptrdiff_t data types here, when I came across this example:


The code:

int A = -2;
unsigned B = 1;
int array[5] = { 1, 2, 3, 4, 5 };
int *ptr = array + 3;
ptr = ptr + (A + B); //Error
printf("%i\n", *ptr);

I am unable to understand a couple of things. First, how can adding a signed and an unsigned number convert the entire result to an unsigned type? If the result is indeed 0xFFFFFFFF of unsigned type, why, on a 32-bit system, is adding it to ptr interpreted as ptr - 1, given that the number is unsigned and its leading 1 bit should not signify a sign?

Second, why is the result different on a 64-bit system?

Can anyone explain this please?

SexyBeast asked Nov 15 '14 21:11




2 Answers

1. I am unable to understand a couple of things. First, how can adding a signed and an unsigned number convert the entire result to an unsigned type?

This is defined by the usual arithmetic conversions, which depend on integer conversion rank.

6.3.1.8 p1: Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.

In this case unsigned has the same rank as int (6.3.1.1 p1), so the "greater or equal" condition holds and the int operand is converted to unsigned.

The conversion of int ( -2 ) to unsigned is performed as described:

6.3.1.3 p2: Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type

2. If the result is indeed 0xFFFFFFFF of unsigned type, why in a 32 bit system, while adding it with ptr, will it be interpreted as ptr-1, given that the number is actually unsigned type and the leading 1 should not signify sign?

This is undefined behavior and should not be relied on: C does not define pointer arithmetic whose result falls outside the array object (or one past its last element).

6.5.6 p8: If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

3. Second, why is the result different in 64 bit system?

(This assumes, as does the picture, that int and unsigned are 4 bytes.)

The result of A + B is the same as described in 1.; that result is then added to the pointer. Since the pointer is 8 bytes, and assuming the addition doesn't overflow (it still could if ptr held a large address, giving the same undefined behavior as in 2.), the result is an address far beyond the array instead of one that wraps around to ptr - 1.

This is undefined behavior because the pointer points way outside of the bounds of the array.

2501 answered Sep 19 '22 13:09


The operands of the expression A + B are subject to usual arithmetic conversion, covered in C11 (n1570) 6.3.1.8 p1:

[...]

Otherwise, the integer promotions [which leave int and unsigned int unchanged] are performed on both operands. Then the following rules are applied to the promoted operands:

  • If both operands have the same type, [...]
  • Otherwise, if both operands have signed integer types or both have unsigned integer types, [...]
  • Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
  • [...]

The types int and unsigned int have the same rank (ibid. 6.3.1.1 p1, 4th bullet); the result of the addition has type unsigned int.

On 32-bit systems, int and pointers usually have the same size (32 bit). From a hardware-centric point of view (and assuming 2's complement), subtracting 1 and adding -1u is the same (addition for signed and unsigned types is the same!), so the access to the array element appears to work.

However, this is undefined behaviour, as array doesn't contain a 0x100000003rd element.

On 64-bit systems, int usually still has 32 bits, but pointers have 64 bits. Thus, from a hardware-centric point of view, there is no wraparound and no equivalence to subtracting 1. The behaviour is undefined in both cases.

To illustrate, say ptr is 0xabcd0123; adding 0xffffffff yields

  abcd0123
+ ffffffff
----------
 1abcd0122
 ^-- The 1 is truncated for a 32-bit calculation, but not for 64-bit.
mafso answered Sep 21 '22 13:09