
Unexpected sign extension of int32 or 32bit pointer when converted to uint64

I compiled this code using Visual Studio 2010 (cl.exe /W4) as a C file:

#include <stdio.h>

int main( int argc, char *argv[] )
{
    unsigned __int64 a = 0x00000000FFFFFFFF;
    void *orig = (void *)0xFFFFFFFF;              /* 32-bit pointer, all bits set */
    unsigned __int64 b = (unsigned __int64)orig;  /* pointer widened to 64 bits */
    if( a != b )
        printf( " problem\ta: %016I64X\tb: %016I64X\n", a, b );
    return 0;
}

There are no warnings and the result is:

problem a: 00000000FFFFFFFF b: FFFFFFFFFFFFFFFF

I suppose int orig = (int)0xFFFFFFFF would be less controversial, as it involves no conversion between a pointer and an integer. However, the result would be the same.

Can someone explain to me where in the C standard it is covered that orig is sign extended from 0xFFFFFFFF to 0xFFFFFFFFFFFFFFFF?

I had assumed that (unsigned __int64)orig would become 0x00000000FFFFFFFF. It appears that the conversion is first to the signed __int64 type and then it becomes unsigned?

EDIT: This question has been answered: pointers are sign-extended, which is why I see this behavior in gcc and msvc. However, I don't understand why (unsigned __int64)(int)0xF0000000 sign-extends to 0xFFFFFFFFF0000000, while (unsigned __int64)0xF0000000 does not and instead gives what I want, which is 0x00000000F0000000.

EDIT: An answer to the above edit. (unsigned __int64)(int)0xF0000000 is sign-extended because, as noted by user R:

Conversion of a signed type (or any type) to an unsigned type always takes place via reduction modulo one plus the max value of the destination type.

In (unsigned __int64)0xF0000000, on the other hand, the constant 0xF0000000 starts off with an unsigned integer type because its value cannot fit in an int. That already unsigned value is then converted to unsigned __int64, so nothing is sign-extended.
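The two conversions can be compared side by side. This is a minimal sketch, assuming the fixed-width types from <stdint.h> in place of the MSVC-specific __int64; the helper names are made up for illustration. The signed case should be fed a negative int32_t directly rather than the result of (int)0xF0000000, because that cast is itself implementation-defined.

```c
#include <stdint.h>

/* Hypothetical helper: widen a signed 32-bit value to unsigned 64-bit.
   A negative value is reduced modulo 2^64, which looks like sign extension. */
uint64_t widen_from_signed(int32_t v)
{
    return (uint64_t)v;
}

/* Hypothetical helper: widen an unsigned 32-bit value.
   The value is always representable, so it is unchanged (zero extension). */
uint64_t widen_from_unsigned(uint32_t v)
{
    return (uint64_t)v;
}
```

Calling widen_from_signed(-268435456) yields 0xFFFFFFFFF0000000, while widen_from_unsigned(0xF0000000u) yields 0x00000000F0000000: the same 32 bits, but a different mathematical value going in.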

So my takeaway is this: when a function returns a 32-bit or 64-bit pointer as an unsigned __int64 and I need to compare it against a pointer in my 32-bit application, I must first convert the 32-bit pointer to an unsigned integer type before widening it to unsigned __int64. The resulting code looks like this (but, you know, better):

unsigned __int64 functionidontcontrol( char * );
unsigned __int64 x;
void *y = thisisa32bitaddress;
x = functionidontcontrol(str);
if( x != (uintptr_t)y )   /* uintptr_t is unsigned, so y is zero-extended */
{
    /* the addresses differ */
}
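A slightly more complete version of the same idea might look like the sketch below. It assumes uintptr_t is available (the type is optional in C99, though common compilers provide it) and uses uint64_t instead of unsigned __int64; pointer_as_u64 and same_address are hypothetical names, not part of any API mentioned above.

```c
#include <stdint.h>

/* Hypothetical helper: turn a pointer into a 64-bit value without sign
   extension. uintptr_t is unsigned, so widening it to uint64_t zero-extends
   when pointers are 32 bits wide and changes nothing when they are 64 bits. */
uint64_t pointer_as_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* Hypothetical helper: compare a 64-bit address obtained from an external
   function with a local pointer. Returns nonzero when they match. */
int same_address(uint64_t addr, const void *p)
{
    return addr == pointer_as_u64(p);
}
```

The pointer-to-uintptr_t step is still implementation-defined, but the widening that follows is an ordinary unsigned integer conversion, so it no longer depends on how the compiler extends pointers.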



EDIT again: Here is what I found in the C99 standard: 6.3.1.3 Signed and unsigned integers

  • 1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
  • 2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. (See footnote 49.)
  • 3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
  • Footnote 49: The rules describe arithmetic on the mathematical value, not the value of a given type of expression.
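To make paragraph 2 concrete, the rule can be applied by hand and compared with what a plain cast produces. This is only a sketch for one pair of types (int32_t to uint64_t), and convert_by_rule is an invented name.

```c
#include <stdint.h>

/* Hypothetical helper: apply C99 6.3.1.3 by hand for a 32-bit signed
   source and a 64-bit unsigned destination. */
uint64_t convert_by_rule(int32_t v)
{
    if (v >= 0)
        return (uint64_t)v;   /* paragraph 1: representable, so unchanged */

    /* Paragraph 2: add 2^64 (one more than UINT64_MAX) once.
       2^64 does not fit in uint64_t, so compute (UINT64_MAX - |v|) + 1. */
    uint64_t magnitude = (uint64_t)(-(int64_t)v);
    return (UINT64_MAX - magnitude) + 1;
}
```

For v = -268435456 the result is 2^64 - 0x10000000 = 0xFFFFFFFFF0000000, the same bits as (uint64_t)v. The apparent sign extension is just the mathematical value being wrapped into range.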
loop asked Feb 23 '23 19:02


2 Answers

Converting a pointer to or from an integer is implementation-defined.

The gcc documentation describes how gcc does it: it sign-extends if the integer type is larger than the pointer type. (This happens regardless of whether the integer type is signed or unsigned, simply because that's how gcc decided to implement it.)

Presumably msvc behaves similarly. Edit: the closest thing I can find on MSDN (two linked pages) suggests that converting 32-bit pointers to 64-bit also sign-extends.
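The difference between the two ways of widening can be simulated on any platform by treating a uint32_t as the bits of a 32-bit pointer. This is only a sketch: the function names are invented, and the bit manipulation imitates the result described above rather than showing what a compiler literally emits.

```c
#include <stdint.h>

/* Hypothetical helper: imitate a compiler that sign-extends a 32-bit
   pointer value when converting it to a 64-bit integer. */
uint64_t widen_addr_sign_extended(uint32_t addr_bits)
{
    uint64_t wide = addr_bits;               /* low 32 bits, upper bits zero */
    if (addr_bits & UINT32_C(0x80000000))    /* top bit set: copy it upward */
        wide |= UINT64_C(0xFFFFFFFF00000000);
    return wide;
}

/* Hypothetical helper: imitate zero extension, which is what converting
   through an unsigned type such as uintptr_t gives. */
uint64_t widen_addr_zero_extended(uint32_t addr_bits)
{
    return (uint64_t)addr_bits;
}
```

With addr_bits = 0xFFFFFFFF, the first function returns 0xFFFFFFFFFFFFFFFF (the b value in the question) and the second returns 0x00000000FFFFFFFF (the a value). Addresses below 0x80000000 come out identical either way, which is why the problem only shows up for pointers in the upper half of the 32-bit address space.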

nos answered Apr 27 '23 01:04


From the C99 standard (§6.3.2.3/6):

Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.

So you'll need to check your compiler's documentation to see how it defines this conversion.

Mat answered Apr 27 '23 00:04