
PHP - unpack as unsigned int

Tags:

php

How do I convert a binary string to an unsigned int?

I'm doing

$id = unpack('V', substr($dir, $mid * 12, 4))[1];
echo $id . '<br/>';

Where V, according to documentation, is

unsigned long (always 32 bit, little endian byte order)

And it prints -992455690. How is this possible?

Update: found this in the documentation:

Note that PHP internally stores integral values as signed. If you unpack a large unsigned long and it is of the same size as PHP internally stored values the result will be a negative number even though unsigned unpacking was specified.

So now the question is: what is the point of the V format if it's identical to the signed version, other than to create confusion?

riv asked Jul 29 '15 11:07



2 Answers

Unfortunately, on a 32-bit PHP build, as yours appears to be, running on a little-endian machine (Intel, for example), the answer seems to be: there is not much point in having a separate V format as opposed to l. Both read the same four bytes in the same order, and both end up stored in a signed 32-bit integer.

On 64-bit builds of PHP, the V format does return the full unsigned range (0 to 4294967295), which is useful when decoding binary strings written by a system or language that supports 4-byte unsigned values. Your question then simply moves up a size: 'What is the point of having a P format on 64-bit machines if it is identical to the signed version?'
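To make the difference between the two kinds of build concrete, here is a minimal sketch that maps the result of unpack('V', ...) to a non-negative number on either one. The helper name toUnsigned32 is hypothetical (it is not a PHP built-in), and on a 32-bit build the result is necessarily a float, because the value does not fit in an int there.

```php
<?php
// Hypothetical helper (not part of PHP): normalize the result of
// unpack('V', ...) so it is non-negative on 32-bit and 64-bit builds.
function toUnsigned32($value)
{
    if ($value >= 0) {
        return $value; // already inside the unsigned range
    }
    // A negative result means the top bit was set. Adding 2^32 restores
    // the unsigned reading; on a 32-bit build this yields a float.
    return $value + 4294967296;
}

// These four bytes decode to the question's -992455690 on a 32-bit build.
$id = unpack('V', "\xF6\x53\xD8\xC4")[1];
echo toUnsigned32($id), "\n"; // 3302511606
```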

Beyond that, I would suggest that the chosen format also tells any developer reading the code what was intended: if the value is turned back into a binary string with pack(), the negative number produces exactly the same bytes as the original unsigned number. A developer who expects to handle integers at the extremes of the allowed range should know this in order to handle overflow and underflow cases (manually) correctly.
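The round trip through pack() can be checked directly. This is a minimal sketch using the value from the question; the variable names are illustrative only.

```php
<?php
// The negative number printed in the question and its unsigned reading
// share the same 32-bit pattern, so pack('V', ...) emits the same bytes.
$signed = -992455690;
$bytes  = pack('V', $signed);   // low 32 bits, little-endian

echo bin2hex($bytes), "\n";     // f653d8c4

// Whatever unpack('V') returns on this build (negative on 32-bit,
// positive on 64-bit), packing it again reproduces the original bytes.
$decoded = unpack('V', $bytes)[1];
var_dump(pack('V', $decoded) === $bytes); // bool(true)
```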

Additionally, and possibly worth noting, the pack manual lists only one code for each signed integer size, and that code always uses machine byte order, whereas the unsigned formats let you specify big-endian or little-endian order in addition to machine order. So if you wanted to decode a signed 16-bit value written by a big-endian system (ARM can be configured this way) while running PHP on Intel (little-endian), without manipulating the byte order yourself, you would first decode it as unsigned 16-bit big-endian (n) and then manually subtract 2^16 if the result was 2^15 or greater.
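For the signed 16-bit big-endian case, a sketch could look like this. The helper name unpackInt16BE is hypothetical. Note that the correction is 2^16, applied whenever the unsigned value is 2^15 or greater (that is, whenever the top bit is set).

```php
<?php
// Hypothetical helper: read a signed 16-bit big-endian integer.
// unpack() offers 'n' (unsigned 16-bit, big-endian) but no signed code
// with a fixed byte order, so the sign is restored by hand.
function unpackInt16BE(string $bytes): int
{
    $unsigned = unpack('n', $bytes)[1]; // 0 .. 65535
    // Values with the top bit set (>= 2^15) represent negatives:
    // subtract 2^16 to get the two's-complement reading.
    return $unsigned >= 0x8000 ? $unsigned - 0x10000 : $unsigned;
}

echo unpackInt16BE("\xFF\xFE"), "\n"; // -2
echo unpackInt16BE("\x7F\xFF"), "\n"; // 32767
echo unpackInt16BE("\x80\x00"), "\n"; // -32768
```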

Benjamin answered Oct 27 '22 11:10



I think unpack should work for what you need. If it does not, try the code below. Because the value is in little-endian order, I use ord to get the numeric value of each byte of the 32-bit integer, then combine them into the final integer.

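$chars = substr($dir, $mid * 12, 4);
// Parenthesize the shifts: in PHP, + binds tighter than <<.
// The top byte is multiplied (by 2^24) rather than shifted so that a 32-bit build promotes to float instead of wrapping negative.
return ord($chars[0]) + (ord($chars[1]) << 8) + (ord($chars[2]) << 16) + ord($chars[3]) * 0x1000000;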
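
A self-contained sketch of the byte-by-byte approach, for comparison with unpack('V'). The function name readUint32LE and the sample bytes are assumptions for illustration; $dir and $mid from the question are not reproduced here.

```php
<?php
// Hypothetical helper: decode 4 bytes at $offset as an unsigned
// 32-bit little-endian integer without using unpack().
function readUint32LE(string $data, int $offset = 0)
{
    $b = substr($data, $offset, 4);
    if (strlen($b) !== 4) {
        throw new InvalidArgumentException('need 4 bytes at offset ' . $offset);
    }
    // Each shift is parenthesized because + binds tighter than << in PHP.
    // The top byte is multiplied instead of shifted so that a 32-bit
    // build promotes to float rather than wrapping to a negative int.
    return ord($b[0])
        + (ord($b[1]) << 8)
        + (ord($b[2]) << 16)
        + ord($b[3]) * 0x1000000;
}

// These four bytes decode to the question's -992455690 on a 32-bit build.
echo readUint32LE("\xF6\x53\xD8\xC4"), "\n"; // 3302511606
```

On a 64-bit build the return value is an int; on a 32-bit build it is a float whenever the top bit of the last byte is set.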

LF00 answered Oct 27 '22 11:10
