
Need clarification about unsigned char * in C

Tags:

c

Given the code:

...

int x = 123;

...

unsigned char * xx = (char *) & x;

...

I have xx[0] = 123, xx[1] = 0, xx[2] = 0, etc.

Can someone explain what is happening here? I don't have a great understanding of pointers in general, so the simpler the better.

Thanks

123454321 asked Jan 26 '26 04:01



2 Answers

You're accessing the bytes (chars) of a little-endian int in sequence. The number 123 in an int on a little-endian system will usually be stored as {123,0,0,0}. If your number had been 783 (256 * 3 + 15), it would be stored as {15,3,0,0}.

MooseBoys answered Jan 27 '26 19:01



I'll try to explain all the pieces in ASCII pictures.

int x = 123;

Here, x is the symbol representing a location of type int. Type int is 4 bytes on most current 32-bit and 64-bit platforms, although the C standard only guarantees at least 16 bits, so the exact size depends on the compiler and target. For this discussion, let's assume 32 bits (4 bytes).

Memory on x86 is "little endian", meaning that if a number requires multiple bytes (its value is greater than 255 unsigned, or 127 signed, which are the single-byte limits), then it is stored with the least significant byte at the lowest address. If your number were hexadecimal 0x12345678, it would be stored as:

x: 78        <-- address that `x` represents
   56        <-- x addr + 1 byte
   34        <-- x addr + 2 bytes
   12        <-- x addr + 3 bytes

Your number, decimal 123, is 7B hex, or 0000007B (all 4 bytes shown), so it would look like:

x: 7B        <-- address that `x` represents
   00        <-- x addr + 1 byte
   00        <-- x addr + 2 bytes
   00        <-- x addr + 3 bytes

To make this clearer, let's make up a memory address for x, say, 0x00001000. Then the byte locations would have the following values:

    Address   Value
 x: 00001000  7B
    00001001  00
    00001002  00
    00001003  00

Now you have:

unsigned char * xx = (char *) & x;

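This defines a pointer to an unsigned char (a 1-byte unsigned value, 8 bits on virtually all current machines, ranging 0-255) whose value is the address of your integer x. In other words, the value stored in xx is 0x00001000. Assuming 4-byte pointers, and remembering that the pointer itself is also stored little endian, xx looks like this in memory: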

xx:  00
     10
     00
     00

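The ampersand (&) indicates you want the address of x. Technically, the declaration isn't correct: the cast produces a char *, which is not compatible with the unsigned char * being initialized, so a conforming compiler should diagnose it. The cast should match the declared type: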

unsigned char * xx = (unsigned char *) & x;

So now you have a pointer, or address, stored in the variable xx. That address points to x:

    Address   Value
 x: 00001000  7B      <-- xx points HERE (xx has the value 0x00001000)
    00001001  00
    00001002  00
    00001003  00

The value of xx[0] is what xx points to, offset by 0 elements. Offsets are counted in units of the pointed-to type, and since xx points to unsigned char, which is one byte, each step is one byte. The value of xx[1] is therefore one byte higher in memory, which is the value 00, and so on. Pictorially:

    Address   Value
 x: 00001000  7B      <-- xx[0], or the value at `xx` + 0
    00001001  00      <-- xx[1], or the value at `xx` + 1
    00001002  00      <-- xx[2], or the value at `xx` + 2
    00001003  00      <-- xx[3], or the value at `xx` + 3
lurker answered Jan 27 '26 19:01



