I wrote a small program to check how many bytes a char occupies in memory, and it appears that a char actually occupies 4 bytes. I understand this is mostly because of word alignment, and I don't see the advantage of a char being only 1 byte. Why not use 4 bytes for a char?
#include <stdio.h>

int main(void)
{
    int a;
    char b;
    int c;

    a = 0;
    b = 'b';
    c = 1;
    /* %p expects a void *, so cast each address */
    printf("%p\n", (void *)&a);
    printf("%p\n", (void *)&b);
    printf("%p\n", (void *)&c);
    return 0;
}
Output: 0x7fff91a15c58 0x7fff91a15c5f 0x7fff91a15c54
Update: I don't believe that malloc will allocate only 1 byte for a char, even when sizeof(char) is passed as the argument, because malloc keeps a header for each block and makes sure that header is word aligned. Any comments?
Update2: If you are asked to use memory efficiently, without padding, is the only way to create a special memory allocator? Or is it possible to disable padding?
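Because in C a character constant such as 'b' has type int, sizeof('b') gives sizeof(int), which is typically 4 bytes, even though a char object itself is 1 byte. A float is also typically 4 bytes.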
The char type takes 1 byte of memory (8 bits on virtually all current platforms), so it can represent 2^8 = 256 distinct values.
To store a character value, the computer allocates 1 byte (8 bits) of memory. For example, the character code 65 ('A' in ASCII) is converted to binary, which is (1000001)2, because the computer only works with the binary number system. Padded with a leading zero, it is stored in the 8-bit byte as 01000001.
Let's look at your output for printing the addresses of a, b, and c:
Output: 0x7fff91a15c58 0x7fff91a15c5f 0x7fff91a15c54
Notice that b isn't on a 4-byte boundary, and that a and c are right next to each other? Here is what it looks like in memory, with each row taking up 4 bytes, and the rightmost column being the 0th byte of the row:
| b | x | x | x | 0x5c5c
-----------------
| a | a | a | a | 0x5c58
-----------------
| c | c | c | c | 0x5c54
This is the compiler's way of saving space while keeping things word aligned. Even though b sits alone in its 4-byte word, at address 0x5c5f, it isn't actually taking up 4 bytes: b occupies one byte, and the other three (marked x) are unused padding. If you take your same code and add a short d, you'll see this:
| b | x | d | d | 0x5c5c
-----------------
| a | a | a | a | 0x5c58
-----------------
| c | c | c | c | 0x5c54
Here the address of d is 0x5c5c. Shorts are aligned to two bytes, so you will still have one byte of unused memory between d and b. Add in another char e, and you'll get:
| b | e | d | d | 0x5c5c
-----------------
| a | a | a | a | 0x5c58
-----------------
| c | c | c | c | 0x5c54
Here's my code and the output. Please note that my addresses will differ slightly, but it's the least significant digit in the address that we're really concerned about anyway:
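#include <stdio.h>

int main(void)
{
    int a;
    char b;
    int c;
    short d;
    char e;

    a = 0;
    b = 'b';
    c = 1;
    /* d and e are never read; only their addresses matter here */
    printf("%p\n", (void *)&a);
    printf("%p\n", (void *)&b);
    printf("%p\n", (void *)&c);
    printf("%p\n", (void *)&d);
    printf("%p\n", (void *)&e);
    return 0;
}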
$ ./a.out
0xbfa0bde8
0xbfa0bdef
0xbfa0bde4
0xbfa0bdec
0xbfa0bdee
The man page of malloc says that it "allocates size bytes and returns a pointer to the allocated memory." It also says that it will "return a pointer to the allocated memory, which is suitably aligned for any kind of variable". From my testing, repeated calls to malloc(1) are returning addresses in "double word" increments, but I wouldn't count on this.
My code was run on an x86 32-bit machine. Other machines might vary slightly, and some compilers may optimize in different ways, but the ideas should hold true.
You have int, char, int, declared in that order.
See the image under "Why Restrict Byte Alignment?" at http://www.eventhelix.com/realtimemantra/ByteAlignmentAndOrdering.htm
Byte 0 Byte 1 Byte 2 Byte 3
0x1000
0x1004 X0 X1 X2 X3
0x1008
0x100C Y0 Y1 Y2
If the compiler had stored them packed in 4-byte, 1-byte and 4-byte form, int c would straddle two words, so it would have taken 2 memory accesses to retrieve int c, plus some bit-shifting to get the actual value of c aligned properly for use as an int.