
C Question: why does char appear to occupy 4 bytes in memory?

I wrote a small program to check how many bytes a char occupies in memory, and it appears that a char actually takes up 4 bytes. I understand this is mostly because of word alignment, but then I don't see the advantage of a char being only 1 byte. Why not use 4 bytes for char?

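#include <stdio.h>

int main(void)
{
  int a;
  char b;
  int c;
  a = 0;
  b = 'b';
  c = 1;
  /* %p expects a void *, so cast each address explicitly */
  printf("%p\n", (void *)&a);
  printf("%p\n", (void *)&b);
  printf("%p\n", (void *)&c);
  return 0;
}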

Output: 0x7fff91a15c58 0x7fff91a15c5f 0x7fff91a15c54

Update: I don't believe that malloc will allocate only 1 byte for a char, even when sizeof(char) is passed as the argument, because malloc keeps a header for each block and makes sure that header is word aligned. Any comments?

Update2: If you are asked to use memory efficiently, without padding, is the only way to create a special memory allocator? Or is it possible to disable padding?

Boolean asked Mar 01 '11 04:03




2 Answers

Alignment

Let's look at your output for printing the addresses of a, b, and c:

Output: 0x7fff91a15c58 0x7fff91a15c5f 0x7fff91a15c54

Notice that a and c sit next to each other, each on a 4-byte boundary, while b does not start on one. Here is what it looks like in memory, with each row holding 4 bytes and the rightmost column being the lowest address (offset 0):

| b | x | x | x | 0x5c5c
-----------------
| a | a | a | a | 0x5c58 
-----------------
| c | c | c | c | 0x5c54 

This is the compiler's way of saving space while keeping things word aligned. Even though the address of b is 0x5c5f, b isn't actually taking up 4 bytes: it occupies 1 byte, and the other three bytes in that word (marked x) are unused padding. If you take your same code and add a short d, you'll see this:

| b | x | d | d | 0x5c5c
-----------------
| a | a | a | a | 0x5c58 
-----------------
| c | c | c | c | 0x5c54 

Here the address of d is 0x5c5c. Shorts are aligned to two bytes, so you will still have one byte of unused memory between b and d. Add in another char e, and you'll get:

| b | e | d | d | 0x5c5c
-----------------
| a | a | a | a | 0x5c58 
-----------------
| c | c | c | c | 0x5c54 

Here's my code and the output. Please note that my addresses will differ slightly, but it's the least significant digit in the address that we're really concerned about anyway:

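#include <stdio.h>

int main(void)
{
  int a;
  char b;
  int c;
  short d;
  char e;
  a = 0;
  b = 'b';
  c = 1;
  /* d and e are never read; only their addresses matter here */
  printf("%p\n", (void *)&a);
  printf("%p\n", (void *)&b);
  printf("%p\n", (void *)&c);
  printf("%p\n", (void *)&d);
  printf("%p\n", (void *)&e);
  return 0;
}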

$ ./a.out 
0xbfa0bde8
0xbfa0bdef
0xbfa0bde4
0xbfa0bdec
0xbfa0bdee

Malloc

The man page of malloc says that it "allocates size bytes and returns a pointer to the allocated memory." It also says that it will "return a pointer to the allocated memory, which is suitably aligned for any kind of variable". From my testing, repeated calls to malloc(1) are returning addresses in "double word" increments, but I wouldn't count on this.

Caveats

My code was run on an x86 32-bit machine. Other machines might vary slightly, and some compilers may optimize in different ways, but the ideas should hold true.

Jeff answered Nov 04 '22 01:11


You have int, char, int

See the image here under "Why Restrict Byte Alignment?" http://www.eventhelix.com/realtimemantra/ByteAlignmentAndOrdering.htm

        Byte 0  Byte 1  Byte 2  Byte 3
0x1000
0x1004  X0      X1      X2      X3
0x1008
0x100C          Y0      Y1      Y2

If the compiler had stored them back to back in 4-byte, 1-byte and 4-byte form, int c would straddle a word boundary: it would have taken 2 read cycles to retrieve int c, plus some bit-shifting to get the actual value of c aligned properly for use as an int.

RichardTheKiwi answered Nov 03 '22 23:11