
C/C++ sizeof operator: Why does sizeof( 'a' ) return different values? [duplicate]

Tags: c++, c, memory

Possible Duplicate:
Size of character ('a') in C/C++

I am a beginner at C, and was confused by this.

C: I tried printing sizeof( 'a' ) in C using the "%zu" format specifier, and it prints 4.

C++: Printing sizeof( 'a' ) in C++, both with cout and with printf (using the format above), printed 1.

I believe the correct value should be 1, since 'a' will be taken as a char. Why does it return 4 in C? Does the sizeof operator behave differently in the two languages? If so, what's the difference, and why does it return a different value? I used the gcc compilers in both cases.

Khushman Patel asked May 23 '12 03:05




2 Answers

In C, 'a' is a character constant, which has type int, so you get sizeof(int), which is 4 here. In C++ it has type char, so you get 1. This is a duplicate of the question here:

Size of character ('a') in C/C++

BlackJack answered Sep 30 '22 05:09


In C, a character literal (constant) has type int. So, consider the following program:

#include <stdio.h>

int main(void)
{
  printf("%zu\n", sizeof('a'));
  printf("%zu\n", sizeof('ab'));
  printf("%zu\n", sizeof('abc'));
  printf("%zu\n", sizeof('abcd'));

  printf("%u\n", 'a');
  printf("%u\n", 'ab');
  printf("%u\n", 'abc');
  printf("%u\n", 'abcd');

  printf("%x\n", 'a');
  printf("%x\n", 'ab');
  printf("%x\n", 'abc');
  printf("%x\n", 'abcd');

  printf("%c\n", 'a');
  printf("%c\n", 'ab');
  printf("%c\n", 'abc');
  printf("%c\n", 'abcd');
}

The first four statements all treat the literal as a single character constant, and each prints 4 == sizeof(int), at least on gcc (Ubuntu 4.4.3-4ubuntu5.1) 4.4.3. Note that this compiler prints a warning for each multi-character constant in the above program:

warning: multi-character character constant

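Basically, with GCC a character literal specifies the bytes making up an int, from left to right, higher-order byte first. The missing leading bytes are filled with 0. (The C standard leaves the value of a multi-character constant implementation-defined, so other compilers may lay it out differently.) So, on my machine the second and third groups of printf statements print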

97
24930
6382179
1633837924
61
6162
616263
61626364

In the hexadecimal output you can see the layout of the characters in the literal (the ASCII codes from left to right): in 'abcd', the 'a' is mapped to the highest-order byte (0x61).

Finally, the fourth group prints:

a
b
c
d

i.e. the character literals are passed to printf as ints, and %c converts that int to unsigned char, so only the lowest byte is printed.

C++ behaves in a similar way, but single-character literals have type char, not int. The program

#include <iostream>

using namespace std;

int main()
{
  cout << sizeof('a') << endl;
  cout << sizeof('ab') << endl;
  cout << sizeof('abc') << endl;
  cout << sizeof('abcd') << endl;

  cout << 'a' << endl;
  cout << 'ab' << endl;
  cout << 'abc' << endl;
  cout << 'abcd' << endl;
}

will compile using GCC and give a similar warning. Its output differs from that of the C program:

1
4
4
4
a
24930
6382179
1633837924

So single-character literals have type char, while multi-character literals have type int.

IMPORTANT NOTE

I ran my tests on a 32-bit Linux system on which an int has 4 bytes. It would be interesting to see what happens on other systems, e.g. on a 64-bit system.

EDIT

Fixed answer (thanks for the hint): character literals have type int in C; they are not cast to int.

Giorgio answered Sep 30 '22 07:09