Can anyone please explain the code below, and also explain the role of the backslash (\) in such situations? And what do \', \", \ooo, \\ and \? mean?
#include <stdio.h>
int main(){
    char a = '\010';
    char y = '010';
    printf("%d,%d",a,y);
    return 0;
}
output: 8,48
'\010' is an octal escape sequence: 10 in octal is 8 in decimal. The char is promoted to an int when it is passed to printf, which explains the first value.
'010' is a multi-character constant and its value is implementation-defined. The C99 draft standard, section 6.4.4.4 Character constants, paragraph 10 says (emphasis mine):
[...]The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined.[...]
If you were using gcc, you would have seen at least this warning:
warning: multi-character character constant [-Wmultichar]
and probably this overflow warning as well:
warning: overflow in implicit constant conversion [-Woverflow]
The value that y obtains is a little more interesting. Since a character constant has an integer value, y cannot simply be taking the first character: the multi-character constant first takes an integer value, which is then converted to char. clang helpfully provides a more detailed warning:
warning: implicit conversion from 'int' to 'char' changes value from 3158320 to 48 [-Wconstant-conversion]
and current versions of gcc produce the same value, as we can see from this simple piece of code:
printf("%d\n",'010');
So where does 3158320 come from? For gcc at least, the documentation for implementation-defined behavior says:
The compiler evaluates a multi-character character constant a character at a time, shifting the previous value left by the number of bits per target character, and then or-ing in the bit-pattern of the new character truncated to the width of a target character. The final bit-pattern is given type int, and is therefore signed, regardless of whether single characters are signed or not (a slight change from versions 3.1 and earlier of GCC). If there are more characters in the constant than would fit in the target int the compiler issues a warning, and the excess leading characters are ignored.
If we perform the operation documented above (assuming an 8-bit char), we get:
48*2^16 + 49*2^8 + 48 = 3158320
^         ^
|         decimal value of ASCII '1'
decimal value of ASCII '0'
gcc converts the int to char by reducing it modulo 2^8, regardless of whether char is signed or unsigned, which effectively leaves us with the last 8 bits, or 48.
The backslash introduces an escape sequence. It is used to remove the special meaning of a reserved character such as ', to specify a special character such as the newline '\n', or, as in this case, to specify a character by its numeric value:
char a = '\010';
defines a character with octal value 10 (base 8), i.e. decimal value 8 (base 10).
char y = '010';
defines a multi-character constant, which has type int and an implementation-defined value, so it does not fit in a char. On common compilers such as gcc, the assignment keeps only the low-order byte, so the last character, '0' (decimal 48), is what ends up in y.