Can anyone please explain the usage of the character constants \000 and \xhh, i.e. octal and hexadecimal numbers in a character constant?
A "character constant" is formed by enclosing a single character from the representable character set within single quotation marks (' '). Character constants are used to represent characters in the execution character set.
String literals. A string literal, also known as a string constant or constant string, is a string of characters enclosed in double quotes, such as "To err is human - To really foul things up requires a computer." String literals are stored in C as an array of chars, terminated by a null byte.
W8098 Multi-character character constant (C++). This warning is issued when the compiler detects a multi-character integer constant, such as: int foo = 'abcd'; The problem with this construct is that the byte order of the characters is implementation dependent.
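If the goal of 'abcd' is a four-byte tag, one way to sidestep the implementation-dependent byte order is to assemble the integer explicitly. This is only a sketch: pack4 and check_pack4 are hypothetical names, not part of any compiler or library, and the choice of putting the first byte in the most significant position is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: packs four bytes into one 32-bit value,
   first argument in the most significant byte. The result does not
   depend on how the compiler treats multi-character constants. */
static uint32_t pack4(unsigned char a, unsigned char b,
                      unsigned char c, unsigned char d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

static void check_pack4(void)
{
    /* 0x61..0x64 are the ASCII codes of 'a'..'d'. */
    assert(pack4(0x61, 0x62, 0x63, 0x64) == 0x61626364u);
}
```

Calling pack4('a', 'b', 'c', 'd') gives the same value on every compiler that uses the same execution character set, which 'abcd' does not guarantee.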
In C, strings are terminated by a character with the value zero (0). This could be written like this:
char zero = 0;
but this doesn't work inside strings. String and character literals have a special syntax for this, where a backslash introduces an escape sequence and is followed by one or more characters that select the value.
One such sequence is "backslash zero", that simply means a character with the value zero. Thus, you can write things like this:
char hard[] = "this\0has embedded\0zero\0characters";
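To see what the embedded zeros do in practice, here is a small sketch built around that array. The array still holds every byte, but the string functions stop at the first zero. count_pieces and check_hard are hypothetical helpers written for this example, and count_pieces assumes the last byte of the buffer is a zero, which is true for an array initialized from a string literal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char hard[] = "this\0has embedded\0zero\0characters";

/* Hypothetical helper: counts the zero-terminated pieces stored in
   buf[0..size). Assumes buf[size - 1] is a zero byte. */
static size_t count_pieces(const char *buf, size_t size)
{
    size_t count = 0;
    size_t pos = 0;
    while (pos < size) {
        pos += strlen(buf + pos) + 1;  /* skip one piece and its terminator */
        count++;
    }
    return count;
}

static void check_hard(void)
{
    assert(sizeof hard == 34);   /* 30 letters and spaces, 3 embedded zeros, 1 terminator */
    assert(strlen(hard) == 4);   /* strlen stops after "this" */
    assert(count_pieces(hard, sizeof hard) == 4);
}
```

So sizeof sees the whole array, while strlen, printf("%s") and friends only ever see "this".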
Another sequence uses a backslash followed by the letter 'x' and one or more hexadecimal digits, to represent the character with the indicated code. Using this syntax, you could write the zero byte as '\x0' for instance. Note that the sequence is not limited to two digits: it consumes every hexadecimal digit that follows.
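One practical consequence of hex escapes taking every hex digit they can find: if the next characters of your string happen to be hex digits, they get swallowed into the escape. A common workaround, sketched here, is to end the literal right after the escape and let the compiler concatenate the adjacent literals. The names split and check_hex_escapes are made up for the example.

```c
#include <assert.h>

/* Written as one literal, "\x41BC" would be a single escape with the
   value 0x41BC, which does not fit in a char. Adjacent literals are
   concatenated after escapes are processed, so the escape below ends
   at the closing quote of the first literal. */
static const char split[] = "\x41" "BC";

static void check_hex_escapes(void)
{
    assert('\x0' == 0);
    assert('\x00' == '\x0');      /* leading zeros change nothing */
    assert(sizeof split == 4);    /* 0x41, 'B', 'C', terminator */
    assert(split[0] == 0x41);     /* the letter 'A' if the execution set is ASCII */
    assert(split[1] == 'B' && split[2] == 'C');
}
```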
EDIT: Re-reading the question, there's also support for such constants in base eight, i.e. octal. They use a backslash followed by one to three octal digits; unlike octal integer constants, no leading zero is required. '\0' is in fact the shortest octal escape, and '\00' and '\000' are synonyms for it.
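A few octal escapes side by side, as a rough sketch (all names here are invented for the example). The last two arrays show a trap worth knowing: digits that follow "\0" are absorbed into the escape, so a zero byte followed by the text "12" has to be written as two adjacent literals.

```c
#include <assert.h>

static const char nul_forms[3] = { '\0', '\00', '\000' };
static const char letter_a = '\101';              /* 65 decimal, 'A' in ASCII */
static const char lf_only[] = "\012";             /* ONE character, value 10 */
static const char nul_then_digits[] = "\0" "12";  /* zero byte, '1', '2' */

static void check_octal_escapes(void)
{
    assert(nul_forms[0] == 0 && nul_forms[1] == 0 && nul_forms[2] == 0);
    assert(letter_a == 65);
    assert(sizeof lf_only == 2);           /* the escape plus the terminator */
    assert(lf_only[0] == 10);
    assert(sizeof nul_then_digits == 4);   /* 0, '1', '2', terminator */
    assert(nul_then_digits[1] == '1');
}
```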
This is sometimes useful when you need to construct a string containing non-printing characters, or special control characters.
There's also a set of one-character "named" special characters, such as '\n' for newline, '\t' for TAB, and so on.
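For reference, the named escapes that standard C defines for control characters can be listed in a small lookup. This is just an illustrative sketch: escape_spelling is a hypothetical helper (handy when printing a string with its control characters made visible), not a library function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the source spelling of a named escape,
   or NULL when c is not one of the named control characters. */
static const char *escape_spelling(char c)
{
    switch (c) {
    case '\a': return "\\a";   /* alert (bell) */
    case '\b': return "\\b";   /* backspace */
    case '\f': return "\\f";   /* form feed */
    case '\n': return "\\n";   /* newline */
    case '\r': return "\\r";   /* carriage return */
    case '\t': return "\\t";   /* horizontal tab */
    case '\v': return "\\v";   /* vertical tab */
    default:   return NULL;
    }
}

static void check_escape_spelling(void)
{
    assert(escape_spelling('\n')[1] == 'n');
    assert(escape_spelling('x') == NULL);
}
```

The remaining simple escapes, '\\', '\'', '\"' and '\?', stand for printable characters and exist only so those characters can appear inside a literal.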
Those would be used to write otherwise nonprintable characters in the editor. For standard chars, that would be the various control characters; for wchar_t, it could be characters not represented in the editor font.
For instance, this compiles in Visual Studio 2005:
const wchar_t bom = L'\xfeff'; /* Unicode byte-order mark, U+FEFF */
const wchar_t hamza = L'\x0621'; /* Arabic Letter Hamza */
const char start_of_text = '\002'; /* Start-of-text */
const char end_of_text = '\003'; /* End-of-text */
Edit: Using octal character literals has an interesting caveat. An octal escape sequence can be at most three digits long, which restricts the characters we can enter this way to code points up to octal 777 (0x1ff).
For instance:
/* Letter schwa; capital unicode code point 0x018f (octal 0617)
* small unicode code point 0x0259 (octal 1131)
*/
const wchar_t Schwa2 = L'\x18f'; /* capital letter Schwa, correct */
const wchar_t Schwa1 = L'\617'; /* capital letter Schwa, correct */
const wchar_t schwa1 = L'\x259'; /* small letter schwa, correct */
const wchar_t schwa2 = L'\1131'; /* parsed as '\113' (letter K) followed by '1', incorrect */
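The three-digit cutoff is easiest to see in a wide string, where the leftover digit simply becomes a character of its own. A minimal sketch, assuming wchar_t is at least 16 bits wide (as it is on mainstream compilers); two_chars, small_schwa and check_octal_limit are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* The octal escape stops after three digits, so this string holds
   L'\113' followed by the ordinary digit L'1', plus the terminator. */
static const wchar_t two_chars[] = L"\1131";

/* The hex escape has no such limit and reaches U+0259 directly. */
static const wchar_t small_schwa = L'\x259';

static void check_octal_limit(void)
{
    assert(sizeof two_chars / sizeof two_chars[0] == 3);
    assert(two_chars[0] == 0113);   /* 75 decimal, 'K' in ASCII */
    assert(two_chars[1] == L'1');
    assert(small_schwa == 0x259);
    assert(0x259 == 01131);         /* the same number, but not writable as one octal escape */
}
```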