I am confused by these four terms:
character string literal
character constants
string literal
multibyte character sequence
And I am reading this quote in the C Standard:
A character string literal need not be a string (see 7.1.1), because a null character may be embedded in it by a \0 escape sequence.
What is meant by the first part?
A string-literal is, for example, "abc", u8"abc", or L"abc".
From the standard (emphasis mine):
A character string literal is a sequence of zero or more multibyte characters enclosed in double-quotes, as in "xyz". A UTF-8 string literal is the same, except prefixed by u8. A wide string literal is the same, except prefixed by the letter L, u, or U.
....
In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. 78)
78) A string literal need not be a string (see 7.1.1), because a null character may be embedded in it by a \0 escape sequence.
A string is a contiguous sequence of characters terminated by and including the first null character.
So a string literal may also have \0 in the middle or even at the beginning, for instance "a\0b" or "\0ab". I think this is what the footnote is saying.
A character constant is a c-char-sequence (usually a single character) in single quotes, with a possible prefix L, u, or U.
An integer character constant is a sequence of one or more multibyte characters enclosed in single-quotes, as in 'x'. A wide character constant is the same, except prefixed by the letter L, u, or U.
So the terminology is not very symmetric, IMO. E.g. a wide character constant is a particular case of a character constant. However, both character string literals and wide string literals belong to string literals.