
Multicharacter literal in C and C++

I didn't know that C and C++ allow multicharacter literals: not 'c' (of type int in C and char in C++), but 'tralivali' (of type int!)

enum {
    ActionLeft = 'left',
    ActionRight = 'right',
    ActionForward = 'forward',
    ActionBackward = 'backward'
};

Standard says:

C99 6.4.4.4p10: "The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined."
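As one concrete data point, GCC documents its choice: it processes the constant one character at a time, shifting the previous value left by the width of a character and or-ing in the next one. The sketch below models that rule for 8-bit characters. `gcc_style_value` is a hypothetical helper written for illustration, not a standard function, and other compilers are free to pick a different rule.

```cpp
#include <cstdint>

// Hypothetical helper modelling GCC's documented rule for 8-bit chars:
// value = (value << 8) | next_char, applied to each character in turn.
// Unsigned arithmetic is used so that shifting never overflows a signed int.
constexpr std::uint32_t gcc_style_value(const char* s, std::uint32_t acc = 0) {
    return *s == '\0'
        ? acc
        : gcc_style_value(s + 1, (acc << 8) | static_cast<unsigned char>(*s));
}

// Assumes an ASCII execution character set ('a' == 0x61, 'l' == 0x6C, ...).
static_assert(gcc_style_value("ab") == 0x6162, "'a' lands in the higher byte");
static_assert(gcc_style_value("left") == 0x6C656674, "four chars fill 32 bits");
```

In this model a constant longer than four characters, such as 'forward', keeps only its last four characters, because the earlier ones are shifted out of the 32-bit value.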

I found that they are widely used in the C4 engine. But I suppose they are not safe when we are talking about platform-independent serialization. They can also be confusing because they look like strings. So what is the multicharacter literal's scope of usage? Are they useful for anything? Are they in C++ just for compatibility with C code? Are they considered a bad feature, like the goto statement, or not?

asked Oct 18 '10 16:10 by topright gamedev



2 Answers

It makes it easier to pick out values in a memory dump.

Example:

enum state { waiting, running, stopped }; 

vs.

enum state { waiting = 'wait', running = 'run.', stopped = 'stop' }; 

a memory dump after the following statement:

s = stopped; 

might look like:

00 00 00 02 . . . . 

in the first case, vs:

73 74 6F 70 s t o p 

using multicharacter literals (of course, whether it says 'stop' or 'pots' depends on byte ordering).
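The byte-ordering caveat can be checked with a small sketch. It assumes the value 0x73746F70, which is what a compiler that puts the first character in the most significant byte would give 'stop'. The names `kStop`, `bytes_in_memory`, `is_little_endian` and `check_dump` are made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Assumed value of 'stop': 's' in the most significant byte, 'p' in the least.
constexpr std::uint32_t kStop = 0x73746F70;

// Copies the object representation of a 32-bit value into a string,
// i.e. the bytes in the order a memory dump would show them.
std::string bytes_in_memory(std::uint32_t value) {
    char raw[sizeof value];
    std::memcpy(raw, &value, sizeof value);
    return std::string(raw, sizeof raw);
}

// True when the least significant byte is stored at the lowest address.
bool is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Usage: the dump reads "pots" on little-endian hosts and "stop" otherwise.
void check_dump() {
    assert(bytes_in_memory(kStop) == (is_little_endian() ? "pots" : "stop"));
}
```

So on a typical x86 machine the dump would show 70 6F 74 73 rather than 73 74 6F 70, which is still easy to spot once you know to read it backwards.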

answered Sep 22 '22 13:09 by Ferruccio


I don't know how extensively this is used, but "implementation-defined" is a big red flag to me. As far as I know, it means the implementation could choose to ignore your character designations and just assign normal incrementing values if it wanted. It may do something "nicer", but you can't rely on that behavior across compilers (or even compiler versions). At least "goto" has predictable (if undesirable) behavior...

That's my 2c, anyway.

Edit: on "implementation-defined":

From Bjarne Stroustrup's C++ Glossary:

implementation defined - an aspect of C++'s semantics that is defined for each implementation rather than specified in the standard for every implementation. An example is the size of an int (which must be at least 16 bits but can be longer). Avoid implementation defined behavior whenever possible. See also: undefined. TC++PL C.2.

also...

undefined - an aspect of C++'s semantics for which no reasonable behavior is required. An example is dereferencing a pointer with the value zero. Avoid undefined behavior. See also: implementation defined. TC++PL C.2.

I believe this means the comment is correct: it should at least compile, although anything beyond that is not specified. Note the advice in the definition, also.
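If the readable tags are wanted without the implementation-defined part, one option is to pack the characters explicitly and to serialize them in a fixed byte order. The following is only a sketch under that assumption: `make_tag`, `write_tag`, `read_tag`, `self_check` and the `Action` enum are hypothetical names, not something taken from the C4 engine or any library.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: packs four characters into a 32-bit tag with an
// explicitly chosen layout (first character in the most significant byte),
// so the value does not depend on the compiler's multicharacter rule.
constexpr std::uint32_t make_tag(char a, char b, char c, char d) {
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(a)) << 24) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 8)  |
            static_cast<std::uint32_t>(static_cast<unsigned char>(d));
}

enum Action : std::uint32_t {
    ActionLeft = make_tag('l', 'e', 'f', 't')
};

// Serializes a tag most significant byte first, independent of host byte order.
void write_tag(std::uint32_t tag, unsigned char out[4]) {
    out[0] = static_cast<unsigned char>(tag >> 24);
    out[1] = static_cast<unsigned char>(tag >> 16);
    out[2] = static_cast<unsigned char>(tag >> 8);
    out[3] = static_cast<unsigned char>(tag);
}

// Reads back a tag that was written by write_tag.
std::uint32_t read_tag(const unsigned char in[4]) {
    return (static_cast<std::uint32_t>(in[0]) << 24) |
           (static_cast<std::uint32_t>(in[1]) << 16) |
           (static_cast<std::uint32_t>(in[2]) << 8)  |
            static_cast<std::uint32_t>(in[3]);
}

// Usage: the serialized bytes spell the tag in reading order on any host.
void self_check() {
    unsigned char buf[4];
    write_tag(ActionLeft, buf);
    assert(buf[0] == 'l' && buf[3] == 't');
    assert(read_tag(buf) == ActionLeft);
}
```

The cost is a slightly noisier spelling than 'left', and tags are limited to four characters, but the numeric value and the on-disk bytes are then fixed by the code rather than by the compiler.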

answered Sep 21 '22 13:09 by Nick