Is every "normal" use of user-defined literals undefined behavior?

User-defined literal suffixes must start with an underscore.

This is a more or less universally known rule that you can find on every layman-worded site talking about user-defined literals. It is also a rule which I (and possibly others?) have been blatantly ignoring ever since, on a "what a bullshit" basis. Now of course, that's strictly not correct. In the strictest sense, this uses a reserved identifier, and thus invokes Undefined Behavior (although in practice you don't get so much as a shrug from the compiler).

So, pondering whether I should continue to deliberately ignore that (in my opinion useless) part of the standard or not, I decided to look at what's actually written. Because, you know, what does it matter what everybody knows. What matters is what's written in the standard.

[over.literal] states that "some" literal suffix identifiers are reserved, linking to [usrlit.suffix]. The latter states that all are reserved, except those that start with an underscore. OK, so that's pretty much exactly what we already knew, explicitly written (or rather, written backwards).

Also, [over.literal] contains a Note which hints at an obvious but troubling thing:

except for the constraints described above, they are ordinary namespace-scope functions and function templates

Well, sure they are. Nowhere does it say that they aren't, so what else would you expect them to be.

But wait a moment. [lex.name] explicitly states that each identifier that begins with an underscore in the global namespace is reserved.

Now, unless you explicitly put it into a namespace (which, I believe, nobody does!?), a literal operator lives very much in the global namespace. So the name, which must begin with an underscore, is reserved. There is no mention of a special exception. So every name (with or without an underscore) is a reserved name.
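For concreteness, the kind of declaration being asked about might look like the sketch below. The suffix _km and the constant marathon are hypothetical names chosen only for illustration; the point is simply that the operator is declared at global scope and its suffix begins with an underscore.

```cpp
// Hypothetical suffix _km, used only to illustrate the question.
// Declared at global scope, with no space between "" and the suffix.
constexpr long double operator""_km(long double kilometres) {
    return kilometres * 1000.0L;  // convert kilometres to metres
}

// A use of the literal: the suffix selects the operator above.
constexpr long double marathon = 42.195_km;
```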

Are you indeed expected to put user defined literals into a namespace because the "normal" usage (underscore or not) is using a reserved name?

Damon asked Dec 04 '19 16:12



2 Answers

This is a good question, and I'm not sure about the answer, but I think the answer is "no, it's not UB" based on a particular reading of the standard.

[lex.name]/3.2 reads:

Each identifier that begins with an underscore is reserved to the implementation for use as a name in the global namespace.

Now, clearly, the restriction "as a name in the global namespace" should be read as applying to the entire rule, not just to how the implementation may use the name. That is, its meaning is not

"each identifier that begins with an underscore is reserved to the implementation, AND the implementation may use such identifiers as names in the global namespace"

but rather,

"the use of any identifier that begins with an underscore as a name in the global namespace is reserved to the implementation".

(If we believed the first interpretation, then it would mean that no one could declare a function called my_namespace::_foo, for example.)

Under the second interpretation, a declaration of operator""_foo in the global scope is legal, because such a declaration does not use _foo as a name. Rather, the identifier is just one part of the actual name, which is operator""_foo (and that name does not start with an underscore).
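A minimal sketch of the two cases discussed above, assuming the second interpretation holds. The function bodies and return values are made up for illustration; only the names my_namespace::_foo and operator""_foo come from the text.

```cpp
namespace my_namespace {
    // A leading-underscore identifier used as a name inside a namespace.
    // The first interpretation would forbid this; the second allows it.
    constexpr int _foo() { return 1; }
}

// At global scope, the declared name is operator""_foo, not _foo,
// so under the second interpretation no reserved name is declared.
constexpr int operator""_foo(unsigned long long n) {
    return static_cast<int>(n) + 1;  // arbitrary body for illustration
}
```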

Brian Bi answered Sep 20 '22 01:09


Is every “normal” use of user-defined literals undefined behavior?

Clearly not.

The following is the idiomatic (and thus definitely “normal”) use of UDLs, and it’s well-defined according to the rule you’ve just listed:

namespace si {
    struct metre { … };

    constexpr metre operator ""_m(long double value) { return metre{value}; }
}
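As a sketch of how such a namespaced literal might be used from outside the namespace: the member value is an assumption (the struct body is elided in the snippet above), and height() is a hypothetical helper. A using-declaration brings in just the literal operator without importing the rest of si.

```cpp
namespace si {
    struct metre {
        long double value;  // hypothetical member; the original elides the body
    };

    constexpr metre operator""_m(long double value) { return metre{value}; }
}

// Hypothetical helper showing a call site outside namespace si.
constexpr long double height() {
    using si::operator""_m;   // import only the literal operator
    return (1.5_m).value;     // parentheses needed: 1.5_m.value would lex as one token
}
```

A using-directive (using namespace si;) would work as well; the using-declaration merely keeps the import narrower.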

You’ve listed problematic cases, and I agree with your assessment of their validity, but they’re easily avoided in idiomatic C++ code. So I don’t entirely see the problem with the current wording, even if it was potentially accidental.

According to the example in [over.literal]/8, we can even use capital letters after the underscore:

float operator ""E(const char*);    // error: reserved literal suffix (20.5.4.3.5, 5.13.8)
double operator""_Bq(long double);  // OK: does not use the reserved identifier _Bq (5.10)
double operator"" _Bq(long double); // uses the reserved identifier _Bq (5.10)

The only problematic thing thus seems to be the fact that the standard makes the whitespace between "" and the UDL name significant.
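The whitespace point can be sketched as follows. The function body is an assumption added so the declaration can be exercised; the two spellings are the ones from the quoted example. The spaced form is left in a comment because _Bq on its own is an identifier starting with an underscore followed by an uppercase letter, which is reserved for any use.

```cpp
// No space: ""_Bq is a single token, so the identifier _Bq never appears.
constexpr double operator""_Bq(long double becquerels) {
    return static_cast<double>(becquerels);  // arbitrary body for illustration
}

// With a space, _Bq is lexed as a separate identifier (underscore followed
// by an uppercase letter), which is what the quoted example flags:
// double operator"" _Bq(long double);
```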

Konrad Rudolph answered Sep 23 '22 01:09