
Can user defined numeric literals be immediately followed by a dot? [duplicate]

Since C++11, it has been possible to create user-defined literals. As expected, such literals can return complex structs. However, calling a member function directly on a numeric literal, as in 123_foo.bar(), does not work on every compiler:

struct foo {
    int n;
    int bar() const { return n; }
};

constexpr foo operator ""_foo(unsigned long long test)
{
    return foo{ static_cast<int>(test) };
}

int main() {
    return 123_foo.bar();
}

GCC and Clang reject it, saying they can't find an operator""_foo.bar. MSVC accepts it. If I instead write 123_foo .bar(), with a space before the dot, all three compilers accept it.

Who is right here? Is 123_foo.bar() ever valid?


Some extra information:

  • All three accept it for string literals
  • The problem exists for std::chrono literals as well

I'm inclined to believe that this is a GCC and Clang bug, as . is not part of a valid identifier.

Justin asked Mar 01 '18 07:03



1 Answer

TL;DR: Clang and GCC are correct. You can't write a . immediately after a user-defined integer or floating-point literal; this is an MSVC bug.
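As a minimal sketch of the usual workarounds, reusing the foo type and _foo suffix from the question: a space or a pair of parentheses ends the numeric token before the dot. The function names with_space, with_parens and verify_workarounds are made up for this illustration.

```cpp
#include <cassert>

struct foo {
    int n;
    constexpr int bar() const { return n; }
};

constexpr foo operator""_foo(unsigned long long value) {
    return foo{ static_cast<int>(value) };
}

// A space ends the preprocessing number before the dot,
// so the lexer sees 123_foo, then ., then bar.
constexpr int with_space() { return 123_foo .bar(); }

// Parentheses have the same effect: ) cannot be part of a pp-number.
constexpr int with_parens() { return (123_foo).bar(); }

// Hypothetical helper that checks both spellings at run time.
inline void verify_workarounds() {
    assert(with_space() == 123);
    assert(with_parens() == 123);
}
```

Both spellings should compile on GCC, Clang and MSVC, since neither relies on the dot being split off from the literal.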

When a program is compiled, it goes through 9 phases of translation in order. The key point is that lexing (separating the source code into tokens) happens before any semantic meaning is taken into account.

In this phase, maximal munch is in effect: each token is taken as the longest sequence of characters that can form a valid preprocessing token. For example, x+++++y is lexed as x ++ ++ + y instead of x + ++ ++ y, even though the former isn't semantically valid.
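Maximal munch can be observed with a shorter expression that does compile. The sketch below uses a made-up function name, munch_demo, and relies only on x+++y being lexed as x ++ + y.

```cpp
#include <cassert>

// x+++y is lexed as x ++ + y, i.e. (x++) + y, never as x + (++y):
// the lexer grabs ++ as soon as it can, without asking which split
// the programmer might have meant.
inline int munch_demo(int& x, int& y) {
    return x+++y;
}

// Hypothetical helper showing the observable effect.
inline void verify_munch() {
    int x = 1;
    int y = 10;
    int result = munch_demo(x, y);
    assert(result == 11); // old value of x plus y
    assert(x == 2);       // x was post-incremented
    assert(y == 10);      // y was left alone
}
```

If the expression were read as x + (++y), the result would be 12 and y would change instead of x.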

The question is then what the longest syntactically valid sequence is for 123_foo.bar. Following the production rules for a preprocessing number, the exact derivation is:

pp-number → pp-number identifier-nondigit → ... → pp-number identifier-nondigit³ →
pp-number nondigit³ → pp-number . nondigit³ → ... → pp-number nondigit⁴ . nondigit³ →
pp-number digit nondigit⁴ . nondigit³ → ... → pp-number digit² nondigit⁴ . nondigit³ →
digit³ nondigit⁴ . nondigit³

This yields 123_foo.bar as a single preprocessing number, which is why the error message mentions operator""_foo.bar.
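The same rules can be approximated with a small scanner. This is a simplified sketch, not how any compiler implements it: pp_number_length is a made-up name, it handles ASCII only, and it ignores universal-character-names and digit separators. The p/P exponent form is included as in C++17.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Returns the length of the longest prefix of `s` that forms a pp-number,
// or 0 if `s` does not start with one (simplified, see assumptions above).
inline std::size_t pp_number_length(const std::string& s) {
    auto is_digit = [](char c) {
        return std::isdigit(static_cast<unsigned char>(c)) != 0;
    };
    auto is_nondigit = [](char c) {
        return c == '_' || std::isalpha(static_cast<unsigned char>(c)) != 0;
    };

    std::size_t i = 0;
    // Base cases: "digit" or ". digit".
    if (!s.empty() && is_digit(s[0])) {
        i = 1;
    } else if (s.size() >= 2 && s[0] == '.' && is_digit(s[1])) {
        i = 2;
    } else {
        return 0;
    }

    // Recursive cases: keep extending while any production applies.
    while (i < s.size()) {
        char c = s[i];
        bool exponent = c == 'e' || c == 'E' || c == 'p' || c == 'P';
        if (exponent && i + 1 < s.size() &&
            (s[i + 1] == '+' || s[i + 1] == '-')) {
            i += 2; // pp-number e sign, and the E / p / P variants
        } else if (is_digit(c) || is_nondigit(c) || c == '.') {
            i += 1; // pp-number digit / identifier-nondigit / .
        } else {
            break;
        }
    }
    return i;
}

// Hypothetical helper exercising the scanner on the inputs from the question.
inline void verify_scanner() {
    assert(pp_number_length("123_foo.bar()") == 11); // swallows ".bar"
    assert(pp_number_length("123_foo .bar()") == 7); // the space stops it
}
```

The scanner never asks whether _foo.bar is a meaningful suffix. It only checks which characters may extend the token, which is exactly why the dot and bar end up inside the literal.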

Passer By answered Sep 22 '22 03:09