 

How do C++ compilers distinguish the >> token as a binary operator from >> closing a template?

Tags: c++, parsing, c++11

My question is about the parser in C++ compilers such as Clang: how does the compiler know when >> is a binary operator and when it closes two template argument lists, as in std::vector<std::tuple<int, double>>? I imagine this is resolved at parse time. Is it better to solve it in the lexer, or to emit only > tokens and resolve the problem in the grammar?

Alex asked Jun 07 '18 14:06




1 Answer

It's actually quite simple: if there is an open template bracket visible, a > closes it, even if the > would otherwise form part of a >> operator. (This doesn't apply to > characters that are part of other tokens, such as >=.) This change to C++ syntax was part of C++11 and is described in paragraph 3 of [temp.names], which is §14.2 in the C++11 standard (later editions renumber the section; it is §13.3 in C++20).

An open template bracket is not visible if the > is nested inside parentheses, square brackets or braces. So the >> in both T<sizeof a[x >> 1]> and T<(x >> 1)> is a right-shift operator, while T<x >> 1> probably does not parse as expected, because the first > closes the template argument list.
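To make the rule concrete, here is a small sketch that should compile as C++11 or later. The names T, Rows, x, a and checkShiftExamples are made up for illustration and are not taken from the standard or from any compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <vector>

// Hypothetical template used only to illustrate the rule.
template <std::size_t N>
struct T {
    static constexpr std::size_t value = N;
};

constexpr int x = 8;
int a[16];

// Since C++11, the >> here is read as two closing template brackets.
using Rows = std::vector<std::tuple<int, double>>;
static_assert(std::is_same<Rows::value_type, std::tuple<int, double>>::value,
              "the >> closed both template argument lists");

// Inside parentheses or square brackets, >> is still a right shift.
static_assert(T<(x >> 1)>::value == 4, "8 >> 1 is 4");
static_assert(T<sizeof a[x >> 1]>::value == sizeof(int),
              "sizeof one element of a");

// T<x >> 1> is not written out here: the first > would close T<x>,
// leaving "> 1>" behind, which is not what the author of such code wants.

// Runtime spot check of the same facts.
inline void checkShiftExamples() {
    assert(T<(x >> 1)>::value == 4);
    assert(T<sizeof a[x >> 1]>::value == sizeof(int));
}
```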

The two implementation strategies are both workable, depending on where you want to put the complexity. If the lexer never generates a >> token, the parser can check that the > tokens in expr '>' '>' expr are adjacent by looking at their source locations. There will be a shift-reduce conflict, which has to be resolved in favour of reducing the template parameter list. This works because separating >> into two tokens happens not to create any ambiguity, but that's not a general rule: a + ++ b is different from a ++ + b, so if the lexer only generated + tokens, that would be ambiguous.
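A minimal sketch of the first strategy, assuming a toy lexer that records a source offset for each token. Token, lex and formsRightShift are hypothetical names for this illustration; a real compiler's token and location types are considerably richer.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

struct Token {
    std::string text;    // spelling of the token
    std::size_t offset;  // offset of its first character in the source
};

// Toy lexer: words (letters, digits, underscore) become one token, and
// every other non-space character becomes its own single-character token.
// In particular it never produces a ">>" token.
std::vector<Token> lex(const std::string& src) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        if (std::isalnum(c) || c == '_') {
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_'))
                ++i;
        } else {
            ++i;
        }
        out.push_back(Token{src.substr(start, i - start), start});
    }
    return out;
}

// Parser-side check: two '>' tokens can only be combined into a right
// shift if they were written with nothing between them.
bool formsRightShift(const Token& first, const Token& second) {
    return first.text == ">" && second.text == ">" &&
           second.offset == first.offset + 1;
}
```

With this sketch, the two > tokens of "x >> 1" pass the adjacency check, while those of "x > > 1" do not, so only the former could be reduced as a shift expression.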

It's not too complicated to resolve the issue with a lexer hack, if you are prepared to have your lexer track parenthesis depth. That means the lexer has to know whether a < is a template bracket or a comparison operator, but it is quite possible that it does.
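A rough sketch of such a lexer hack, under strong assumptions: the caller supplies the set of names that designate templates (standing in for real semantic feedback), and only the tokens >, >> and >= are treated specially. lexWithHack is a hypothetical name, and this is not how any particular compiler is implemented.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy lexer that tracks ( [ { nesting depth and open template brackets.
// ASSUMPTION: `templates` lists every identifier that names a template.
std::vector<std::string> lexWithHack(const std::string& src,
                                     const std::set<std::string>& templates) {
    std::vector<std::string> out;
    std::vector<int> open;  // nesting depth at which each template '<' was seen
    int depth = 0;          // current ( [ { nesting depth
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isalnum(c) || c == '_') {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_'))
                ++i;
            out.push_back(src.substr(start, i - start));
            continue;
        }
        if (c == '(' || c == '[' || c == '{') {
            ++depth;
        } else if (c == ')' || c == ']' || c == '}') {
            --depth;
        } else if (c == '<' && !out.empty() && templates.count(out.back())) {
            open.push_back(depth);  // a template bracket opens at this depth
        } else if (c == '>') {
            bool nextGt = i + 1 < src.size() && src[i + 1] == '>';
            bool nextEq = i + 1 < src.size() && src[i + 1] == '=';
            if (nextEq) { out.push_back(">="); i += 2; continue; }
            if (!open.empty() && open.back() == depth) {
                open.pop_back();  // visible open bracket: emit a lone ">"
            } else if (nextGt) {
                out.push_back(">>"); i += 2; continue;
            }
        }
        out.push_back(std::string(1, src[i]));
        ++i;
    }
    return out;
}
```

Under these assumptions, "T<(x >> 1)>" yields a single >> token inside the parentheses, while "vector<tuple<int, double>> v" yields two separate > tokens.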

This is the more interesting question (at least imho): how is a < recognised as a template bracket rather than a less-than operator? Here there really is semantic feedback: it is a template bracket if it follows a name which designates a template.

This is not a simple determination. The name could be a class or union member, and even a member of a specialisation of a templated class or union. In the latter case, it might be necessary to calculate the values of compile-time constant expressions and then do template deduction in order to decide what the name designates.
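The following sketch illustrates why this needs semantic feedback. The same token sequence, f < 3 > (0), is a call to a member template in one specialisation and a pair of comparisons in another, and the compiler has to evaluate a constant expression to find out which specialisation is meant. S, kWide, viaTemplate, viaValue and checkBothReadings are hypothetical names chosen for this example; a compiler may warn about the chained comparison.

```cpp
#include <cassert>

// Primary template: f is a member function template.
template <bool Wide>
struct S {
    template <int N>
    static constexpr int f(int x) { return N + x; }
};

// Explicit specialisation: f is a plain integer constant.
template <>
struct S<true> {
    static constexpr int f = 10;
};

// The compiler must evaluate this before it can classify the '<' below.
constexpr bool kWide = sizeof(char) > 1;  // always false

// S<false>::f names a template, so '<' opens a template argument list:
// this is the call f<3>(0), which returns 3.
constexpr int viaTemplate = S<kWide>::f < 3 > (0);

// S<true>::f names an int, so '<' and '>' are comparisons:
// (10 < 3) > 0, which is false.
constexpr bool viaValue = S<!kWide>::f < 3 > (0);

inline void checkBothReadings() {
    assert(viaTemplate == 3);
    assert(!viaValue);
}
```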

rici answered Nov 04 '22 02:11