
How does a backslash-newline combo affect the value of the C preprocessor's __LINE__ macro?

Per the C standard, collapsing multiple physical lines joined by a backslash-newline sequence is an earlier phase of translation than executing the preprocessor.

Assuming no complications from an earlier #line directive, does the value of the __LINE__ macro reflect the physical line number, counted before such lines are spliced? That is the number you would find by inspecting the source manually or that a text editor would report, and it would probably be the more useful alternative. Or does it reflect the line number after splicing, which is presumably what the preprocessor actually sees, given the order of translation phases specified in the standard?

(And if I understand correctly--which I very well may not--the preprocessor would have no way of knowing whether a given line was the product of splicing or not.)

Kurt Weber asked Dec 02 '20 03:12


1 Answer

Compilers implement __LINE__ by remembering physical line numbers in ways not specified by the C standard.
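A quick way to observe this is to compare __LINE__ just before and just after a spliced line. The following is only a minimal sketch, assuming GCC or Clang; the names LINE_BEFORE, LINE_AFTER, spliced_sum, line_gap, and report_line_gap are made up for illustration, and other compilers may behave differently.

```c
#include <assert.h>
#include <stdio.h>

enum { LINE_BEFORE = __LINE__ };   /* some physical line N */
const int spliced_sum = 1 + \
                        2;         /* physical lines N+1 and N+2, one logical line */
enum { LINE_AFTER = __LINE__ };    /* physical line N+3 */

/* 3 if lines are counted physically, 2 if they were counted after splicing. */
int line_gap(void)
{
    return LINE_AFTER - LINE_BEFORE;
}

/* Prints the observed gap; the splice itself does not change the value 1 + 2. */
void report_line_gap(void)
{
    assert(spliced_sum == 3);
    printf("__LINE__ advanced by %d across the spliced line\n", line_gap());
}
```

Calling report_line_gap() from main shows which way a given compiler counts. A gap of 3 means the line after the splice still carries its physical line number.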

C 2018 6.10.8.1 1 tells us __LINE__ is replaced by “The presumed line number (within the current source file) of the current source line (an integer constant).” This specification is vague and cannot be implemented in a useful way while adhering to the standard literally.

Consider this code:

#define Assert(test) do { if (!(test)) printf("Assertion on line %d failed.\n", __LINE__); } while (0)

... Many lines of code follow, including some with line splicing.

    Assert(condition);

... Many lines of code.

To be useful, this code must print the physical line number on which the Assert is used. It needs to be the physical line number so that the user can locate the line in a text editor, and it needs to be the line on which the Assert macro is replaced, not defined, because that is where the problem is detected. Both GCC and Clang do this.
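The behavior described above can be checked with a small self-test. This is a sketch under assumptions, not part of the original code: last_failed_line and assert_line_offset are hypothetical helpers, and the macro is extended to record the line it reports. The macro definition is deliberately split across several spliced lines so that a compiler counting logical lines would be noticed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: line reported by the most recent failed Assert (0 = none). */
static int last_failed_line = 0;

/* The Assert macro from above, extended to record the line it reports. */
#define Assert(test) \
    do { \
        if (!(test)) { \
            last_failed_line = __LINE__; \
            printf("Assertion on line %d failed.\n", __LINE__); \
        } \
    } while (0)

/* Returns (reported line) - (line where Assert is used); 0 means they match. */
int assert_line_offset(void)
{
    int use_line = __LINE__ + 1;   /* the Assert below is on the next physical line */
    Assert(1 + 1 == 3);            /* deliberately false */
    assert(last_failed_line != 0); /* the check above must have fired */
    return last_failed_line - use_line;
}
```

With GCC and Clang, assert_line_offset() is expected to return 0, because __LINE__ inside the expansion takes the line of the macro's use rather than the lines of its definition.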

However, this requires that the physical line number from before line splicing be provided during macro replacement, which occurs after line splicing. In C 2018 5.1.1.2 1, the standard specifies a translation model in which:

  • in phase 2, “Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines,” and,
  • in phase 3, “The source file is decomposed into preprocessing tokens and sequences of white-space characters (including comments),” where new-line characters are retained except for those deleted in phase 2, and,
  • in phase 4, macro invocations are expanded.

So, if a compiler replaces a __LINE__ macro in phase 4 and literally has only the preprocessing tokens and remaining white-space characters, it cannot know the physical line number to provide.

Therefore, a compiler cannot be implemented literally following the standard’s model of translation. To be useful, it must associate a physical line number with each preprocessing token that could be a macro name. Whenever a macro is replaced, it must propagate the associated physical line number. Then, when a __LINE__ token is finally replaced, the compiler will have the associated physical line number to replace it with.
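To make the idea concrete, here is a toy sketch of phase 2 that remembers where each surviving character came from. It is an illustration under stated assumptions, not how GCC or Clang are actually implemented: it tracks physical lines per character rather than per preprocessing token, it ignores trigraphs, and every name in it (spliced_source, splice_lines, physical_line_of, logical_line_of, MAX_SOURCE) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_SOURCE = 1024 };

struct spliced_source {
    char   text[MAX_SOURCE];     /* source after backslash-newline removal */
    int    line_of[MAX_SOURCE];  /* physical line (1-based) of each kept character */
    size_t length;
};

/* Splices lines as in phase 2 while recording physical line numbers.
   Returns 0 on success, -1 if the source does not fit. */
int splice_lines(const char *source, struct spliced_source *out)
{
    assert(source != NULL && out != NULL);
    int physical_line = 1;
    size_t n = 0;
    for (size_t i = 0; source[i] != '\0'; i++) {
        if (source[i] == '\\' && source[i + 1] == '\n') {
            physical_line++;     /* the deleted new-line still ends a physical line */
            i++;                 /* skip the new-line too */
            continue;
        }
        if (n + 1 >= MAX_SOURCE)
            return -1;
        out->text[n] = source[i];
        out->line_of[n] = physical_line;
        n++;
        if (source[i] == '\n')
            physical_line++;
    }
    out->text[n] = '\0';
    out->length = n;
    return 0;
}

/* Physical line of the first occurrence of name, or 0 if it is absent. */
int physical_line_of(const struct spliced_source *src, const char *name)
{
    const char *hit = strstr(src->text, name);
    return hit ? src->line_of[hit - src->text] : 0;
}

/* Line number obtained by counting only the new-lines that survived splicing. */
int logical_line_of(const struct spliced_source *src, const char *name)
{
    const char *hit = strstr(src->text, name);
    if (hit == NULL)
        return 0;
    int line = 1;
    for (const char *p = src->text; p < hit; p++)
        if (*p == '\n')
            line++;
    return line;
}
```

For a three-line input whose first line ends in a backslash and whose third line contains __LINE__, physical_line_of reports 3 while logical_line_of reports 2. That gap is the information a later phase cannot recover unless it was carried along from the start.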

Eric Postpischil answered Sep 29 '22 22:09