
Promotion when evaluating constant integer expressions in preprocessor directives - GCC

NOTE: See my edits below.

ORIGINAL QUESTION:

Came across some curious behaviour which I cannot reconcile:

#if -5 < 0
#warning Good, -5 is less than 0.
#else
#error BAD, -5 is NOT less than 0.
#endif

#if -(5u) < 0
#warning Good, -(5u) is less than 0.
#else
#error BAD, -(5u) is less than 0.
#endif

#if -5 < 0u
#warning Good, -5 is less than 0u.
#else
#error BAD, -5 is less than 0u.
#endif

When compiled:

$ gcc -Wall -o pp_test.elf pp_test.c
pp_test.c:2:6: warning: #warning Good, -5 is less than 0.
pp_test.c:10:6: error: #error BAD, -(5u) is less than 0.
pp_test.c:13:9: warning: the left operand of "<" changes sign when promoted
pp_test.c:16:6: error: #error BAD, -5 is less than 0u.

This suggests that the preprocessor follows different type promotion rules when evaluating constant integer expressions. Namely that, when an operator has operands of mixed sign, the signed operand is changed to an unsigned operand. The opposite is (generally) true in C.

I can find nothing in the literature to support this, but it's possible (likely?) that I haven't been thorough enough. Have I missed something? Is this behaviour correct?

As it stands, it seems as though any conditional expression in an #if or #elif directive which involves an explicitly unsigned integer constant may fail to behave as expected, i.e. as it would in C.
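For comparison, here is a minimal sketch of the same three comparisons evaluated by the compiler proper rather than by the preprocessor. It assumes a C11 compiler (for static_assert), and the function names are invented for illustration. With both operands at the rank of int, the usual arithmetic conversions already convert the signed operand to unsigned, so for this simplified example the compiler should agree with the #if results:

```c
#include <assert.h>
#include <stdbool.h>

/* int vs int: an ordinary signed comparison. */
static_assert(-5 < 0, "-5 is less than 0");

/* -(5u) has type unsigned int, so 0 is converted to unsigned int,
   and no unsigned value is less than zero. */
static_assert(!(-(5u) < 0), "-(5u) is not less than 0");

/* int vs unsigned int (same rank): -5 is converted to unsigned int. */
static_assert(!(-5 < 0u), "-5 is not less than 0u");

/* Run-time versions of the same comparisons (hypothetical helper names). */
bool signed_vs_signed(void)   { return -5 < 0; }
bool negated_unsigned(void)   { return -(5u) < 0; }
bool signed_vs_unsigned(void) { return -5 < 0u; }
```

A compiler may warn that the unsigned comparisons are always false; that warning is the point of the example.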


EDIT: As per my comments in Sourav Ghosh's answer, my confusion originally stemmed from expressions which included constants specified with L and LL suffixes. The example code I included in my original question was too simplified. Here is a better example:

#if -5LL < 0L
#warning Good, -5LL is less than 0L.
#else
#error BAD, -5LL is NOT less than 0L.
#endif

#if -(5uLL) < 0L
#warning Good, -(5uLL) is less than 0L.
#else
#error BAD, -(5uLL) is less than 0L.
#endif

#if -5LL < 0uL
#warning Good, -5LL is less than 0uL.
#else
#error BAD, -5LL is less than 0uL.
#endif

Building:

$ gcc -Wall -o pp_test.elf pp_test.c
pp_test.c:2:6: warning: #warning Good, -5LL is less than 0L.
pp_test.c:10:6: error: #error BAD, -(5uLL) is less than 0L.
pp_test.c:13:9: warning: the left operand of "<" changes sign when promoted
pp_test.c:16:6: error: #error BAD, -5LL is less than 0uL.

This seems to violate the clause in 6.3.1.8 subsequent to the one posted by Sourav Ghosh (my emphasis):

Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.

It seems to violate this clause because -5LL has a rank which is higher than 0uL, and because the type of the first (signed long long) can indeed represent all of the values of the type of the second (unsigned long). The catch is, the preprocessor doesn't know this.
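Whether the compiler itself takes this branch of 6.3.1.8 depends on the target: long long can represent every unsigned long value only where unsigned long is narrower than long long (for example, a 32-bit long), and not on targets where both are 64 bits wide. The following is a hedged sketch that records what each stage decides for -5LL < 0uL. PP_LESS and the function names are made up for this example:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Let the preprocessor evaluate the comparison and remember the outcome.
   GCC is expected to warn here that the left operand changes sign. */
#if -5LL < 0uL
#define PP_LESS 1
#else
#define PP_LESS 0
#endif

/* In #if, both operands share one rank, so the signed one becomes unsigned. */
static_assert(!PP_LESS, "the preprocessor never sees -5LL as less than 0uL");

bool preprocessor_says_less(void) { return PP_LESS; }

/* The compiler applies the usual arithmetic conversions to the real types. */
bool compiler_says_less(void) { return -5LL < 0uL; }

/* True only where long long can hold every unsigned long value. */
bool long_long_covers_unsigned_long(void) { return LLONG_MAX >= ULONG_MAX; }
```

On a target with a 32-bit long, compiler_says_less() should return true while preprocessor_says_less() returns false; where long is 64 bits, both return false.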

As mentioned in https://gcc.gnu.org/onlinedocs/gcc-3.0.2/cpp_4.html (my emphasis):

The preprocessor calculates the value of expression. It carries out all calculations in the widest integer type known to the compiler; on most machines supported by GCC this is 64 bits. This is not the same rule as the compiler uses to calculate the value of a constant expression, and may give different results in some cases. If the value comes out to be nonzero, the `#if' succeeds and the controlled text is included; otherwise it is skipped.

What seems to be implied by "carries out all calculations in the widest integer type known to the compiler" is that the operands themselves are treated as though they are specified as that same 'widest' type. In other words, -5 and -5L are treated as though they are -5LL, and 0u and 0uL are treated as though they are 0uLL. This activates the clause quoted by Sourav Ghosh, and leads to the observed behaviour.
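One way to see the "widest integer type" rule in isolation, without mixing signs at all, is unsigned wrap-around. This is only a sketch, assuming uintmax_t is wider than 32 bits (as the quoted GCC passage suggests for most targets); PP_WRAPS and the function names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static_assert(UINTMAX_MAX > 4294967295u,
              "sketch assumes uintmax_t is wider than 32 bits");

/* In #if, both constants act as uintmax_t, so the sum does not wrap. */
#if 4294967295u + 1u == 0
#define PP_WRAPS 1
#else
#define PP_WRAPS 0
#endif

bool preprocessor_wraps(void) { return PP_WRAPS; }

/* In the compiler, 4294967295u gets the first unsigned type that can hold
   it. Where that type is 32 bits wide, adding 1 wraps around to 0. */
bool compiler_wraps(void) { return 4294967295u + 1u == 0; }
```

The same source text therefore yields two different values depending on which stage evaluates it, even though every operand is unsigned.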

In effect, there is only one rank as far as the preprocessor is concerned, so type promotion rules which depend upon operands with different rank are ignored. Is this not indeed different from how the compiler evaluates expressions?


EDIT #2: Here's a real-world example of how the same expression is evaluated differently by the preprocessor than it is by the compiler (taken from Optiboot).

#ifndef BAUD_RATE
#if F_CPU >= 8000000L
#define BAUD_RATE   115200L
#elif F_CPU >= 1000000L
#define BAUD_RATE   9600L
#elif F_CPU >= 128000L
#define BAUD_RATE   4800L
#else
#define BAUD_RATE 1200L
#endif
#endif

#ifndef UART
#define UART 0
#endif

#define BAUD_SETTING (( (F_CPU + BAUD_RATE * 4L) / ((BAUD_RATE * 8L))) - 1 )
#define BAUD_ACTUAL (F_CPU/(8 * ((BAUD_SETTING)+1)))
#define BAUD_ERROR (( 100*(BAUD_ACTUAL - BAUD_RATE) ) / BAUD_RATE)

#if BAUD_ERROR >= 5
#error BAUD_RATE error greater than 5%
#elif (BAUD_ERROR + 5) <= 0
#error BAUD_RATE error greater than -5%
#elif BAUD_ERROR >= 2
#warning BAUD_RATE error greater than 2%
#elif (BAUD_ERROR + 2) <= 0
#warning BAUD_RATE error greater than -2%
#endif

volatile long long int baud_setting = BAUD_SETTING;
volatile long long int baud_actual = BAUD_ACTUAL;
volatile long long int baud_error = BAUD_ERROR;

void foo(void) {
  baud_setting = BAUD_SETTING;
  baud_actual = BAUD_ACTUAL;
  baud_error = BAUD_ERROR;
}

Building for an AVR target:

$ avr-gcc -Wall -c -g -save-temps -o optiboot_pp_test.elf -DF_CPU=8000000L optiboot_pp_test.c

Note how F_CPU was specified as a signed constant.

optiboot_pp_test.c:28:6: warning: #warning BAUD_RATE error greater than -2% [-Wcpp]
     #warning BAUD_RATE error greater than -2%

This works as expected. Examining the object file:

      baud_setting = BAUD_SETTING;
   8:   88 e0           ldi     r24, 0x08       ; 8
   a:   90 e0           ldi     r25, 0x00       ; 0
   c:   a0 e0           ldi     r26, 0x00       ; 0
   e:   b0 e0           ldi     r27, 0x00       ; 0
  10:   80 93 00 00     sts     0x0000, r24
  14:   90 93 00 00     sts     0x0000, r25
  18:   a0 93 00 00     sts     0x0000, r26
  1c:   b0 93 00 00     sts     0x0000, r27
      baud_actual = BAUD_ACTUAL;
  20:   87 e0           ldi     r24, 0x07       ; 7
  22:   92 eb           ldi     r25, 0xB2       ; 178
  24:   a1 e0           ldi     r26, 0x01       ; 1
  26:   b0 e0           ldi     r27, 0x00       ; 0
  28:   80 93 00 00     sts     0x0000, r24
  2c:   90 93 00 00     sts     0x0000, r25
  30:   a0 93 00 00     sts     0x0000, r26
  34:   b0 93 00 00     sts     0x0000, r27
      baud_error = BAUD_ERROR;
  38:   8d ef           ldi     r24, 0xFD       ; 253
  3a:   9f ef           ldi     r25, 0xFF       ; 255
  3c:   af ef           ldi     r26, 0xFF       ; 255
  3e:   bf ef           ldi     r27, 0xFF       ; 255
  40:   80 93 00 00     sts     0x0000, r24
  44:   90 93 00 00     sts     0x0000, r25
  48:   a0 93 00 00     sts     0x0000, r26
  4c:   b0 93 00 00     sts     0x0000, r27

... shows that the expected values are assigned. Namely, baud_setting gets 8, baud_actual gets 111111, and baud_error gets -3.

Now we build with F_CPU defined as an unsigned constant (as is customary on this target):

$ avr-gcc -Wall -c -g -save-temps -o optiboot_pp_test.elf -DF_CPU=8000000UL optiboot_pp_test.c 
optiboot_pp_test.c:22:6: error: #error BAUD_RATE error greater than 5%
     #error BAUD_RATE error greater than 5%

The reported error is of the wrong magnitude, and the wrong sign.

Examination of the object file shows it to be identical to the one built with a signed value for F_CPU.

None of this is a surprise now, with the understanding that the preprocessor treats all constants as either the signed or unsigned variant of the widest integer type.
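The preprocessor's arithmetic for BAUD_ERROR can be modelled in ordinary C by forcing every operand to intmax_t or uintmax_t. This is a sketch of that model only, not of what avr-gcc emits; the function names are hypothetical and not part of Optiboot, and the baud rate is fixed at the value selected for F_CPU >= 8000000:

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned arithmetic wraps around instead of going negative. */
static_assert((uintmax_t)0 - 1 > 0, "0 - 1 wraps to a large positive value");

/* F_CPU given as a signed constant: every operand acts as intmax_t. */
intmax_t baud_error_signed(intmax_t f_cpu)
{
    const intmax_t baud_rate = 115200;
    intmax_t setting = (f_cpu + baud_rate * 4) / (baud_rate * 8) - 1;
    intmax_t actual  = f_cpu / (8 * (setting + 1));
    return (100 * (actual - baud_rate)) / baud_rate;
}

/* F_CPU given as an unsigned constant: unsignedness spreads to every
   subexpression that involves it, so the whole chain acts as uintmax_t. */
uintmax_t baud_error_unsigned(uintmax_t f_cpu)
{
    const uintmax_t baud_rate = 115200;
    uintmax_t setting = (f_cpu + baud_rate * 4) / (baud_rate * 8) - 1;
    uintmax_t actual  = f_cpu / (8 * (setting + 1));
    /* actual is below baud_rate, so this subtraction wraps. */
    return (100 * (actual - baud_rate)) / baud_rate;
}
```

For an 8 MHz clock the signed model gives -3, matching the warning in the first build, while the unsigned model gives a very large positive number, which is why the #if BAUD_ERROR >= 5 branch fires in the second build.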

The surprise is that this isn't explicitly mentioned in either the standard or the GCC docs (that I can find).

Yes, the preprocessor follows the C rules for evaluating operands exactly, but only in the case where both operands of a binary operator have the same rank. I cannot find any text in the standard which states that the preprocessor treats all constants, with or without an L or LL suffix, as though they were all LL before the usual arithmetic conversions specified in 6.3.1.8 are applied, nor can I find any mention of this behaviour in the GCC docs. The closest is the passage from the GCC docs quoted above stating that the preprocessor "carries out all calculations in the widest integer type known to the compiler".

This does not (should not) explicitly mean that the operands are treated as though they were specified with suffixes designating them as the widest integer type known to the compiler. Indeed, absent an explicit passage on the subject, my expectation would be that the operands would be subject to the same type conversion and integer promotion rules to which all operands are subject when evaluated by the compiler. This doesn't seem to be the case. The implication, based on the tests above, is that the application of the normal C integer promotion rules comes after the preprocessor promotes the operands to the widest (signed or unsigned) integer type known to the compiler.

If someone can show any explicit and relevant text on this subject, either from the standard or the GCC docs, I'm interested.


EDIT #3: note: I've copied the below paragraphs from the comments section into the post itself, since there were too many comments for it to be seen.

If someone can show any explicit and relevant text on this subject, either from the standard or the GCC docs, I'm interested.

Here's some text from 6.10.1:

  4. For the purposes of this token conversion and evaluation, all signed integer types and all unsigned integer types act as if they have the same representation as, respectively, the types intmax_t and uintmax_t defined in the header <stdint.h>.

That would seem to clinch it.
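Read this way, the rule can be mimicked in ordinary C by writing the constants with the INTMAX_C and UINTMAX_C macros from <stdint.h>. A minimal sketch, assuming a C11 compiler; the function names are made up for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* "-5 < 0u" as an #if directive sees it: the signed constant acts as
   intmax_t, the unsigned one as uintmax_t. Both have the same rank, so the
   signed operand is converted to uintmax_t before the comparison. */
static_assert(!(-INTMAX_C(5) < UINTMAX_C(0)),
              "the signed operand is converted to uintmax_t");

bool if_directive_view(void) { return -INTMAX_C(5) < UINTMAX_C(0); }

/* The value the left operand takes after that conversion. */
uintmax_t converted_left_operand(void) { return (uintmax_t)-INTMAX_C(5); }
```

The converted operand is UINTMAX_MAX - 4, which is why every mixed-sign comparison in the tests above came out "not less than".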

joeymorin asked Mar 16 '23 08:03


1 Answer

To quote the usual arithmetic conversion rules from the C11 standard, chapter §6.3.1.8 (emphasis mine):

Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.

That is what happens in your case.

In general, when an operation mixes a signed operand with an unsigned operand of the same or greater rank, the signed operand is converted to the unsigned type first, and then the operation takes place.

Sourav Ghosh answered Apr 06 '23 05:04