NOTE: See my edits below.
ORIGINAL QUESTION:
Came across some curious behaviour which I cannot reconcile:
#if -5 < 0
#warning Good, -5 is less than 0.
#else
#error BAD, -5 is NOT less than 0.
#endif
#if -(5u) < 0
#warning Good, -(5u) is less than 0.
#else
#error BAD, -(5u) is NOT less than 0.
#endif
#if -5 < 0u
#warning Good, -5 is less than 0u.
#else
#error BAD, -5 is NOT less than 0u.
#endif
When compiled:
$ gcc -Wall -o pp_test.elf pp_test.c
pp_test.c:2:6: warning: #warning Good, -5 is less than 0.
pp_test.c:10:6: error: #error BAD, -(5u) is NOT less than 0.
pp_test.c:13:9: **warning: the left operand of "<" changes sign when promoted**
pp_test.c:16:6: error: #error BAD, -5 is NOT less than 0u.
This suggests that the preprocessor follows different type conversion rules when evaluating constant integer expressions: namely, when an operator has operands of mixed signedness, the signed operand is converted to unsigned. In C proper, the direction of the conversion (generally) depends on the ranks of the two operands.
I can find nothing in the literature to support this, but it's possible (likely?) that I haven't been thorough enough. Have I missed something? Is this behaviour correct?
As it stands, it seems as though any conditional expression in an #if or #elif directive which involves an explicitly unsigned integer constant may fail to behave as expected, i.e. as it would in C.
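In hindsight, a minimal runtime check of mine (not part of the original test) shows that for this simple unsuffixed case the compiler actually agrees with the preprocessor, since int and unsigned int share the same rank:

#include <stdio.h>

int main(void) {
    /* int and unsigned int have the same rank, so -5 is converted to
     * unsigned int and wraps to a huge value: prints 0. */
    printf("-5 < 0u evaluates to %d\n", -5 < 0u);
    return 0;
}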
EDIT: As per my comments in Sourav Ghosh's answer, my confusion originally stemmed from expressions which included constants specified with L and LL suffixes. The example code I included in my original question was too simplified. Here is a better example:
#if -5LL < 0L
#warning Good, -5LL is less than 0L.
#else
#error BAD, -5LL is NOT less than 0L.
#endif
#if -(5uLL) < 0L
#warning Good, -(5uLL) is less than 0L.
#else
#error BAD, -(5uLL) is NOT less than 0L.
#endif
#if -5LL < 0uL
#warning Good, -5LL is less than 0uL.
#else
#error BAD, -5LL is NOT less than 0uL.
#endif
Building:
$ gcc -Wall -o pp_test.elf pp_test.c
pp_test.c:2:6: warning: #warning Good, -5LL is less than 0L.
pp_test.c:10:6: error: #error BAD, -(5uLL) is NOT less than 0L.
pp_test.c:13:9: warning: the left operand of "<" changes sign when promoted
pp_test.c:16:6: error: #error BAD, -5LL is NOT less than 0uL.
This seems to violate the clause in 6.3.1.8 subsequent to the one posted by Sourav Ghosh (my emphasis):
Otherwise, if the type of the operand with signed integer type **can represent all of the values of the type of the operand with unsigned integer type**, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
It seems to violate this clause because -5LL has a rank higher than that of 0uL, and because the type of the first (signed long long) can indeed represent all of the values of the type of the second (unsigned long). The catch is, the preprocessor doesn't know this.
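To see the divergence, let the compiler evaluate the same expression at run time. This is a minimal sketch of mine; it assumes a target where long is 32 bits and long long is 64 bits (on a target with 64-bit long, both types have the same width and the result is 0 there as well):

#include <stdio.h>

int main(void) {
    /* long long outranks unsigned long, and (with 32-bit long) it can
     * represent every unsigned long value, so 0uL is converted to
     * long long and the comparison stays signed: prints 1. */
    printf("-5LL < 0uL evaluates to %d\n", -5LL < 0uL);
    return 0;
}

The preprocessor, as shown above, evaluates the identical expression to false.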
As mentioned in https://gcc.gnu.org/onlinedocs/gcc-3.0.2/cpp_4.html (my emphasis):
The preprocessor calculates the value of expression. It **carries out all calculations in the widest integer type known to the compiler**; on most machines supported by GCC this is 64 bits. This is not the same rule as the compiler uses to calculate the value of a constant expression, and may give different results in some cases. If the value comes out to be nonzero, the `#if' succeeds and the controlled text is included; otherwise it is skipped.
What seems to be implied by "carries out all calculations in the widest integer type known to the compiler" is that the operands themselves are treated as though they were specified as that same 'widest' type. In other words, -5 and -5L are treated as though they were -5LL, and 0u and 0uL are treated as though they were 0uLL. This activates the clause quoted by Sourav Ghosh, and leads to the observed behaviour.
In effect, there is only one rank as far as the preprocessor is concerned, so type conversion rules which depend upon operands of different rank are ignored. Is this not indeed different from how the compiler evaluates expressions?
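One way to model what the preprocessor appears to be doing is to cast every operand to the maximal types up front. This is a sketch of the model, not of cpp's internals:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* Every signed operand becomes intmax_t and every unsigned operand
     * becomes uintmax_t; the ranks are now equal, so the signed side is
     * converted to unsigned and -5 wraps to UINTMAX_MAX - 4: prints 0. */
    printf("%d\n", (intmax_t)-5LL < (uintmax_t)0uL);
    return 0;
}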
EDIT #2: Here's a real-world example of how the same expression is evaluated differently by the preprocessor than it is by the compiler (taken from Optiboot).
#ifndef BAUD_RATE
#if F_CPU >= 8000000L
#define BAUD_RATE 115200L
#elif F_CPU >= 1000000L
#define BAUD_RATE 9600L
#elif F_CPU >= 128000L
#define BAUD_RATE 4800L
#else
#define BAUD_RATE 1200L
#endif
#endif
#ifndef UART
#define UART 0
#endif
#define BAUD_SETTING (( (F_CPU + BAUD_RATE * 4L) / ((BAUD_RATE * 8L))) - 1 )
#define BAUD_ACTUAL (F_CPU/(8 * ((BAUD_SETTING)+1)))
#define BAUD_ERROR (( 100*(BAUD_ACTUAL - BAUD_RATE) ) / BAUD_RATE)
#if BAUD_ERROR >= 5
#error BAUD_RATE error greater than 5%
#elif (BAUD_ERROR + 5) <= 0
#error BAUD_RATE error greater than -5%
#elif BAUD_ERROR >= 2
#warning BAUD_RATE error greater than 2%
#elif (BAUD_ERROR + 2) <= 0
#warning BAUD_RATE error greater than -2%
#endif
volatile long long int baud_setting = BAUD_SETTING;
volatile long long int baud_actual = BAUD_ACTUAL;
volatile long long int baud_error = BAUD_ERROR;
void foo(void) {
baud_setting = BAUD_SETTING;
baud_actual = BAUD_ACTUAL;
baud_error = BAUD_ERROR;
}
Building for an AVR target:
$ avr-gcc -Wall -c -g -save-temps -o optiboot_pp_test.elf -DF_CPU=8000000L optiboot_pp_test.c
Note how F_CPU was specified as a signed constant.
optiboot_pp_test.c:28:6: warning: #warning BAUD_RATE error greater than -2% [-Wcpp]
#warning BAUD_RATE error greater than -2%
This works as expected. Examining the object file:
baud_setting = BAUD_SETTING;
8: 88 e0 ldi r24, 0x08 ; 8
a: 90 e0 ldi r25, 0x00 ; 0
c: a0 e0 ldi r26, 0x00 ; 0
e: b0 e0 ldi r27, 0x00 ; 0
10: 80 93 00 00 sts 0x0000, r24
14: 90 93 00 00 sts 0x0000, r25
18: a0 93 00 00 sts 0x0000, r26
1c: b0 93 00 00 sts 0x0000, r27
baud_actual = BAUD_ACTUAL;
20: 87 e0 ldi r24, 0x07 ; 7
22: 92 eb ldi r25, 0xB2 ; 178
24: a1 e0 ldi r26, 0x01 ; 1
26: b0 e0 ldi r27, 0x00 ; 0
28: 80 93 00 00 sts 0x0000, r24
2c: 90 93 00 00 sts 0x0000, r25
30: a0 93 00 00 sts 0x0000, r26
34: b0 93 00 00 sts 0x0000, r27
baud_error = BAUD_ERROR;
38: 8d ef ldi r24, 0xFD ; 253
3a: 9f ef ldi r25, 0xFF ; 255
3c: af ef ldi r26, 0xFF ; 255
3e: bf ef ldi r27, 0xFF ; 255
40: 80 93 00 00 sts 0x0000, r24
44: 90 93 00 00 sts 0x0000, r25
48: a0 93 00 00 sts 0x0000, r26
4c: b0 93 00 00 sts 0x0000, r27
... shows that the expected values are assigned. Namely, baud_setting gets 8, baud_actual gets 111111, and baud_error gets -3.
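For reference, the arithmetic behind those values (all divisions truncate toward zero):

BAUD_SETTING = (8000000 + 115200*4) / (115200*8) - 1 = 8460800/921600 - 1 = 9 - 1 = 8
BAUD_ACTUAL  = 8000000 / (8 * (8+1))                 = 8000000/72          = 111111
BAUD_ERROR   = 100 * (111111 - 115200) / 115200      = -408900/115200      = -3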
Now we build with F_CPU defined as an unsigned constant (as is customary on this target):
$ avr-gcc -Wall -c -g -save-temps -o optiboot_pp_test.elf -DF_CPU=8000000UL optiboot_pp_test.c
optiboot_pp_test.c:22:6: error: #error BAUD_RATE error greater than 5%
#error BAUD_RATE error greater than 5%
The reported error is of the wrong magnitude, and the wrong sign.
Examination of the object file shows it to be identical to the one built with a signed value for F_CPU.
None of this is a surprise now, with the understanding that the preprocessor treats all constants as either the signed or unsigned variant of the widest integer type.
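Here is a sketch of mine that models the preprocessor's arithmetic in ordinary C, using the intmax_t/uintmax_t types it effectively works in; the single unsigned constant drags every intermediate result into uintmax_t:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uintmax_t f_cpu     = 8000000u;  /* the UL-suffixed -DF_CPU value */
    uintmax_t baud_rate = 115200;    /* converted on first mixed use  */
    uintmax_t setting   = (f_cpu + baud_rate * 4) / (baud_rate * 8) - 1; /* 8 */
    uintmax_t actual    = f_cpu / (8 * (setting + 1));              /* 111111 */
    /* 111111 - 115200 cannot go negative: it wraps to UINTMAX_MAX - 4088,
     * so the "error" becomes an enormous positive value and the
     * BAUD_ERROR >= 5 branch fires. */
    uintmax_t error     = (100 * (actual - baud_rate)) / baud_rate;
    printf("%ju\n", error);
    return 0;
}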
The surprise is that this isn't explicitly mentioned in either the standard or the GCC docs (that I can find).
Yes, the C rules for evaluating operands are followed exactly by the preprocessor, but only for the case where both operands of a binary operator have the same rank. I cannot find any text in the standard which states that the preprocessor treats all constants, specified with or without L or LL suffixes, as though they were all LL before the conversion rules specified in 6.3.1.8 are applied, nor can I find any mention of this behaviour in the GCC docs. The closest is the passage from the GCC docs quoted above stating that the preprocessor "carries out all calculations in the widest integer type known to the compiler".
This does not (should not) explicitly mean that the operands are treated as though they were specified with suffixes designating them as the widest integer type known to the compiler. Indeed, absent an explicit passage on the subject, my expectation would be that the operands would be subject to the same type conversion and integer promotion rules to which all operands are subject when evaluated by the compiler. This doesn't seem to be the case. The implication, based on the tests above, is that the application of the normal C integer promotion rules comes after the preprocessor promotes the operands to the widest (signed or unsigned) integer type known to the compiler.
If someone can show any explicit and relevant text on this subject, either from the standard or the GCC docs, I'm interested.
EDIT #3: I've copied the paragraphs below from the comments section into the post itself, since there were too many comments for the exchange to be easily seen.
If someone can show any explicit and relevant text on this subject, either from the standard or the GCC docs, I'm interested.
Here's some text from 6.10.1:
- For the purposes of this token conversion and evaluation, all signed integer types and all unsigned integer types act as if they have the same representation as, respectively, the types intmax_t and uintmax_t defined in the header <stdint.h>.
That would seem to clinch it.
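As a quick sanity check of that clause, the suffix width really is irrelevant inside #if; with every unsigned constant acting as uintmax_t, -1 wraps and compares greater than zero at every width:

/* All three comparisons are true in the preprocessor. */
#if (-1 > 0u) && (-1L > 0uL) && (-1LL > 0uLL)
#warning Confirmed: in #if, -1 converts to uintmax_t next to any unsigned constant.
#endif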
To quote the usual arithmetic conversion rule (emphasis mine) from the C11 standard, chapter §6.3.1.8:
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then **the operand with signed integer type is converted to the type of the operand with unsigned integer type**.
This is exactly your case. In general, when an operation mixes signed and unsigned operands and the unsigned type's rank is at least that of the signed type, the signed operand is converted to the unsigned type before the operation takes place. Inside #if, where there is effectively only one rank, that is always the situation.
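A minimal runtime illustration of that rule (my own), with two operands of equal rank:

#include <stdio.h>

int main(void) {
    int s = -5;
    unsigned int u = 0;
    /* Equal rank: s is converted to unsigned int and wraps to
     * UINT_MAX - 4, so the comparison is false: prints 0. */
    printf("%d\n", s < u);
    return 0;
}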