Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Are there any C99 compilers where with default settings -1>>1 != -1?

Many people frequently point out in discussions of the right-shift operator that the C standard explicitly states that the effect of right-shifting a negative number is implementation defined. I can understand the historical basis for that statement, given that C compilers have been used to generate code for a variety of platforms which do not use two's-complement arithmetic. All new-product development that I'm aware of, however, has centered around processors which have no inherent support for any kind of integer arithmetic other than two's-complement.

If code wishes to perform a floored signed integer division by a power of two, and it is only going to be run for current or future architectures, is there any realistic danger that any future compiler is going to interpret the right-shift operator as doing anything else? If there is a realistic possibility, is there any good way to provide for it without adversely affecting readability, performance, or both? Are there any other dependencies which would justify making an outright assumption of the operator's behavior (e.g. code will be useless on implementations that don't support function X, and implementations are unlikely to support X if they don't use sign-extended right shifts)?

Note: I ask under the C99 and C11 tags because I would expect that newer language features would be among the things which, if supported, would suggest that a platform is probably going to use a right-shift which is arithmetically equivalent to floored division, and would be interested in knowing of any C99 or C11 compilers which implement right-shift any other way.

like image 610
supercat Avatar asked Nov 20 '13 17:11

supercat


2 Answers

This is just one of many reasons why this is so, but consider the signal processing case:

1111 0001 >> 1
0000 1111 >> 1

In the form of shift right arithmetic (SRA) you refer to you would get the following:

1111 0001 >> 1 = 1111 1000
OR
-15 >> 1 = -8

0000 1111 >> 1 = 0000 0111
OR
15 >> 1 = 7

So what's the problem? Consider a digital signal with an amplitude of 15 "units". Dividing this signal by 2 should yield equivalent behavior regardless of sign. However, with a SRA as above, the positive signal of 15 would result in a signal with 7 amplitude, while a negative signal of 15 would result in a signal with 8 amplitude. This unevenness results in a DC bias in the output. For this reason, some DSP processors choose to implement a "round to 0" shift right arithmetic, or other methods altogether. Because the C99 standard is worded as it is, these processors can still be compliant.

On these processors, -1 >> 1 == 0

Related Wiki

like image 89
Sam Cristall Avatar answered Sep 28 '22 05:09

Sam Cristall


Theoretically, nowadays there are subtleties in compiler implementations that could abuse the so-called "undefined behavior", beyond what the backend cpu would do to the actual integers on registers (or "files" or memory locations or whatever):

  1. Cross compilers are usual stuff: the compiler may abuse implementation dependent specs when executing simple calculations itself. Consider the case where the target architecture implements this one way and the hosting the other. In your particular example, compile-time constants could end up as 1, even if whatever assembly output in the target architecture would otherwise gives 0 (I can't think of no such architecture). And then again, vice-versa. There would be no requirement (other than user-base complaining) for a compiler implementer to otherwise care.

  2. Consider CLANG and other compilers that generate intermediate abstract code. There's nothing preventing the type mechanics to optimize some operations up to the last bit at intermediate time on some code paths (i.e. when it can reduce code to constants, loop folding comes to mind), while leaving the assembly backend to resolve this at runtime in other paths. In other words, you could see mixed behavior. In this kind of abstract, there's no obligation from the implementer to obey any standards other than whatever the C language expects. Think the case where all integer math is done by arbitrary precision arithmetics libraries instead of direct mapping to the host cpu integers. The implementation may decide for whatever reason that this is undefined and would return 0. It can do so for any of the signed arithmetic undefined behavior, and there's lots of then in the ISO C standard, specially the wrapping and such.

  3. Consider the (theoretical) case where instead of emitting the full instruction to do the low-level op, the compiler hijacks a sub-operation. An example of this is the ARM with barrel shifter: an explicit instruction (i.e. add or whatever) could have a range and semantics, but the sub-operation could operate with slightly different limits. The compiler could exploit this up to limits where behavior could differ, for instance one case can set result flags and the other not. I can't think of a concrete case where it matters, but it's a possibility that some weird instruction can only deal with subsets of "otherwise normal behavior" and the compiler may assume it's a nice optimization since undefined behavior is supposed to really means undefined :-)

Apart from weird architectures where you would actually have weird behavior at runtime, these are some of the reasons I can think of why you can't assume anything beyond the undefined behavior.

Having said all that, however, we must also consider:

  1. You asked for a C99 compiler. Most weird architectures (i.e. embedded targets) don't have a C99 compiler.
  2. Most "large scale" compiler implementers deals with very large user code bases and generally speaking face support nightmares by over-optimizing minor subtleties. So they don't. Or they do it the way other players are doing.
  3. In the particular case of signed integer "undefined behavior", normally the complementary unsigned operation is a defined operation, i.e. I've seen code casting signed to unsigned only to do the op and then casting the result back.

I think the best straight answer I could give is "you might assume all this is irrelevant, but perhaps you shouldn't".

like image 27
Alexandre Pereira Nunes Avatar answered Sep 28 '22 05:09

Alexandre Pereira Nunes