
Is left and right shifting negative integers defined behavior?


I know that right-shifting a negative value of a signed type is implementation-defined, but what about a left shift? For example:

int i = -1;
i << 1;

Is this well-defined?

I don't think the standard says anything about a negative value of a signed type:

if E1 has a signed type and non-negative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.

It only clarifies that if the result isn't representable in a signed type, then the behavior is undefined.

asked Dec 07 '11 by user103214

People also ask

How does Left Shift and Right Shift work on negative numbers?

The left shift and right shift operators should not be used with negative numbers. The behaviour is undefined if the right operand is negative (for example, both 1 >> -1 and 1 << -1 are undefined) or if a negative value is left-shifted. If a number is shifted by an amount greater than or equal to the width of its type, the behaviour is also undefined.


What happens when you right shift a negative number?

For signed numbers, the sign bit is typically used to fill the vacated bit positions; in other words, if the number is positive, 0 is used, and if the number is negative, 1 is used. The result of a right shift of a signed negative number is implementation-defined.
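A small C sketch of that distinction (the -4 result assumes the common two's-complement, arithmetic-shift implementation and a 32-bit int; only the unsigned shift is guaranteed by the standard):

#include <stdio.h>

int main(void) {
    int neg = -8;
    /* Implementation-defined: most two's-complement compilers perform an
       arithmetic shift and print -4, but the standard doesn't promise it. */
    printf("%d\n", neg >> 1);
    /* Fully defined: logical shift of the unsigned value 0xFFFFFFF8
       (assuming 32-bit int), printing 2147483644. */
    printf("%u\n", (unsigned)neg >> 1);
    return 0;
}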

Can you shift by a negative number?

Do not shift an expression by a negative number of bits or by a number greater than or equal to the precision of the promoted left operand. The precision of an integer type is the number of bits it uses to represent values, excluding any sign and padding bits.


2 Answers

You're not reading that sentence correctly. The standard defines the behaviour if: the left operand has a signed type and a non-negative value and the result is representable (and, earlier in the same paragraph, defines it for unsigned types). In all other cases (note the use of the semicolon in that sentence), i.e., if any of these conditions doesn't hold, the behaviour is undefined.
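To make that case analysis concrete, here is a minimal sketch (assuming a 32-bit int) of which shifts the quoted paragraph does and does not define:

#include <stdio.h>

int main(void) {
    int a = 1 << 3;                   /* defined: non-negative value, 8 fits in int */
    unsigned u = 0xFFFFFFFFu << 1;    /* defined: unsigned shifts wrap modulo 2^32 */
    /* int b = -1 << 1;    undefined: left operand is negative              */
    /* int c = 1 << 31;    undefined on 32-bit int: 2^31 not representable  */
    printf("%d %u\n", a, u);
    return 0;
}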

answered Jan 06 '23 by R. Martinho Fernandes


When the C standards were codified, different platforms did different things when left-shifting negative integers. On some of them, left-shifting a negative integer could trigger implementation-specific traps whose behavior was outside the program's control and could include random code execution. Nonetheless, programs written for such platforms might deliberately make use of that behavior (a program could, for example, require the user to configure the system's trap handlers before running it, and then exploit the behavior of the suitably configured handlers).

The authors of the C standard did not want to say that compilers for machines where left-shifting a negative number would trap must be modified to prevent such trapping (since programs might be relying upon it). But if left-shifting a negative number is allowed to trigger a trap which could cause any arbitrary behavior, including random code execution, then left-shifting a negative number is allowed to do anything whatsoever. Hence Undefined Behavior.

In practice, until about five years ago, well over 99% of compilers written for machines using two's-complement math (meaning well over 99% of machines made since 1990) would consistently yield the following behaviors for x << y and x >> y, to the extent that code relying upon such behavior was considered no more non-portable than code assuming char was 8 bits. The C standard didn't mandate such behavior, but any compiler author wanting to be compatible with a wide base of existing code would follow it.

  • if y has a signed type, x << y and x >> y are evaluated as though y were cast to unsigned.
  • if x is of type int, x << y is equivalent to (int)((unsigned)x << y).
  • if x is of type int and positive, x >> y is equivalent to (unsigned)x >> y. If x is of type int and negative, x >> y is equivalent to ~(~((unsigned)x) >> y) (see the sketch after this list).
  • If x is of type long, similar rules apply, but with unsigned long rather than unsigned.
  • if x is an N-bit type and y is greater than N-1, then x >> y and x << y may arbitrarily yield zero, or may act as though the right-hand operand was y % N; they may require extra time proportional to y [note that on a 32-bit machine, if y is negative, that could potentially be a long time, though I only know of one machine which would in practice run more than 256 extra steps]. Compilers were not necessarily consistent in their choice, but would always return one of the indicated values with no other side-effects.
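As a sketch of the classic negative right-shift identity from the list above (the helper name asr is mine; the code assumes a two's-complement implementation, and the standard itself does not guarantee this behavior):

/* Sketch: the classic arithmetic right shift of a possibly-negative int,
   expressed via unsigned arithmetic as in the bullets above.
   Requires 0 <= y < width of int; assumes two's complement. */
int asr(int x, unsigned y) {
    if (x >= 0)
        return (int)((unsigned)x >> y);   /* non-negative: plain logical shift */
    /* Negative: complement, shift zeros in, complement back, which shifts
       copies of the sign bit in. The final conversion back to int is
       implementation-defined but yields the expected value on
       two's-complement machines. */
    return (int)~(~(unsigned)x >> y);
}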

Unfortunately, for some reason I can't quite fathom, compiler writers have decided that rather than allowing programmers to indicate what assumptions compilers should use for dead-code removal, compilers should assume that it is impossible to execute any shift whose behavior isn't mandated by the C standard. Thus, given code like the following:

uint32_t shiftleft(uint32_t v, uint8_t n) {
    if (n >= 32)
        v = 0;
    return v << n;
}

a compiler may determine that, because the code would engage in Undefined Behavior when n is 32 or larger (the shift at the end executes regardless of the if), the if condition can never be true, and may thus omit the check entirely. Consequently, unless or until someone comes up with a standard for C which restores the classic behaviors and lets programmers designate which assumptions merit dead-code removal, such constructs cannot be recommended for any code that might be fed to a hyper-modern compiler.
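One way to keep such a guard meaningful (a sketch, not the only option) is to return before the shift, so that every shift actually executed is fully defined and the compiler cannot assume n < 32:

#include <stdint.h>

/* Sketch: every path through this function has defined behaviour,
   so the compiler cannot optimize the n >= 32 check away. */
uint32_t shiftleft_checked(uint32_t v, uint8_t n) {
    if (n >= 32)
        return 0;      /* handle the out-of-range case before shifting */
    return v << n;     /* here n is provably in [0, 31] */
}

Here the shift is only reached when n is in range, so there is no Undefined Behavior for the compiler to reason backward from.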

answered Jan 06 '23 by supercat