
What are the common undefined/unspecified behavior for C that you run into? [closed]

An example of unspecified behavior in the C language is the order of evaluation of arguments to a function. It might be left to right or right to left; you just don't know. This would affect how foo(c++, c) or foo(++c, c) gets evaluated.
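A minimal sketch of how the argument order can be observed, using function calls with side effects rather than c++ directly. The helper names (next_id, pair, probe_argument_order) are made up for illustration. Note that foo(c++, c) itself goes a step further than unspecified: it modifies c and reads it again without an intervening sequence point, which the standard classes as undefined behaviour, so the sketch avoids that form.

```c
/* Hypothetical counter shared by the helpers below. */
static int counter;

/* Hypothetical helper: returns 1, 2, 3, ... on successive calls. */
static int next_id(void)
{
    return ++counter;
}

/* Hypothetical helper: packs both arguments into one number so the caller
   can see which call to next_id() ran first. */
static int pair(int first, int second)
{
    return first * 10 + second;
}

/* Returns 12 if the arguments were evaluated left to right and 21 if they
   were evaluated right to left. Either result is conforming. */
int probe_argument_order(void)
{
    counter = 0;
    return pair(next_id(), next_id());
}
```

Both calls are complete function calls, so they cannot interleave; only their relative order is left open. Portable code should not depend on which of the two values comes back.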

What other unspecified behavior is there that can surprise the unaware programmer?

Benoit asked Sep 19 '08 00:09



2 Answers

A language lawyer question. Hmkay.

My personal top 3:

  1. violating the strict aliasing rule

  2. violating the strict aliasing rule

  3. violating the strict aliasing rule

    :-)

Edit: Here is a little example that does it wrong twice:

(assume 32-bit ints and little endian)

    float funky_float_abs (float a) {
      unsigned int temp = *(unsigned int *)&a;
      temp &= 0x7fffffff;
      return *(float *)&temp;
    }

That code tries to get the absolute value of a float by bit-twiddling with the sign bit directly in the representation of a float.

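However, accessing an object through a pointer that was created by casting from one type to another, incompatible type is not valid C. The compiler may assume that pointers to different types don't point to the same chunk of memory. This is true for all kinds of pointers except pointers to character types (char, signed char, unsigned char), which may alias anything. Types that differ only in sign-ness may also alias each other, and a void* converts freely but has to be cast back before it can be dereferenced.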

In the case above I do that twice. Once to get an int-alias for the float a, and once to convert the value back to float.

There are three valid ways to do the same.

Use a char pointer for the access. Pointers to character types may alias anything, so they are safe.

    float funky_float_abs (float a) {
      float temp_float = a;
      // valid, because it's a char pointer. These are special.
      unsigned char * temp = (unsigned char *)&temp_float;
      temp[3] &= 0x7f;
      return temp_float;
    }

Use memcpy. memcpy takes void pointers and copies the bytes, so no object is ever accessed through a pointer of the wrong type.

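    #include <string.h>   // memcpy

    float funky_float_abs (float a) {
      int i;
      float result;
      memcpy (&i, &a, sizeof (int));       // copy the float's bytes into the int
      i &= 0x7fffffff;                     // clear the sign bit
      memcpy (&result, &i, sizeof (int));  // copy the bytes back into a float
      return result;
    }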

The third valid way: use unions. This is explicitly not undefined since C99:

    float funky_float_abs (float a) {
      union {
        unsigned int i;
        float f;
      } cast_helper;

      cast_helper.f = a;
      cast_helper.i &= 0x7fffffff;
      return cast_helper.f;
    }
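The examples above all lean on the "32 bit ints" assumption. As a rough sketch, the memcpy variant can spell that assumption out by using uint32_t and checking the size at compile time. The function name float_abs_bits is made up for illustration, and the code still assumes that float is an IEEE 754 single whose sign lives in the most significant bit, which the standard does not promise.

```c
#include <stdint.h>   /* uint32_t, UINT32_C */
#include <string.h>   /* memcpy */

/* Assumption: float is 32 bits wide. Checked at compile time (C11). */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Clears the sign bit of a float by copying its representation in and out
   of an integer. Assumes an IEEE 754 single with the sign in the top bit. */
float float_abs_bits(float a)
{
    uint32_t bits;
    memcpy(&bits, &a, sizeof bits);   /* copy the representation out */
    bits &= UINT32_C(0x7fffffff);     /* clear the sign bit */
    memcpy(&a, &bits, sizeof a);      /* copy it back */
    return a;
}
```

With those assumptions in place, float_abs_bits(-1.5f) yields 1.5f. Unlike the char-pointer variant, this one does not need to know which byte holds the sign, as long as floats and integers share the same byte order.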
Nils Pipenbrinck answered Oct 01 '22 09:10


My personal favourite undefined behaviour is that if a non-empty source file doesn't end in a newline, behaviour is undefined.

I suspect it's true though that no compiler I will ever see has treated a source file differently according to whether or not it is newline terminated, other than to emit a warning. So it's not really something that will surprise unaware programmers, other than that they might be surprised by the warning.

So for genuine portability issues (which mostly are implementation-dependent rather than unspecified or undefined, but I think that falls into the spirit of the question):

  • plain char is not necessarily signed, and not necessarily unsigned either: the implementation chooses.
  • int can be any size from 16 bits.
  • floats are not necessarily IEEE-formatted or conformant.
  • integer types are not necessarily two's complement, and signed integer arithmetic overflow causes undefined behaviour. Modern hardware won't crash, but some compiler optimizations will produce behaviour different from wraparound even though wraparound is what the hardware does. For example, if (x+1 < x) may be optimized to always false when x has a signed type: see the -fstrict-overflow option in GCC.
  • "/", "." and ".." in a #include have no defined meaning and can be treated differently by different compilers (this does actually vary, and if it goes wrong it will ruin your day).
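For the overflow item in the list above, a minimal sketch of how the check can be written so that the overflow never actually happens. The function names are made up for illustration; the idea is simply to compare against INT_MAX and INT_MIN before doing the arithmetic instead of inspecting the result afterwards.

```c
#include <limits.h>   /* INT_MAX, INT_MIN */

/* The broken idiom, shown only as a comment because evaluating it with
   x == INT_MAX is undefined:
       if (x + 1 < x) { ... }
*/

/* Returns 1 if x + 1 would overflow, without performing the addition. */
int increment_would_overflow(int x)
{
    return x == INT_MAX;
}

/* Stores a + b in *sum and returns 1 if the sum is representable as an int;
   otherwise returns 0 and leaves *sum untouched. */
int checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *sum = a + b;
    return 1;
}
```

The subtractions in the guard cannot overflow themselves: INT_MAX - b is only evaluated for positive b, and INT_MIN - b only for negative b.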

Really serious ones that can be surprising even on the platform you developed on, because behaviour is only partially undefined / unspecified:

  • POSIX threading and the ANSI memory model. Concurrent access to memory is not as well defined as novices think. volatile doesn't do what novices think. Order of memory accesses is not as well defined as novices think. Accesses can be moved across memory barriers in certain directions. Memory cache coherency is not required.

  • Profiling code is not as easy as you think. If your test loop has no effect, the compiler can remove part or all of it. inline has no defined effect.
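On the profiling point, one common workaround is to make the result of the test loop observable, for instance by storing it to a volatile object. The sketch below is only an illustration under that assumption; benchmark_sink, sum_below and run_benchmark are hypothetical names, and timing code is left out.

```c
/* Writes to a volatile object are observable behaviour, so the compiler has
   to perform each store and therefore has to produce the stored value. */
static volatile unsigned long benchmark_sink;

/* Hypothetical workload: sums the integers 0 .. n-1. */
unsigned long sum_below(unsigned long n)
{
    unsigned long total = 0;
    for (unsigned long i = 0; i < n; ++i)
        total += i;
    return total;
}

/* Runs the workload `rounds` times and publishes every result through the
   volatile sink. Returns the last result so callers can sanity-check it. */
unsigned long run_benchmark(unsigned long n, unsigned rounds)
{
    unsigned long last = 0;
    for (unsigned r = 0; r < rounds; ++r) {
        last = sum_below(n);
        benchmark_sink = last;   /* volatile store: must not be elided */
    }
    return last;
}
```

This only guarantees that the stores happen. The compiler is still free to compute the sum in closed form, or to compute it once and store the same value on every round, so it is worth reading the generated assembly before trusting any timing built on top of this.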

And, as I think Nils mentioned in passing:

  • VIOLATING THE STRICT ALIASING RULE.
Steve Jessop answered Oct 01 '22 08:10