
Can undefined behavior which would follow a getc() alter program behavior if the getc() exits via SIGINT?

Under modern interpretations of "Undefined Behavior", a compiler is entitled to assume that no chain of events which would make Undefined Behavior "inevitable" will occur, and can eliminate code which would only be relevant in cases where the program is going to perform Undefined Behavior. This may cause the effects of Undefined Behavior to work backwards in time and nullify behaviors that would otherwise have been observable. On the other hand, in cases where Undefined Behavior would be inevitable unless a program terminates, but where the program could and does terminate before invoking Undefined Behavior, the behavior of the program would remain fully defined.

In making this determination, what causes of termination is a compiler required to consider? As a couple of examples:

On many platforms, a call to a function like "getc" will normally return (at least eventually), but in some cases outside the control of the compiler it will not. If one had a program like:

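#include <limits.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
  if (argc != 3)
  {
    printf("Foo\n");
    return 0;
  }
  else
  {
    printf("You'd better type control-C!\n");
    int ch = getc(stdin);
    if (ch < 65)
      return (ch-33) / (argc-3); // argc is 3 here, so this divides by zero
    else
      return INT_MAX + ch;       // ch is at least 65, so this overflows
  }
}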

would behavior be defined if the program were called with argc equal to three, but a SIGINT prevented the getc() call from returning at all? Certainly if there were any value that getc() could return which would result in defined behavior, no Undefined Behavior could occur until the compiler could be certain that such input would not be received. If there is no value getc() could return which would avoid Undefined Behavior, however, would the behavior of the overall program remain defined if getc() were prevented from ever returning any value? Would the existence of a causal relationship between the return value of getc() and the actions invoking Undefined Behavior affect things? (In the example above, a compiler could not know which particular form of Undefined Behavior would occur without knowing what character was input, but any possible input would trigger some form.)
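The claim that every possible input triggers some form of Undefined Behavior when argc is three can be checked by case analysis. The following is only a sketch: the names ub_kind and ub_for_input are hypothetical, and the function merely reports which undefined operation the program above would reach for a given getc() result, without performing either one.

```c
#include <stdio.h> // for EOF

// Hypothetical names, used for illustration only.
enum ub_kind { UB_DIVIDE_BY_ZERO, UB_SIGNED_OVERFLOW };

// Reports which undefined operation main() above would reach for a given
// getc() result when argc == 3. It performs neither operation itself.
enum ub_kind ub_for_input(int ch)
{
  if (ch < 65)
    return UB_DIVIDE_BY_ZERO;  // (ch-33) / (argc-3) divides by 3-3 == 0
  else
    return UB_SIGNED_OVERFLOW; // INT_MAX + ch with ch >= 65 exceeds INT_MAX
}
```

Note that EOF is negative, so a failed read falls into the first branch as well; there is no return value of getc() that escapes both cases.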

Likewise, suppose a platform has addresses which, if read, are specified to cause a program to terminate immediately; a compiler specifies that volatile reads will trigger hardware read requests; and some external library on that platform specifies that it will return a pointer to such an address. Would those factors imply that the behavior of bar in this separate example:

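#include <stdio.h>

// Supplied by the external library; the exact return type is assumed here
char volatile *get_instant_quit_address(void);

int foo(int x)
{
  char volatile * p = get_instant_quit_address();
  if (x)
    { printf("Hey"); fflush(stdout); }
  return *p / x; // Will cause UB if *p yields a value and x is zero
}
int bar(void)
{
  return foo(0);
}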

would be defined (as terminating without having printed anything) if attempting to read *p would in fact immediately terminate program execution without yielding a value? The division cannot proceed until a value is returned; thus, if no value is returned, there would be no divide by zero.

By what means is a C compiler allowed to determine whether a given action might cause program execution to terminate in ways that it doesn't know about, and in what cases is it allowed to reschedule Undefined Behavior ahead of such actions?

supercat asked Oct 19 '22 13:10


1 Answer

This is well described in C++ under [intro.execution]:

5 - A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input. However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation).

It is generally accepted that C has the same characteristics, so a C compiler can similarly perform "time travel" in the presence of undefined behavior.

Importantly, note that the question is whether there exists an instance of the abstract machine exhibiting undefined behavior; it doesn't matter that you can arrange to prevent undefined behavior on your machine by terminating program execution first.

You can prevent undefined behavior (and the resulting time travel) if you cause the program to terminate itself in a fully-defined way which the abstract machine cannot wriggle out of. For example, in your second example, if you replace the access to *p with (exit(0), 0), then undefined behavior cannot occur, as there is no possible execution of the abstract machine where exit returns to its caller. But whatever the characteristics of your platform, the abstract machine does not have to terminate your program on access to an insta-kill address (indeed, the abstract machine does not have any insta-kill addresses).
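As a rough sketch of that replacement, here is foo from the question with the volatile read swapped for (exit(0), 0). The helper exit_status_of_bar is a hypothetical, POSIX-only harness (it assumes fork and waitpid are available) added only so the outcome can be observed from the same program; it is not part of the original example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// foo from the question, with the read of *p replaced by (exit(0), 0).
int foo(int x)
{
  if (x)
    { printf("Hey"); fflush(stdout); }
  return (exit(0), 0) / x; // exit never returns, so the division is never evaluated
}

int bar(void)
{
  return foo(0);
}

// Hypothetical harness (POSIX assumed): runs bar() in a child process and
// returns the child's exit status, or -1 if it did not exit normally.
int exit_status_of_bar(void)
{
  pid_t pid = fork();
  if (pid < 0)
    return -1;
  if (pid == 0)
  {
    bar();
    _exit(1); // reached only if bar() returned
  }
  int status;
  if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
    return -1;
  return WEXITSTATUS(status);
}
```

Under those assumptions, exit_status_of_bar() should report a status of 0 with nothing printed, since the left operand of the division calls exit before any value exists to divide.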

ecatmur answered Oct 22 '22 06:10