
Difference between Undefined Behavior and Ill-formed, no diagnostic message required

The C++ standard comes with a stunning number of definitions for unclear¹ behavior that mean more or less the same thing, with subtle differences. Reading this answer, I noticed the wording "the program is ill-formed; no diagnostic required".

Implementation-defined behavior differs from unspecified behavior in that the implementation must clearly document what it does in the former case (in the latter case, it needn't); both are well-formed. Undefined behavior differs from unspecified behavior in that the program is erroneous (1.3.13).
Otherwise, they all have in common that the standard makes no assumptions or requirements about what the implementation will do. The exception is 1.4/8, which states that implementations may have extensions that do not alter the behavior of well-formed programs but are ill-formed according to the standard; the implementation must diagnose use of these, but can afterwards continue compiling and executing the ill-formed program.
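For concreteness, here is a minimal sketch of the three categories compared above. The function names (`plain_char_is_signed`, `record`, `add`, `argument_evaluation_order`) are made up for illustration and do not come from the standard. The undefined case is left as a comment, since running it would demonstrate nothing reliable.

```cpp
#include <cassert>
#include <limits>
#include <string>

// Implementation-defined: whether plain char can hold negative values.
// The result may differ between compilers, but each one must document it.
bool plain_char_is_signed() {
    return std::numeric_limits<char>::is_signed;
}

// Hypothetical helper: appends a tag so the call order becomes visible.
int record(std::string& log, char tag) {
    log += tag;
    return 0;
}

int add(int a, int b) { return a + b; }

// Unspecified: the two arguments may be evaluated in either order,
// and the implementation does not have to document which one it picks.
std::string argument_evaluation_order() {
    std::string log;
    add(record(log, 'a'), record(log, 'b'));
    assert(log.size() == 2);  // both calls happened, in some order
    return log;               // either "ab" or "ba"
}

// Undefined: the program is erroneous, yet it typically compiles silently.
//   int next(int x) { return x + 1; }   // UB when x == INT_MAX
```

Calling `argument_evaluation_order()` yields one of two permitted results; a well-formed program simply must not depend on which one it gets.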

An ill-formed program is otherwise only defined as being not well-formed (great!). A well-formed program, on the other hand, is defined as one that adheres to the syntax and diagnosable semantic rules. Consequently, an ill-formed program is one that breaks either the syntax or the semantic rules (or both). In other words, an ill-formed program actually shouldn't compile at all (how would one translate, e.g., a program with wrong syntax in any meaningful way?).

I would be inclined to think that the word erroneous also implies that the compiler should abort the build with an error message (after all, erroneous suggests there's an error), but the "Note" section in 1.3.13 explicitly allows for something different, including silently ignoring the problem (and compilers demonstrably do not break the build because of UB; most do not even warn by default).

One might further believe that erroneous and ill-formed are the same, but the standard doesn't say whether that is the case or what the word is supposed to mean.

Further, 1.4 states that

a conforming implementation shall [...] accept and correctly execute a well-formed program

and

If a program contains a violation of a rule for which no diagnostic is required, [...] no requirement on implementations with respect to that program.

In other words, a conforming implementation must accept a well-formed program, but it may just as well accept an ill-formed one, even without a warning, unless the program is ill-formed because it uses an extension.

The second paragraph suggests that anything in conjunction with "no diagnostic required" means there are no requirements from the specification, which means it is mostly equivalent to "undefined behavior", except there is no mention of erroneous.

What would therefore be the intention behind using a wording such as "ill-formed; no diagnostic required"?

The presence of "no diagnostics" would suggest that it is identical (or mostly identical?) to undefined behavior. Also, since implementation-defined and unspecified behavior are defined as well-formed, it must be something different.

On the other hand, since an ill-formed program breaks the syntax/semantic rules, it actually should not compile. In conjunction with "no diagnostic required", however, that would mean that a compiler is permitted to silently exit without so much as a warning, and you would be unable to find an executable afterwards.

Is there a difference between "ill-formed; no diagnostic required" and "undefined behavior", or is this simply a complicated synonym for the same thing?


¹ For lack of a better term for these behaviors collectively
Damon asked Mar 04 '14 18:03


1 Answer

The standard is not always as coherent as we would like: it is a very large document, written (in practice) by a number of different people, and despite all of the proofreading that does occur, inconsistencies slip through. In the case of undefined behavior (and errors in general), I think there is an additional problem: for many of the most basic things (pointers, etc.), the C++ standard draws on C. But the C standard takes the point of view that all errors are undefined behavior unless stated otherwise, whereas the C++ standard tries to take the point of view that all errors require a diagnostic unless stated otherwise. (Although they still have to allow for the case where the standard omits to specify a behavior.) I think this accounts for a lot of the inconsistency in the wording.

Overall, the inconsistency is regrettable, but on the whole, if the standard says that something is erroneous, or ill-formed, then it requires a diagnostic, unless the standard says that it doesn't, or that it is undefined behavior. In something like "ill-formed; no diagnostic required", the "no diagnostic required" is important, because otherwise it would require a diagnostic. As for the difference between "ill-formed; no diagnostic required" and "undefined behavior", there isn't any. The first is probably more frequent in cases where the code is incorrect, the second where it is a run-time issue, but it's not systematic. (The specification of the one definition rule, clearly a compile-time issue, ends with "then the behavior is undefined".)
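To make the categories a bit more tangible, here is a rough sketch; it is an illustration, not anything quoted from the standard. The names (`answer`, `twice_answer`, `self_check`, `missing`, `use_missing`) are hypothetical. Only the well-formed part is live code; the three problem cases are kept as comments so that the file still compiles.

```cpp
#include <cassert>

// Well-formed: the odr-used function has exactly one definition.
int answer();                                 // declaration
int twice_answer() { return 2 * answer(); }   // odr-uses answer()
int answer() { return 21; }                   // the single definition

// Small sanity check of the well-formed part.
void self_check() { assert(twice_answer() == 42); }

// Ill-formed, diagnostic required: the compiler has to complain.
//   int broken = "text";

// Ill-formed; no diagnostic required: odr-used but defined nowhere.
//   int missing();
//   int use_missing() { return missing(); }
// Toolchains usually report this at link time, but they are not obliged to.

// Undefined behavior: typically compiles cleanly; the error is a run-time one.
//   int values[2] = {1, 2};
//   int past_end = values[2];
```

In the last two cases the standard places no requirement on what the implementation does, which is the sense in which the answer treats them as equivalent.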

James Kanze answered Oct 06 '22 00:10