
Valid programs in C89, but not in C99

Are there features or semantics introduced or removed in C99 that would make a well-defined program written in C89 either

  • invalid (i.e. no longer compiling, according to the C99 standard), or
  • compiling, but with different semantics?

My findings so far, concerning plainly invalid programs:

  • implicit int (C89 §3.5.2)
  • implicit function declaration (C89 §3.3.2.2)
  • a return statement without an expression in a function with a non-void return type (C89 §3.6.6.4)
  • using new keywords as identifiers (for example restrict or inline)
  • hacks involving //, which now starts a comment (though these are almost never encountered in production code)
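
The `//` hack from the last bullet can be sketched as follows. This is a minimal probe for illustration only, and the function name `slash_slash_probe` is made up here. Note that it compiles under both standards and only changes meaning, so it arguably fits better among the semantic changes than among the invalid programs.

```c
/* Hypothetical probe: the result depends on whether "//" starts a comment. */
int slash_slash_probe(void)
{
    /* C89: the first slash is the division operator and the second slash
       opens an empty block comment, so the initializer reads as 4 / 2.
       C99: "//" starts a comment that runs to the end of the line,
       so the initializer is just 4. */
    int r = 4 //**/ 2
        ;
    return r;
}
```

A C89 compiler should return 2 here, while a C99 (or later) compiler should return 4.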

Subtle changes that give the same code different semantics:

  • Integer division is now well defined: for example, -3 / 2 must truncate toward zero (C99 §6.5.5/6) instead of being implementation-defined (C89 §3.3.5/6)
  • strtod gained the ability to parse hexadecimal numbers in C99, by recognizing a leading 0x or 0X
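
Both changes above can be observed with a few small helpers. This is only a sketch: the names `quotient`, `remainder_of`, and `strtod_consumed` are hypothetical, and the C89 behavior described in the comments is what that standard permitted, not what any particular compiler or library actually did.

```c
#include <stdlib.h>

/* C99 requires truncation toward zero, so quotient(-3, 2) is -1.
   C89 also allowed -2 (rounding toward negative infinity). */
int quotient(int a, int b)
{
    return a / b;
}

/* C99 gives the remainder the sign of the dividend, so
   remainder_of(-3, 2) is -1. C89 also allowed 1. */
int remainder_of(int a, int b)
{
    return a % b;
}

/* Stores the parsed value in *value and returns how many characters
   of text strtod consumed. For "0x10", a C99 library consumes all
   4 characters (value 16.0); a C89 library stops after the leading
   "0" (value 0.0). */
int strtod_consumed(const char *text, double *value)
{
    char *end;
    *value = strtod(text, &end);
    return (int)(end - text);
}
```

Code that relied on strtod stopping at the `x` (for example, to parse a number followed by a unit or separator) is the kind of program whose behavior silently changes.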

What have I missed?

Leandros asked Apr 18 '16 19:04




1 Answer

There are many programs that would have been considered valid under C89 before C99 was published, but which some people insist were never valid. C89 includes a rule requiring that an object of any type be accessed only through a pointer of that type, a related type, or a character type. Before C99, this rule was generally interpreted as applying only to "named" objects (variables of static or automatic duration that are accessed directly by name), and only in situations where the object in question didn't have its address taken immediately before it was used through a different pointer type. That interpretation was motivated by a number of factors:

  1. One of the stated goals of the Standard was to fit with what existing compilers and programs were doing, and while it would have been rare for existing programs to access discrete named variables using pointers of different types other than in cases where the variable's address was taken immediately before such use, many other usages of pointer type punning were quite common.

  2. The rationale for the Standard includes as its sole example a function which receives a pointer of one primitive type to write a global variable of another primitive type in such a way that a compiler would have no particular reason to expect aliasing. Being able to keep global variables in registers is clearly a useful optimization, and the stated purpose of the rule is to allow such optimizations in cases where a compiler would have no reason to expect aliasing to occur. Outlawing constructs like *(int*)&foo=23; does nothing to aid such optimizations, since the fact that code is taking foo's address and dereferencing it should make it abundantly clear to any compiler that isn't being deliberately obtuse that the code is going to modify foo.

  3. There are many kinds of code which semantically require the ability to use memory bits as various types, and nothing in the Standard indicates that the rules were intended to make programmers jump through hoops (e.g. by using memcpy) to achieve semantics that could have been easily obtained in the absence of the rules, especially considering that using memcpy would prevent the compiler from keeping global variables in registers across the pointer accesses (thus defeating the purpose for which the rules were written in the first place).

  4. If structure types V and W have a common initial sequence, U is any union type containing both, and p is a V* which identifies the V within a U, then (W*)(U*)p may be used to access those common members, and will be equivalent to (W*)p. Unless a compiler could show that p couldn't possibly be a pointer to a member of some union containing W, it would be required to allow (W*)p to access the common members; it was more helpful to simply treat such common member access as being legitimate regardless of whether or where U might exist than to search for excuses to deny it.

  5. Nothing in the C89 rules makes clear how the "type" of a region of allocated storage is defined, or how storage which holds things of one type that are no longer needed might be re-purposed to hold things of another.

  6. Keeping track of registers allocated to named variables was easier than keeping track of registers allocated to other pointer expressions, and code which was interested in minimizing the number of loads and stores via pointers would often copy things to named variables and work on them there.
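
Points 2 through 4 can be sketched in code. This is a minimal illustration, not the Rationale's own text: every identifier (`counter`, `store_through_double`, `float_bits`, `tag_via_union`, and the `V`, `W`, `U` types) is made up here, and `float_bits` assumes that `float` and `unsigned int` have the same size. Only the forms that are defined under every reading of the rules are shown as executable code.

```c
#include <string.h>

int counter;  /* hypothetical global variable */

/* Shape of the case in point 2: nothing here suggests that dp could
   alias counter, so a compiler may keep counter in a register across
   the store through dp and return 1 without reloading it. */
int store_through_double(double *dp)
{
    counter = 1;
    *dp = 2.0;
    return counter;
}

/* Point 3: reinterpreting bits by copying bytes, which is accepted
   under both C89 and C99.
   Assumes sizeof(float) == sizeof(unsigned int). */
unsigned int float_bits(float f)
{
    unsigned int u = 0;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Point 4: two structure types with a common initial sequence (tag),
   and a union type containing both. */
struct V { int tag; int v_data; };
struct W { int tag; double w_data; };
union U { struct V v; struct W w; };

/* Inspecting the shared member through the union is guaranteed to
   work even when the V member was the one last written. */
int tag_via_union(union U *u)
{
    return u->w.tag;
}
```

The disputed forms, such as `*(int*)&foo = 23;` where foo is not an int, or `((struct W*)p)->tag` where p is a `struct V*`, are deliberately left out of the executable code, since whether they are valid is exactly what is being argued.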

C99 added "effective type" rules which are explicitly applicable to allocated storage. Some people insist those were merely "clarifications" of rules which already existed in C89, but for the above reasons I find that viewpoint untenable. It's fashionable to claim that the only reasons compilers didn't apply aliasing rules to unnamed objects are #5 and #6, but objections #1-#4 are equally significant (and continue to apply to C99 just as much as C89). Still, since C99 added the effective type rules, many constructs which would have been treated as legitimate by most common interpretations of the C89 rules are clearly forbidden.
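
As a rough sketch of how the effective type rules (C99 §6.5/6 and §6.5/7) are commonly read for allocated storage, consider the following. The function name `reuse_storage` is hypothetical, and the comments describe one reading of the standard rather than the behavior of any particular compiler.

```c
#include <stdlib.h>

/* Re-purposes one block of allocated storage from float to int.
   Returns 42, or -1 if the allocation fails. */
int reuse_storage(void)
{
    size_t size = sizeof(float) > sizeof(int) ? sizeof(float) : sizeof(int);
    void *p = malloc(size);
    int result;

    if (p == NULL)
        return -1;

    *(float *)p = 1.0f;  /* the storage now has effective type float */
    *(int *)p = 42;      /* storing an int changes the effective type to int */
    result = *(int *)p;  /* fine: the read matches the effective type */
    /* Reading the storage as a float at this point would not be
       permitted by C99 6.5/7. */

    free(p);
    return result;
}
```

Under C89 there was no corresponding text saying what "type" the block has between the two stores, which is the gap described in point 5.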

supercat answered Oct 12 '22 19:10