
C++17: why not remove digraphs along with trigraphs?

C++17 removed trigraphs. IBM heavily opposed this (here and here), so there seem to be arguments both for and against removal.

But since the decision was made to remove trigraphs, why leave digraphs? I don't see any reasons for keeping digraphs beyond the reasons to keep trigraphs (which apparently didn't weigh heavily enough to keep them).

bolov asked Dec 22 '14 11:12




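1 Answer

Trigraphs are more problematic to the unaware user than digraphs. This is because they are replaced in the very first translation phase, before the source is split into tokens, so the replacement also happens inside string literals and comments. Here are some examples…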

Example A:

    std::string example = "What??!??!";
    std::cout << example << std::endl;

With trigraphs enabled, What|| is printed to the console, because each ??! trigraph is translated to |.

Example B:

    // Error ?!?!?!??!??/
    std::cout << "There was an error!" << std::endl;

Nothing is printed at all. This is because ??/ translates to \, which splices the comment line with the line that follows it, so the std::cout statement becomes part of the comment.

Example C:

    // This makes no sense ?!?!!?!??!??/
    std::string example = "Hello World";
    std::cout << example << std::endl;

This gives an error along the lines of use of undeclared identifier "example", for the same reason as Example B: the line that declares example is swallowed by the comment.

Trigraphs can cause far more elaborate problems too, but you get the idea. It's worth noting that many compilers emit a warning when such translations are made, which is yet another reason to always treat warnings as errors. However, the standard does not require such a warning, so it cannot be relied upon.

Digraphs are much less problematic than trigraphs: they are tokens in their own right, so they are not replaced inside another token (e.g. a string or character literal), and no digraph translates to \, so a comment cannot accidentally splice in the following line.

Conclusion

Other than harder-to-read code, digraphs cause fewer problems than trigraphs, so the need to remove them is greatly reduced.

OMGtechy answered Sep 22 '22 07:09