
Are digraphs and trigraphs in use today? [closed]

Given that there were once reasons to use digraphs and trigraphs in C and C++, does anyone put them in code being written today? Is there any substantial amount of legacy code still under maintenance that contains them?

(Note: Here, "digraph" does not mean "directed graph." Both digraph and trigraph have multiple meanings, but the intended meaning here is sequences like ??= or <: that stand in for characters like # and [.)

rwallace asked Sep 16 '11 23:09



2 Answers

I don't know for sure, but you're most likely to find digraphs and trigraphs being used in IBM mainframe environments. Some EBCDIC code pages lack characters that C requires.

The other justification for digraphs and trigraphs, 7-bit ASCII-ish character sets (the national variants of ISO 646) that replace some punctuation characters with accented letters, is probably less relevant today.

Outside such environments, I suspect that trigraphs are more commonly used by mistake than deliberately, as in:

puts("What happened??!"); 

For reference, trigraphs were introduced in the 1989 ANSI C standard (which essentially became the 1990 ISO C standard). They are:

??= #     ??) ]     ??! |
??( [     ??' ^     ??> }
??/ \     ??< {     ??- ~

The replacements occur anywhere in source code, including comments and string literals.
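To make the replacement rule concrete, here is a minimal sketch in C that imitates it on an ordinary string at run time. The names `trigraph_target` and `replace_trigraphs` are made up for this illustration and are not part of any standard library; a real compiler performs this step on the source text before tokenizing.

```c
#include <stddef.h>

/* Return the character that "??" followed by `third` stands for,
 * or 0 if that combination is not one of the nine trigraphs. */
static char trigraph_target(char third)
{
    switch (third) {
    case '=':  return '#';
    case ')':  return ']';
    case '!':  return '|';
    case '(':  return '[';
    case '\'': return '^';
    case '>':  return '}';
    case '/':  return '\\';
    case '<':  return '{';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Hypothetical helper: copy `src` into `dst`, replacing each trigraph
 * with its single-character equivalent. `dst` must have room for at
 * least strlen(src) + 1 bytes. */
void replace_trigraphs(const char *src, char *dst)
{
    while (*src != '\0') {
        char target = 0;
        if (src[0] == '?' && src[1] == '?')
            target = trigraph_target(src[2]);
        if (target != 0) {
            *dst++ = target;   /* three input characters become one */
            src += 3;
        } else {
            *dst++ = *src++;   /* everything else is copied unchanged */
        }
    }
    *dst = '\0';
}
```

Calling `replace_trigraphs("What happened?\?!", out)` leaves `What happened|` in `out`. The `?\?` spelling in that call keeps the example's own literal from being translated if the compiler has trigraphs enabled.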

Digraphs are alternate spellings of certain tokens, and do not affect comments or literals:

<: [      :>   ]
<% {      %>   }
%: #      %:%: ##

Digraphs were introduced by the 1995 amendment to the 1990 ISO C standard.
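As an illustration, here is a small sketch written with digraphs in place of `#`, `{`, `}`, `[` and `]`. The function names `sum_digraph` and `digraph_text` are invented for this example; it assumes a compiler that accepts the 1995 amendment's digraphs, which current GCC and Clang should do by default.

```c
%:include <stddef.h>

/* Sum `count` ints. Braces and brackets are spelled with digraphs:
 * <% %> for { } and <: :> for [ ]. */
int sum_digraph(const int *values, size_t count)
<%
    int total = 0;
    for (size_t i = 0; i < count; i++) <%
        total += values<:i:>;
    %>
    return total;
%>

/* Inside a string literal the same characters are left alone, so this
 * returns the two-character string "<:" rather than "[". */
const char *digraph_text(void)
<%
    return "<:";
%>
```

Because digraphs are tokens rather than a textual replacement, the literal in `digraph_text` is unaffected, which is the difference from trigraphs described above.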

Keith Thompson answered Oct 11 '22 13:10


There is a proposal pending for C++1z (the standard after C++1y, which will hopefully be published as C++14) that aims to remove trigraphs from the Standard. Its authors did a case study on one large, otherwise undisclosed codebase:

Case study

The uses of trigraph-like constructs in one large codebase were examined. We discovered:

923 instances of an escaped ? in a string literal to avoid trigraph replacement: string pattern() const { return "foo-????\?-of-?????"; }

4 instances of trigraphs being used deliberately in test code: two in the test suite for a compiler, the other two in a test suite for boost's preprocessor library.

0 instances of trigraphs being deliberately used in production code.

Trigraphs continue to pose a burden on users of C++.
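The escaping idiom counted in the first finding can be sketched in C as follows. This is a C rendering of the proposal's C++ snippet, written for illustration only; the function name `pattern` simply mirrors the one quoted above.

```c
/* Illustration only, not the proposal's own code. Without the
 * backslash, the last two question marks of the first run and the '-'
 * after them would form the trigraph for '~'. The escape sequence \?
 * still produces a plain question mark, so the string is unchanged. */
const char *pattern(void)
{
    return "foo-????\?-of-?????";
}
```

The returned string is 18 characters long whether or not the compiler translates trigraphs, which is the point of the workaround: the author pays for it in every such literal even if trigraphs are never wanted.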

The proposal notes:

If trigraphs are removed from the language entirely, an implementation that wishes to support them can continue to do so: its implementation-defined mapping from physical source file characters to the basic source character set can include trigraph translation (and can even avoid doing so within raw string literals). We do not need trigraphs in the standard for backwards compatibility.

TemplateRex answered Oct 11 '22 14:10