I'm curious about every context in which a colon (the ":" character) is a valid syntactic element (outside of a string/character literal, comment, etc) in a C program.
I tried searching the C99 spec, but ":" matches every single page and "colon" doesn't find every usage. Similarly, looking through toy C parsers (and I understand that lex/yacc alone aren't capable of parsing C), I only seem to find partial results.
These are the scenarios that I know use a colon:
Are there any other language features in C that use a colon?
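For reference, here is a minimal sketch of the colon uses I am certain the C grammar defines: bit-field widths, `case`/`default` labels, named labels, the conditional operator, and (since C11) `_Generic` associations. It is not necessarily the list the asker had in mind, and every name in it (`flags`, `TYPE_NAME`, `classify`, `check_colon_uses`) is made up for illustration.

```c
#include <assert.h>
#include <string.h>

struct flags {
    unsigned ready : 1;   /* bit-field: the width follows the colon */
    unsigned       : 0;   /* unnamed zero-width bit-field */
    unsigned mode  : 3;
};

/* C11 generic selection: each association is written "type: expression". */
#define TYPE_NAME(x) _Generic((x), int: "int", double: "double", default: "other")

int classify(int n)
{
    int result;
    switch (n) {
    case 0:                          /* case label */
        result = 0;
        break;
    default:                         /* default label */
        result = (n > 0) ? 1 : -1;   /* conditional operator */
        break;
    }
    if (result < 0)
        goto done;
    result *= 10;
done:                                /* named label, target of the goto */
    return result;
}

/* Small self-check that exercises each construct above. */
void check_colon_uses(void)
{
    struct flags f = { .ready = 1, .mode = 5 };
    assert(f.ready == 1 && f.mode == 5);
    assert(classify(0) == 0);
    assert(classify(5) == 10);
    assert(classify(-3) == -1);
    assert(strcmp(TYPE_NAME(1.5), "double") == 0);
}
```

The colon also appears inside some digraph spellings (`<:`, `:>`, `%:`, `%:%:`), but there it is part of a larger token rather than a punctuator of its own.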
It's commonly used to pack lots of values into an integral type. In your particular case, it is defining the structure of a 32-bit microcode instruction for a (possibly) hypothetical CPU (if you add up all the bit-field lengths, they sum to 32).
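As a rough illustration of the idea, here is a sketch of a bit-field struct whose widths total 32. The field names and widths (`opcode`, `dest`, `src`, `imm`) are assumptions made up for this example, not the layout from the original question, and the in-memory layout of bit-fields is implementation-defined.

```c
#include <assert.h>

/* Hypothetical field widths; the only requirement here is that they sum to 32. */
enum { OPCODE_BITS = 6, DEST_BITS = 5, SRC_BITS = 5, IMM_BITS = 16 };

static_assert(OPCODE_BITS + DEST_BITS + SRC_BITS + IMM_BITS == 32,
              "field widths must total 32 bits");

struct micro_op {
    unsigned int opcode : OPCODE_BITS;   /* the number after ':' is the width in bits */
    unsigned int dest   : DEST_BITS;
    unsigned int src    : SRC_BITS;
    unsigned int imm    : IMM_BITS;
};

/* Build an instruction; a value too wide for an unsigned bit-field
   is reduced modulo 2^width when it is stored. */
struct micro_op make_op(unsigned opcode, unsigned dest, unsigned src, unsigned imm)
{
    struct micro_op op;
    op.opcode = opcode;
    op.dest   = dest;
    op.src    = src;
    op.imm    = imm;
    return op;
}
```

For example, `make_op(63, 31, 7, 65535)` stores every value unchanged, while `make_op(64, 0, 0, 0).opcode` reads back as 0 because 64 does not fit in 6 bits.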
In R (not C), the colon operator, :, makes sequences of integers. For example, 4:7 creates the vector 〈4, 5, 6, 7〉. The combine function and the colon operator are used very often in R programming. The colon operator has precedence over the basic arithmetic operators, but not over the power operator.
The C standard (N1570) defines digraphs:
6.4.6 Punctuators
.... 3 In all aspects of the language, the six tokens
<: :> <% %> %: %:%:
behave, respectively, the same as the six tokens 79)
[ ] { } # ##
except for their spelling. 80)
79) These tokens are sometimes called "digraphs".
80) Thus [ and <: behave differently when "stringized" (see 6.10.3.2), but can otherwise be freely interchanged.
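To make the quoted paragraph concrete, here is a small sketch that spells brackets, braces, and one `#` as digraphs, and then checks the stringizing difference mentioned in footnote 80. The names `STRINGIZE`, `sum_digraph`, and `spellings_preserved` are made up for this example, and it assumes a compiler mode with digraphs enabled (C95 or later).

```c
%:include <assert.h>   /* %: is the digraph spelling of # */
#include <stddef.h>
#include <string.h>

#define STRINGIZE(x) #x

/* Sum an array; the braces and the subscript brackets are written as digraphs. */
int sum_digraph(const int *values, size_t count)
<%
    assert(values != NULL || count == 0);
    int total = 0;
    for (size_t i = 0; i < count; i++) <%
        total += values<:i:>;   /* same as values[i] */
    %>
    return total;
%>

/* Footnote 80: stringizing keeps the spelling, so "<:" and "[" stay distinct. */
int spellings_preserved(void)
{
    return strcmp(STRINGIZE(<:), "<:") == 0
        && strcmp(STRINGIZE([), "[") == 0;
}
```

Everywhere except inside `STRINGIZE`, swapping a digraph for its ordinary token (or the reverse) leaves the program's meaning unchanged.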
As a side note, the C++ standard elaborates on the term:
The term “digraph” (token consisting of two characters) is not perfectly descriptive, since one of the alternative preprocessing-tokens is %:%: and of course several primary tokens contain two characters. Nonetheless, those alternative tokens that aren’t lexical keywords are colloquially known as “digraphs”.
According to Digraphs and trigraphs:
In 1994 a normative amendment to the C standard, included in C99, supplied digraphs as more readable alternatives to five of the trigraphs. ....
Unlike trigraphs, digraphs are handled during tokenization, and any digraph must always represent a full token by itself, or compose the token %:%: replacing the preprocessor concatenation token ##. If a digraph sequence occurs inside another token, for example a quoted string, or a character constant, it will not be replaced.
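The two behaviours described above can be sketched as follows: `%:%:` acts as the token-pasting operator inside a macro, while digraph character sequences inside a string literal are left untouched. The identifiers `PASTE`, `counter`, `literal_text`, and `check_pasting` are illustrative only, and the sketch assumes a compiler mode with digraphs enabled.

```c
#include <assert.h>
#include <string.h>

/* %:define is #define, and %:%: is the ## concatenation operator. */
%:define PASTE(a, b) a %:%: b

int PASTE(coun, ter) = 7;   /* pastes the two pieces into the identifier `counter` */

/* Inside a string literal the sequences <: %> %: are ordinary characters. */
const char *literal_text(void)
{
    return "<: %> %:";
}

/* Small self-check for both behaviours. */
void check_pasting(void)
{
    assert(counter == 7);
    assert(strlen(literal_text()) == 8);   /* nothing was replaced or shortened */
    assert(literal_text()[0] == '<');
}
```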