Cor.2 says (only) the following about clause 6.4:
6.4 Tokens
Add as the last syntax rule:
bar (* 6.4 *) = [ layout text sequence (* 6.4.1 *) ] , bar token (* 6.4.8 *) ;
Surely another modification to 6.4 is intended, namely to add bar (* 6.4 *) to the definition of token (* 6.4 *), which is as follows in ISO/IEC 13211-1:1995:
token (* 6.4 *) = name (* 6.4 *) | variable (* 6.4 *) | integer (* 6.4 *) | float number (* 6.4 *) | double quoted list (* 6.4 *) | open (* 6.4 *) | open ct (* 6.4 *) | close (* 6.4 *) | open list (* 6.4 *) | close list (* 6.4 *) | open curly (* 6.4 *) | close curly (* 6.4 *) | ht sep (* 6.4 *) | comma (* 6.4 *) ;
Is this a minor omission from Cor.2 or have I misunderstood?
This is an excellent find to make ISO/IEC 13211-1:1995 more consistent! Yes, with Cor.2:2012 the rule token (* 6.4 *) should be extended by a further alternative:
| bar (* 6.4 *)
On the other hand, I do not see any direct consequence of either the omission or the extension. But I agree that it would definitely make the standard more understandable.
Here are arguments why, even without this further addition, the current codex is fine:
The only place where (* 6.4 *) is used is the subclause for term input built-in predicates, 8.14.1 read_term/3, read_term/2, read/1, read/2. There, 8.14.1.1 g) reads:
g) Attempts to parse C_Seq as a sequence of tokens (6.4), ...
Now "a sequence of tokens (6.4)" is not a precise reference to a specific non-terminal symbol. Under 6.4 there are various non-terminals, in particular also the recently added bar (* 6.4 *). So any token or sequence of tokens defined under (6.4) could be referred to. And the actual non-terminal that defines a sequence of tokens is called term (* 6.4 *), but "a sequence of tokens (6.4)" and "a term (6.4)" look quite different. So I cannot see a good reason not to include bar in the tokens that are read in by read/1 and family.
Also the term syntax (6.3) always refers to bar (6.4) explicitly.
There is also another reference in 8.14.1.1 k, which reads:
k) Parses C_Seq as a read-term (6.4) T.
This is understood to be an error. WDCor.3 reads for 8.14.1.1 k:
k) Parses C_Seq as a read-term (6.4) (6.2.2) T.
Roughly, read/1 reads a term in two phases. In the first phase, characters are read in and immediately parsed as a sequence of tokens until the end token (6.4.8) is encountered. So once a sequence of characters has already been parsed as tokens, there is no point in doing this again. Also, this will not produce any concrete terms. Only the definition of read-term in 6.2.2 is capable of determining the term T that step k wants to obtain. 6.4 has nothing to say about terms at all; it covers tokens only.
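The point that the first phase yields tokens and no term can be illustrated with a toy model. This is only a sketch under stated assumptions: it works on tokens that are already classified rather than on characters, and the token names name/1, bar and end as well as the predicate read_term_tokens/3 are hypothetical, not taken from the standard.

```prolog
% Hypothetical token representation: name(Atom), bar, end.
% Toy version of the first phase: take all tokens up to and
% including the first end token, and return the remaining tokens.
read_term_tokens([end|Rest], [end], Rest).
read_term_tokens([T|Ts], [T|Us], Rest) :-
    T \== end,                      % not yet at the end token
    read_term_tokens(Ts, Us, Rest).
```

A query such as read_term_tokens([name(a), bar, name(b), end, name(c), end], Ts, Rest) delivers a flat list of tokens in Ts. Building a term from that list is a separate step, which is what 6.2.2 describes.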
But even if one does not accept the first argument, the procedural description in 8.14.1.1 is still able to parse, in 8.14.1.1 g, the sequence of characters C_Seq as a sequence of tokens, since the token ht sep (* 6.4 *) describes the very same set of character sequences as bar (* 6.4 *) does. The only difference between those non-terminals is that ht sep is exclusively used as the head tail separator for the list notation (6.3.5) whereas bar is used as an infix operator. To summarize, steps g and k now parse the text a --> b | c. differently: step g parses " |" as ht sep and step k parses it as bar, provided there is a fitting operator declaration like
:- op(1105, xfy, '|').
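Both observations can be sketched in Prolog. The grammar below is a toy model and not the standard's grammar: layout is reduced to space characters, and the names optional_layout//0, same_text/1 and body_functor/1 are hypothetical. The second part assumes a processor that follows Cor.2, so that an infix | is read as the atom '|' once the operator is declared.

```prolog
% The operator declaration from the text, so that | is an infix operator.
:- op(1105, xfy, '|').

% Toy model of the two token rules. Code 32 is a space, code 124 is |.
layout_text_sequence --> [32], layout_text_sequence.
layout_text_sequence --> [32].

optional_layout --> layout_text_sequence.
optional_layout --> [].

ht_sep --> optional_layout, [124].
bar    --> optional_layout, [124].

% Succeeds if the characters of Atom are accepted by both rules.
same_text(Atom) :-
    atom_codes(Atom, Codes),
    phrase(ht_sep, Codes),
    phrase(bar, Codes).

% Principal functor of the body of the rule a --> b | c.
body_functor(Name/Arity) :-
    Rule = (a --> b | c),
    Rule = (_ --> Body),
    functor(Body, Name, Arity).
```

In this model same_text(' |') and same_text('|') both succeed, since the two rules accept exactly the same character sequences. On a processor that implements Cor.2, body_functor(F) is expected to give F = ('|')/2. Systems that predate the corrigendum may behave differently, for instance by reading the infix bar as ;/2.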