
Why did Technical Corrigendum 2 to ISO/IEC 13211-1:1995 omit "bar" from the "token" rule?

Cor.2 says (only) the following about clause 6.4:

6.4 Tokens

Add as the last syntax rule:

bar (* 6.4 *)
   = [ layout text sequence (* 6.4.1 *) ] ,
     bar token (* 6.4.8 *) ;

Surely another modification to 6.4 is intended, namely to add bar (* 6.4 *) to the definition of token (* 6.4 *), which is as follows in ISO/IEC 13211-1:1995:

token (* 6.4 *)
   = name (* 6.4 *)
   | variable (* 6.4 *)
   | integer (* 6.4 *)
   | float number (* 6.4 *)
   | double quoted list (* 6.4 *)
   | open (* 6.4 *)
   | open ct (* 6.4 *)
   | close (* 6.4 *)
   | open list (* 6.4 *)
   | close list (* 6.4 *)
   | open curly (* 6.4 *)
   | close curly (* 6.4 *)
   | ht sep (* 6.4 *)
   | comma (* 6.4 *)
   ;

Is this a minor omission from Cor.2 or have I misunderstood?

Michael Ben Yosef asked May 01 '16 18:05


1 Answer

This is an excellent find to make ISO/IEC 13211-1:1995 more consistent! Yes, with Cor.2:2012 the rule token (* 6.4 *) should be extended by a further alternative

  | bar (* 6.4 *)

On the other hand, I do not see any direct consequence of either the omission or the extension. But I agree that the extension would definitely make the standard more understandable.

Here are two arguments why the current codex is fine even without this further addition:

1. Non-terminal bar is already included as a token

The only place where token (* 6.4 *) is used is the subclause for the term input built-in predicates, 8.14.1 read_term/3, read_term/2, read/1, read/2. There, 8.14.1.1 g) reads:

g) Attempts to parse C_Seq as a sequence of tokens (6.4),
...

Now "a sequence of tokens (6.4)" is not a precise reference to a specific non-terminal symbol. Under 6.4 there are various non-terminals, in particular the recently added bar (* 6.4 *), so any token or sequence of tokens defined under 6.4 could be meant. Moreover, the actual non-terminal that defines a sequence of tokens is called term (* 6.4 *), yet "a sequence of tokens (6.4)" and "a term (6.4)" look quite different. So I cannot see a good reason not to include bar among the tokens that are read in by read/1 and family.

Also the term syntax (6.3) always refers to bar (6.4) explicitly.

There is also another reference in 8.14.1.1 k which reads

k) Parses C_Seq as a read-term (6.4) T.,

This is understood to be an error. WDCor.3 changes the reference in 8.14.1.1 k from (6.4) to (6.2.2), so that it reads:

k) Parses C_Seq as a read-term (6.2.2) T.,

Roughly, read/1 reads a term in two phases. In the first phase, characters are read in and immediately parsed as a sequence of tokens until the end token (6.4.8) is encountered. Once the characters have been parsed as tokens, there is no point in parsing them as tokens again, and doing so would not produce any concrete term. Only the definition of read-term in 6.2.2 is capable of determining the term T that step k wants to obtain. 6.4 has nothing to say about terms at all; it covers tokens only.
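To make the first phase concrete, here is a minimal Python sketch of it. It is only an illustration: the token classes in TOKEN_RE are a loose approximation invented for this example (not the 6.4 grammar), and read_tokens is a hypothetical helper name. It consumes characters up to and including the end token and does no term construction at all, which is the job of the second phase.

```python
import re

# Toy token classes: a rough approximation for illustration only,
# NOT the token syntax of ISO/IEC 13211-1 clause 6.4.
TOKEN_RE = re.compile(r"""
    \s*(?:                                  # optional layout text
      (?P<end>\.(?=\s|$))                   # end token: '.' then layout or end of input
    | (?P<variable>[A-Z_][A-Za-z0-9_]*)     # variable
    | (?P<name>[a-z][A-Za-z0-9_]*|[-+*/\\^<>=~:.?@&$]+)  # letter-digit or symbolic name
    | (?P<integer>\d+)                      # integer
    | (?P<punct>[()\[\]{},|])               # punctuation, including '|'
    )""", re.VERBOSE)


def read_tokens(chars):
    """Phase 1: consume characters up to and including the end token.

    Returns the list of (kind, text) tokens and the unread remainder.
    """
    tokens, pos = [], 0
    while True:
        m = TOKEN_RE.match(chars, pos)
        if m is None:
            raise SyntaxError(f"cannot tokenize at offset {pos}")
        kind = m.lastgroup               # name of the alternative that matched
        tokens.append((kind, m.group(kind)))
        pos = m.end()
        if kind == "end":                # stop at the end token
            return tokens, chars[pos:]
```

Calling read_tokens("a --> b | c. rest") yields six tokens ending in the end token and leaves " rest" unread. Note that "|" comes out as plain punctuation here: the tokenizer alone cannot say whether it will later act as a list separator or as an infix operator.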

2. A term can be read even if bar is not part of the sequence of tokens

But even if one does not accept the first argument, the procedural description in 8.14.1.1 can still parse the sequence of characters C_Seq as a sequence of tokens in step g, since the token ht sep (* 6.4 *) describes the very same set of character sequences as bar (* 6.4 *) does. The only difference between these non-terminals is that ht sep is used exclusively as the head-tail separator in list notation (6.3.5), whereas bar is used as an infix operator. To summarize, steps g and k now parse the text a --> b | c. differently: step g parses " |" as ht sep and step k parses it as bar, provided there is a fitting operator declaration like

:- op(1105, xfy, '|').
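
The distinction between the two roles of the same character sequence can be sketched in Python. This is a deliberately simplified model, not the term syntax of 6.3: classify_bars is a hypothetical helper, the token list is assumed to be already split, and the rule "innermost open bracket is [ means head-tail separator" is an assumption made for the illustration.

```python
def classify_bars(tokens, ops):
    """Label each '|' in a flat token list by its syntactic role.

    tokens: list of token strings (assumed already tokenized).
    ops:    dict of declared operators, e.g. {"|": (1105, "xfy")}.
    """
    closers = {"(": ")", "[": "]", "{": "}"}
    stack, roles = [], []
    for tok in tokens:
        if tok in closers:
            stack.append(tok)                    # remember the open bracket
        elif tok in closers.values():
            if not stack or closers[stack.pop()] != tok:
                raise SyntaxError(f"unbalanced {tok!r}")
        elif tok == "|":
            if stack and stack[-1] == "[":
                roles.append("ht sep")           # directly inside a list
            elif "|" in ops:
                roles.append("bar")              # infix operator, needs a declaration
            else:
                raise SyntaxError("'|' used as infix without an operator declaration")
    return roles


# Mirrors the directive :- op(1105, xfy, '|').
OPS = {"|": (1105, "xfy")}
```

With OPS as above, classify_bars(["a", "-->", "b", "|", "c"], OPS) gives ["bar"], while classify_bars(["[", "H", "|", "T", "]"], OPS) gives ["ht sep"]. With an empty operator table, the first call raises a SyntaxError, which corresponds to the proviso about a fitting operator declaration.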
false answered Oct 17 '22 20:10