Define <LINE-START> and <LINE-END> in a lexer

I am trying to implement a front end which attempts to conform to a subset of this specification.

Most things seem to be clearly defined in the reference, except <LINE-START> and <LINE-END>, which are nevertheless used throughout the grammar.

Here is the relevant passage: "For ease of specification it is convenient to be able to explicitly refer to the point that immediately precedes the beginning of a logical line and the point immediately preceding the final line-terminator of a logical line. This is accomplished using <LINE-START> and <LINE-END> as terminal symbols of the VBA grammars. A <LINE-START> is defined to immediately precede each logical line and a <LINE-END> is defined as replacing the <line-terminator> at the end of each logical line:"

Here are some examples of how they are used in the grammar:

line-terminator = (%x000D %x000A) / %x000D / %x000A / %x2028 / %x2029
line-continuation = *WSC underscore *WSC line-terminator
WS = 1*(WSC / line-continuation)

EOL = [WS] LINE-END
logical-line = LINE-START *extended-line LINE-END

if-statement = LINE-START "If" boolean-expression "Then" EOL
               statement-block
               *[else-if-block]
               [else-block]
               LINE-START (("End" "If") / "EndIf")

else-if-block = LINE-START "ElseIf" boolean-expression "Then" EOL
                LINE-START statement-block

else-block = LINE-START "Else" statement-block

Does anyone know where and how to define <LINE-START> and <LINE-END>?

SoftTimur asked Nov 13 '22 01:11



1 Answer

Given the multiple languages described in the document, I would expect the definition of line start and line end to be part of table 3.2.2 Logical Line Grammar, if these are (as suggested elsewhere in the document) tokens. As you say, they are not defined very clearly.

There is another possibility. The way these terms are used suggests an analogy with BOF/EOF, which would make them states rather than tokens, independent of the token system used. If so, you will need to define them in your parser and enter them when the appropriate conditions hold. Those conditions will depend on several factors: the position in the file, and the previous, the current, and possibly the next token. (For example, Start Line is the state the parser needs to be put into after advancing from the End Line state, unless the end of the file has also been reached.) These rules will have to be derived from the (fragmentary) definition given, and from appropriate assumptions.
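As a rough illustration of that idea, here is a minimal Python sketch of a pre-lexing pass that synthesizes the two markers from positions in the input rather than reading them as characters. It is only one possible reading of the quoted rules, not something the specification prescribes. The function name `logical_lines`, the `TEXT` token kind, the approximation of WSC as space or tab, the replacement of a line continuation by a single space, and the choice to emit a LINE-END at end of file when the last line has no terminator are all assumptions made for the example.

```python
import re
from typing import Iterator, Tuple

# line-terminator = (%x000D %x000A) / %x000D / %x000A / %x2028 / %x2029
LINE_TERMINATOR = r"(?:\r\n|\r|\n|\u2028|\u2029)"
TERMINATOR_RE = re.compile(LINE_TERMINATOR)

# line-continuation = *WSC underscore *WSC line-terminator
# Assumption: WSC is approximated as space or tab.
CONTINUATION_RE = re.compile(r"[ \t]*_[ \t]*" + LINE_TERMINATOR)

Token = Tuple[str, str]  # (kind, text)


def logical_lines(source: str) -> Iterator[Token]:
    """Yield LINE-START, TEXT and LINE-END tokens, one group per logical line."""
    # Join physical lines first: a continuation counts as whitespace,
    # so it never ends a logical line. (Assumption: replace it by one space.)
    joined = CONTINUATION_RE.sub(" ", source)

    pos = 0
    while pos < len(joined):
        # LINE-START is a zero-width marker before each logical line.
        # It is not emitted once the end of the input has been reached.
        yield ("LINE-START", "")

        match = TERMINATOR_RE.search(joined, pos)
        end = match.start() if match else len(joined)
        if end > pos:
            # A real lexer would tokenize this text further.
            yield ("TEXT", joined[pos:end])

        # LINE-END replaces the line-terminator. Assumption: a final line
        # without a terminator is closed by the end of the input.
        yield ("LINE-END", "")
        pos = match.end() if match else len(joined)


if __name__ == "__main__":
    sample = "If x Then\r\ny = 1 _\n+ 2\nEnd If"
    for token in logical_lines(sample):
        print(token)
```

With a pass like this in front of the parser, LINE-START and LINE-END can be treated as ordinary terminals in rules such as `if-statement`, even though the lexer derives them from state (start of input, previous terminator, end of file) instead of matching text. A fuller implementation would also have to decide how an underscore at the end of an identifier should be treated, which this sketch ignores.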

This is not a complete answer, but given the vagueness of the specification on this point, and the multiple usage of these tokens?/states? I find it difficult to see how a more complete answer can be given. HTH.

Chris Walton answered Dec 22 '22 21:12
