The Xtext documentation, such as here: http://www.eclipse.org/Xtext/documentation.html#syntax, seems to explain syntactic predicates only by giving one example, the 'dangling else problem'. My naive interpretation of this would be: if you have an ambiguous grammar, then use => to select the option you want. However, the results I am getting suggest that it's more complicated than that. Is there a better explanation somewhere? To try to understand what is going on, I have contrived this simple but ambiguous grammar to experiment with (obviously I would not do it like this in the real world):
grammar com.euclideanspace.experiment.Mydsl with org.eclipse.xtext.common.Terminals
generate mydsl "http://www.euclideanspace.com/experiment/Mydsl"
Model:
opt=Option;
Option:
(ID Option1 ID)
|
(ID Option2 ID)
;
Option1:
'=='|'+=';
Option2:
'=='|'-=';
This gives the following warnings:
warning(200): ../com.euclideanspace.experiment/src-gen/com/euclideanspace/experiment/parser/antlr/internal/InternalMydsl.g:119:1: Decision can match input such as "RULE_ID '==' RULE_ID" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(200): ../com.euclideanspace.experiment.ui/src-gen/com/euclideanspace/experiment/ui/contentassist/antlr/internal/InternalMydsl.g:176:1: Decision can match input such as "RULE_ID '==' RULE_ID" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
The grammar is ambiguous because input like 'a == b' can match using either Option1 or Option2. We can remove this warning by adding a syntactic predicate indicated by '=>' before the option that we want to choose for the potentially ambiguous input.
Option:
(ID Option1 ID)
|
=>(ID Option2 ID)
;
We could also put the syntactic predicate inside the bracket like this:
Option:
(ID Option1 ID)
|
(=>ID Option2 ID)
;
Both of these positions work, so which is best? It is not clear to me how the second case works: how does choosing one ID in preference to another also imply Option2 over Option1? However, if we put the syntactic predicate directly before Option2 (which would appear to make sense, as this is the option we want to choose), then we get the warning below:
Option:
(ID Option1 ID)
|
(ID =>Option2 ID)
;
warning(200): ../com.euclideanspace.experiment/src-gen/com/euclideanspace/experiment/parser/antlr/internal/InternalMydsl.g:119:1: Decision can match input such as "RULE_ID '==' RULE_ID" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
So it is not just a case of putting the syntactic predicate before the option we want to choose. I think I need to understand how the parser scans the grammar so we know where to cut off unwanted options.
Is there an explanation of syntactic predicates which explains the above issues? How can syntactic predicates be hidden by actions?
Martin
Formally, a syntactic predicate is a form of production intersection used in parser specifications and formal grammars. In this sense, the term predicate has the meaning of a mathematical indicator function: if p1 and p2 are production rules, the language generated by both p1 and p2 is their set intersection.
The alternatives with the predicate have to be listed before the alternative without the predicate. E.g. your rule Option should rather look like this:
Option:
(ID =>Option2 ID)
| ID Option1 ID;
Please note that the '==' token will never be consumed as part of Option1 in that case, since it'll always be Option2. You may want to refactor your grammar to remove the duplicate branch there which would save you from using predicates in the first place.
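One possible reading of that refactoring, as a minimal sketch: since both branches of Option have the same shape (ID, operator, ID) and differ only in the operator, the two operator rules can be merged into one. The rule name Operator is made up for this illustration and is not part of the original grammar, and I have not run this through the generator.

```
Option:
    ID Operator ID
;

// Hypothetical merged rule: the union of Option1 and Option2,
// with the shared '==' listed only once.
Operator:
    '==' | '+=' | '-='
;
```

This accepts the same inputs as the original Option rule, but there is now only one alternative that can match "RULE_ID '==' RULE_ID", so there is no decision left for a predicate to resolve. The trade-off is that the grammar no longer distinguishes between Option1 and Option2; for '==' that distinction was never observable anyway, since one branch always wins.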