Antlr token priority

Tags: java, antlr

I have a rule definition like this:

```
reference: volume ':' first_page '-' last_page ;

volume: INTEGER;
first_page: INTEGER;
last_page: INTEGER;

INTEGER: [0-9]+;

FREE_TEXT_WORD: NON_SPACE+;

fragment NON_SPACE : ~[ \r\n\t];
```

Given the input "168:321-331", I expected it to match the reference rule. In reality, the whole string is tokenized as a single FREE_TEXT_WORD.

How can I make the INTEGER token take precedence over FREE_TEXT_WORD in this case?

Thanks.

asked Aug 21 '13 15:08

Wudong

1 Answer

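The ANTLR lexer always prefers the longest match; only when two rules match the same number of characters does the rule defined first win. Here FREE_TEXT_WORD matches all 11 characters of 168:321-331 while INTEGER matches only the first 3, so FREE_TEXT_WORD wins. To correct this, do one of the following: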

1. Make FREE_TEXT_WORD unable to match more than 3 characters at the start of the input 168:321-331, e.g. by not allowing it to contain a digit, or remove the rule altogether.
   - You could also change FREE_TEXT_WORD to FREE_TEXT_CHARACTER. Because that rule matches only a single character, it can never be longer than another token, so its priority is determined solely by its position in the grammar. You would then need to create a parser rule for words:
     ```
     freeTextWord : FREE_TEXT_CHARACTER+;
     ```
2. Move the FREE_TEXT_WORD token into a lexer mode that is not active at the point where the lexer reaches 168:321-331.
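
To make the longest-match behaviour concrete, here is a minimal Java sketch that imitates the selection rule with `java.util.regex`. It is a toy model, not ANTLR code: the class `LongestMatchSketch`, the methods `rules` and `tokenize`, and the token names `COLON` and `DASH` are all hypothetical, and the sketch assumes that the implicit literal tokens `':'` and `'-'` are ordered before the named lexer rules, as they would be in a combined grammar.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model of "longest match wins, ties go to the earlier rule".
// This only imitates the selection rule; it does not use ANTLR.
final class LongestMatchSketch {

    // Rules in (assumed) grammar order. COLON and DASH are hypothetical
    // names standing in for the implicit ':' and '-' literal tokens.
    static Map<String, Pattern> rules(String freeTextWordRegex) {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("COLON", Pattern.compile(":"));
        rules.put("DASH", Pattern.compile("-"));
        rules.put("INTEGER", Pattern.compile("[0-9]+"));
        rules.put("FREE_TEXT_WORD", Pattern.compile(freeTextWordRegex));
        return rules;
    }

    static List<String> tokenize(String input, Map<String, Pattern> rules) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestRule = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Only a strictly longer match replaces the current best,
                // so on a tie the rule listed earlier is kept.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestRule = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestRule == null) {
                throw new IllegalArgumentException("No rule matches at index " + pos);
            }
            tokens.add(bestRule + "(" + input.substring(pos, bestEnd) + ")");
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "168:321-331";
        // Original rule: any run of non-whitespace characters.
        System.out.println(tokenize(input, rules("[^ \\r\\n\\t]+")));
        // Option 1: the same rule, but digits are excluded as well.
        System.out.println(tokenize(input, rules("[^ \\r\\n\\t0-9]+")));
    }
}
```

With the original definition the sketch yields the single token `FREE_TEXT_WORD(168:321-331)`. With digits excluded it yields `INTEGER(168)`, `COLON(:)`, `INTEGER(321)`, `DASH(-)`, `INTEGER(331)`: the free-text pattern can still match `:` and `-`, but only with the same length as the literal tokens, so the earlier rules win the tie.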

answered Sep 22 '22 13:09

Sam Harwell
