 

Antlr tokens from file

Tags:

antlr

What is the best way to feed ANTLR with huge numbers of tokens? Say we have a list of 100,000 English verbs; how could we add them to our grammar? We could of course include a huge grammar file like verbs.g, but maybe there is a more elegant way, e.g. by modifying a .tokens file?

grammar verbs;

VERBS:
'eat' |
'drink' |
'sit' |
...
...
| 'sleep'
;

Also, should these rather be lexer or parser rules, i.e. VERBS: or verbs:? Probably VERBS:.

Team Pannous asked Nov 05 '22


1 Answer

I would rather use semantic predicates.

For this you have to define a generic word token (as a lexer rule, since character-set notation like [a-z] is only valid in the lexer):

WORD : [a-z]+ ;

and at every place where you want a verb (instead of just any word), put a semantic predicate that checks whether the parsed word is in the list of verbs.
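
For illustration, here is a minimal sketch of that idea in ANTLR 4 with a Java target; the grammar name, the verbs field and its example contents are assumptions, and with 100,000 entries you would fill the set from your verb file rather than hard-code it:

grammar Verbs;

@parser::members {
    // Assumed verb set; in practice load it from a file (see the sketch
    // further below) instead of listing the verbs here.
    public java.util.Set<String> verbs = new java.util.HashSet<>(
        java.util.Arrays.asList("eat", "drink", "sit", "sleep"));
}

// Match any lowercase word, but accept it as a verb only if the
// lookahead token's text is in the set.
verb : { verbs.contains(_input.LT(1).getText()) }? WORD ;

WORD : [a-z]+ ;
WS   : [ \t\r\n]+ -> skip ;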

I would recommend not encoding the verbs directly in the parser/lexer for such a task, because:

  • each additional verb would change the grammar
  • each additional verb enlarges the generated code
  • conjugation is easier to handle outside the grammar
  • upper/lower case can be handled more easily
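
If the verb list lives in a plain text file, one verb per line, it can be read once and handed to the parser at run time instead of being baked into the grammar. A rough Java sketch, assuming the Verbs grammar and the public verbs field from the sketch above, plus a hypothetical verbs.txt:

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashSet;
import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CommonTokenStream;

public class VerbDemo {
    public static void main(String[] args) throws Exception {
        VerbsLexer lexer = new VerbsLexer(CharStreams.fromString("drink"));
        VerbsParser parser = new VerbsParser(new CommonTokenStream(lexer));
        // Replace the hard-coded set with the file contents; adding a verb
        // now means editing verbs.txt, not regenerating the parser.
        parser.verbs = new HashSet<>(Files.readAllLines(Paths.get("verbs.txt")));
        parser.verb();   // reports a syntax error if the word is not a known verb
    }
}

This keeps the generated lexer and parser small and stable while the vocabulary can grow freely.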

CoronA answered Nov 11 '22