
RegEx with variable data in it - ply.lex

Tags: python, lexer, ply

I'm using the Python module ply.lex to write a lexer. Some of my tokens are already specified with regular expressions, but now I'm stuck. I have a set of keywords that should each be recognised as a token. data holds about 1000 keywords, separated only by whitespace, which should all be recognised as one kind of keyword, for example _Function1, _UDFType2, and so on. I just want the lexer to recognise the words in data and return a token of type `KEYWORD` for each of them.

data = 'Keyword1 Keyword2 Keyword3 Keyword4'
def t_KEYWORD(t):
    # ... r'\$' + data ??
    return t

text = '''
Some test data


even more

$var = 2231




$[]Test this 2.31 + / &
'''

autoit = lex.lex()
autoit.input(text)
while True:
    tok = autoit.token()
    if not tok: break
    print(tok)

So I tried to add the variable to the regex, but it didn't work. I always get: No regular expression defined for rule 't_KEYWORD'.

Thank you in advance! John

Sean M. asked Mar 04 '26 20:03


1 Answer

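As @DSM suggests, you can use the TOKEN decorator. PLY normally takes a rule's regular expression from the function's docstring, and an expression such as r'\$' + data is not a docstring, which is why you get that error; TOKEN lets you attach a computed pattern instead. The regular expression to match words like cat or dog is 'cat|dog' (that is, words separated by '|' rather than a space). So try: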

from ply.lex import TOKEN
data = data.split()  # make data a list of keywords

@TOKEN('|'.join(data))
def t_KEYWORD(t):
    return t
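
With around 1000 keywords, two details may be worth handling when building the pattern: keywords that contain regex metacharacters, and keywords that are prefixes of other keywords (an alternation tries its branches left to right, so 'Keyword1' listed first would win over 'Keyword10'). The following is only a sketch using the standard re module; the helper name build_keyword_pattern and the sample keywords are made up for illustration and are not part of PLY.

```python
import re

# Hypothetical sample, in the same whitespace-separated form as `data` in the question.
data = 'Keyword1 Keyword10 _Function1 _UDFType2'

def build_keyword_pattern(words):
    """Return an alternation regex that matches any one of the given words."""
    # Longest words first (ties broken alphabetically), so that 'Keyword10'
    # is tried before its prefix 'Keyword1'.
    ordered = sorted(set(words), key=lambda w: (-len(w), w))
    # re.escape guards against keywords that contain regex metacharacters.
    return '|'.join(re.escape(w) for w in ordered)

keyword_pattern = build_keyword_pattern(data.split())
keyword_re = re.compile(keyword_pattern)

# The longer keyword is matched in full rather than being cut off at its prefix.
print(keyword_re.match('Keyword10 = 1').group())
```

The resulting keyword_pattern string could then be passed to the decorator as @TOKEN(keyword_pattern) in place of '|'.join(data).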
Andy Hayden answered Mar 07 '26 09:03


