In SLY there is an example for writing a calculator (reproduced from calc.py
here):
from sly import Lexer

class CalcLexer(Lexer):
    tokens = { NAME, NUMBER }
    ignore = ' \t'
    literals = { '=', '+', '-', '*', '/', '(', ')' }

    # Tokens
    NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'

    @_(r'\d+')
    def NUMBER(self, t):
        t.value = int(t.value)
        return t

    @_(r'\n+')
    def newline(self, t):
        self.lineno += t.value.count('\n')

    def error(self, t):
        print("Illegal character '%s'" % t.value[0])
        self.index += 1
It looks like it's bugged because NAME and NUMBER are used before they've been defined. But actually, there is no NameError, and this code executes fine. How does that work? When can you reference a name before it's been defined?
Python knows four kinds of direct name lookup: builtins / program global, module global, function/closure body, and class body. NAME and NUMBER are resolved in a class body and are therefore subject to the rules of that kind of scope.
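For comparison, here is a small sketch of what an ordinary class body does, with no metaclass involved. The names GLOBAL_VALUE, Plain, Broken and UNDEFINED_TOKEN are made up for illustration. A name is looked up in the class namespace first, then in the module globals and builtins, and a name found nowhere raises NameError:

```python
GLOBAL_VALUE = 1

class Plain:
    LOCAL_VALUE = 2
    # LOCAL_VALUE comes from the class namespace, GLOBAL_VALUE from the module globals.
    total = LOCAL_VALUE + GLOBAL_VALUE

error = None
try:
    class Broken:
        # With the default namespace (a plain dict), an unknown name is an error.
        tokens = {UNDEFINED_TOKEN}
except NameError as exc:
    error = exc
```

This is the behaviour the question expects from CalcLexer; the difference there comes from the namespace the class body runs in.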
The class body is evaluated in a namespace provided by the metaclass (through its __prepare__ method), which can implement arbitrary semantics for name lookups. Specifically, the sly Lexer uses LexerMeta as its metaclass, and LexerMeta uses a LexerMetaDict as the namespace; this namespace creates new tokens for undefined names.
class LexerMetaDict(dict):
    ...
    def __getitem__(self, key):
        if key not in self and key.split('ignore_')[-1].isupper() and key[:1] != '_':
            return TokenStr(key, key, self.remap)
        else:
            return super().__getitem__(key)
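The same mechanism can be reproduced in a few lines. The following is a minimal sketch, not SLY's implementation: AutoNameDict, AutoNameMeta and Demo are hypothetical names, and a plain string stands in for SLY's TokenStr. It assumes only standard Python behaviour: the class body looks names up in the mapping returned by __prepare__, and falls back to globals and builtins when that mapping raises KeyError.

```python
class AutoNameDict(dict):
    """Class-body namespace that invents a value for unknown upper-case names."""

    def __getitem__(self, key):
        # Hypothetical rule: any undefined, upper-case, non-underscore name
        # resolves to its own spelling (a stand-in for SLY's TokenStr).
        if key not in self and key.isupper() and not key.startswith('_'):
            return key
        # Everything else behaves like a normal dict. A KeyError raised here
        # makes Python fall back to module globals and builtins.
        return super().__getitem__(key)


class AutoNameMeta(type):
    @classmethod
    def __prepare__(meta, name, bases):
        # The returned mapping is used as the namespace of the class body.
        return AutoNameDict()


class Demo(metaclass=AutoNameMeta):
    tokens = {NAME, NUMBER}   # no NameError: the namespace supplies both names
    size = len(tokens)        # 'len' is not upper-case, so it falls back to builtins
```

After the class statement, Demo.tokens is the set {'NAME', 'NUMBER'}. Outside the class body the names NAME and NUMBER are still undefined, because the special lookup only applies while the body is being executed.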
The LexerMeta is also responsible for adding the _ function to the namespace so that it can be used without imports.
class LexerMeta(type):
    '''
    Metaclass for collecting lexing rules
    '''
    @classmethod
    def __prepare__(meta, name, bases):
        d = LexerMetaDict()

        def _(pattern, *extra):
            ...

        d['_'] = _
        d['before'] = _Before
        return d
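To see why no import is needed, here is a hedged sketch of the same trick in isolation. RuleMeta, Rules and the pattern attribute are hypothetical names chosen for this example, and the body of _ is a guess at a reasonable decorator, since the original elides it. The point is only that anything placed in the namespace by __prepare__ is visible as a bare name inside the class body.

```python
class RuleMeta(type):
    @classmethod
    def __prepare__(meta, name, bases):
        namespace = {}

        def _(pattern):
            # Decorator factory: attach the pattern to the decorated function.
            def decorate(func):
                func.pattern = pattern
                return func
            return decorate

        # Pre-populate the class-body namespace with the helper.
        namespace['_'] = _
        return namespace

    def __new__(meta, name, bases, namespace):
        # This sketch's own choice: drop the helper so it does not end up
        # as an attribute of the finished class.
        namespace = dict(namespace)
        namespace.pop('_', None)
        return super().__new__(meta, name, bases, namespace)


class Rules(metaclass=RuleMeta):
    @_(r'\d+')          # '_' comes from __prepare__, not from an import
    def number(self, text):
        return int(text)
```

Here Rules.number.pattern holds the regular expression given to the decorator, and Rules itself has no attribute named _.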