I'm trying to rewrite tokenize.py in Java so I can parse Python from Java, but I don't understand the difference between NL and NEWLINE in the source. They seem to mean the same thing, but if they did, why would there be two tokens?
Some googling provided this answer:
Token value used to indicate a non-terminating newline. The NEWLINE token indicates the end of a logical line of Python code; NL tokens are generated when a logical line of code is continued over multiple physical lines.
as stated in the tokenize documentation:
https://docs.python.org/2/library/tokenize.html
More in-depth information can be found in the question "Python 2 newline tokens in tokenize module".
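To see the distinction in practice, here is a minimal sketch that runs the standard tokenize module over a small snippet and keeps only the NEWLINE and NL tokens. It assumes Python 3 (the link above is to the Python 2 docs, but the two tokens behave the same way there); the helper name newline_tokens and the sample source are made up for illustration.

```python
import io
import tokenize

# Sample input: a statement continued inside parentheses, a blank line,
# and an ordinary one-line statement.
source = "x = (1 +\n     2)\n\ny = 3\n"

def newline_tokens(src):
    """Return (token name, row) for every NEWLINE and NL token in src."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(src).readline):
        if tok.type in (tokenize.NEWLINE, tokenize.NL):
            result.append((tokenize.tok_name[tok.type], tok.start[0]))
    return result
```

Calling newline_tokens(source) gives [('NL', 1), ('NEWLINE', 2), ('NL', 3), ('NEWLINE', 4)]: the line break inside the parentheses is an NL, the break that ends each statement is a NEWLINE, and the blank line also produces an NL.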
In addition to marsh's answer, if you look into the code, you can see that there is a difference in line 577 (the other NL occurrences being in (NEWLINE, NL) checks):
    yield TokenInfo(NL if parenlev > 0 else NEWLINE,
                    token, spos, epos, line)
where parenlev keeps track of the bracket nesting level:
    if initial in '([{':
        parenlev += 1
    elif initial in ')]}':
        parenlev -= 1
So NEWLINE indicates the end of the "statement", while NL indicates the end of a physical line that does not end the statement.
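For a port to another language, the core of that decision can be sketched on its own. The function below is a deliberately simplified illustration, not the logic of tokenize.py itself: the name classify_line_endings is hypothetical, and it ignores strings, comments, blank lines, and backslash continuations, all of which the real tokenizer handles.

```python
def classify_line_endings(source):
    """Label each physical line's ending as 'NL' or 'NEWLINE'.

    Simplified assumption: only bracket depth matters. Brackets inside
    strings or comments are NOT skipped, unlike in the real tokenizer.
    """
    depth = 0  # plays the role of parenlev
    labels = []
    for line in source.splitlines():
        for ch in line:
            if ch in "([{":
                depth += 1
            elif ch in ")]}":
                depth -= 1
        # Still inside an open bracket: the statement continues on the next line.
        labels.append("NL" if depth > 0 else "NEWLINE")
    return labels
```

For example, classify_line_endings("x = (1 +\n     2)\ny = 3\n") returns ['NL', 'NEWLINE', 'NEWLINE']. The same depth counter translates directly to an int field in a Java tokenizer.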