According to Guido (and some other Python programmers), implicit string literal concatenation is considered harmful. Thus, I am trying to identify logical lines containing such a concatenation.
My first (and only) attempt was to use shlex: split a logical line with posix=False, so that parts encapsulated by quotes are kept as separate tokens, and if two such parts lie next to each other, treat that as a literal concatenation.
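On a simple single-line case this idea seems to work; roughly, the check I had in mind looks like this (a sketch, with a hypothetical is_quoted helper):

    import shlex

    def is_quoted(part):
        # Treat a part as a string literal if it is wrapped in matching quotes.
        return len(part) >= 2 and part[0] == part[-1] and part[0] in "'\""

    parts = shlex.split("'a' 'b'", posix=False)   # -> ["'a'", "'b'"]
    flagged = any(is_quoted(a) and is_quoted(b) for a, b in zip(parts, parts[1:]))
    # flagged is True: two quoted parts lie next to each other,
    # which I would report as an implicit concatenation.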
However, this fails on multiline strings, as the following example shows:
shlex.split('""" Some docstring """', posix=False)
# Returns ['""', '" Some docstring "', '""'], which would be flagged as a concatenation, but it isn't one.
I can tweak this in some weird ad-hoc ways, but I wondered whether you can think of a simpler solution. My intention is to add this check to my already extended pep8 verifier.
Interesting question. I just had to play with it, and since there is no answer yet, I'm posting my solution to the problem:
#!/usr/bin/python
import tokenize
import token
import sys

with open(sys.argv[1], 'rU') as f:
    toks = list(tokenize.generate_tokens(f.readline))

for i in xrange(len(toks) - 1):
    tok = toks[i]
    # print tok
    tok2 = toks[i + 1]
    if tok[0] == token.STRING and tok[0] == tok2[0]:
        print "implicit concatenation in line " \
              "{} between {} and {}".format(tok[2][0], tok[1], tok2[1])
You can feed the program itself as input, and the result should be
implicit concatenation in line 14 between "implicit concatenation in line " and "{} between {} and {}"
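If you are on Python 3, a minimal adaptation of the same idea (an untested sketch using the named fields of the TokenInfo tuples and the print function) would be:

    #!/usr/bin/python3
    import sys
    import token
    import tokenize

    with open(sys.argv[1]) as f:
        toks = list(tokenize.generate_tokens(f.readline))

    # Walk over adjacent token pairs and flag two STRING tokens in a row.
    for tok, tok2 in zip(toks, toks[1:]):
        if tok.type == token.STRING and tok2.type == token.STRING:
            print("implicit concatenation in line "
                  "{} between {} and {}".format(tok.start[0], tok.string, tok2.string))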