In Python, I really enjoy how concise an implementation can be when using list comprehensions. I love to write concise list comprehensions like this:
myList = [1, 5, 11, 20, 30, 35] #input data
bigNumbers = [x for x in myList if x > 10]
However, I often encounter more verbose implementations like this:
myList = [1, 5, 11, 20, 30, 35] #input data
bigNumbers = []
for i in xrange(0, len(myList)):
    if myList[i] > 10:
        bigNumbers.append(myList[i])
When a for loop only looks through one data structure (e.g. myList[]), there is usually a straightforward list comprehension statement that is equivalent to the loop. 
With this in mind, is there a refactoring tool that converts verbose Python loops into concise list comprehension statements?
Previous StackOverflow questions have asked for advice on transforming loops into list comprehensions. However, I have yet to find a question about automatically converting loops into list comprehension expressions.
Motivation: There are numerous ways to answer the question "what does it mean for code to be clean?" Personally, I find that making code concise and getting rid of some of the fluff tends to make code cleaner and more readable. Naturally there's a line in the sand between "concise code" and "incomprehensible one-liners." Still, I often find it satisfying to write and work with concise code.
2to3 is a refactoring tool that can perform arbitrary refactorings, as long as you can specify them with a syntactical pattern. The pattern you might want to look for is this:
VARIABLE1 = []
for VARIABLE2 in EXPRESSION1:
    if EXPRESSION2:
        VARIABLE1.append(EXPRESSION3)
This can be refactored safely to
VARIABLE1 = [EXPRESSION3 for VARIABLE2 in EXPRESSION1 if EXPRESSION2]
In your specific example, this would give
bigNumbers = [myList[i] for i in xrange(0, len(myList)) if myList[i] > 10]
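To make the pattern match concrete, here is a minimal sketch of the same rewrite. It uses the standard library ast module rather than a 2to3 fixer, so it is a swapped-in technique and not the approach described above. The function names loop_to_comprehension and _as_comprehension are made up for illustration. It assumes Python 3.9 or later (for ast.unparse), only looks at top-level statements, does not check that the expressions avoid referring to VARIABLE1, and loses comments and formatting because ast discards them.

```python
import ast


def _as_comprehension(init, loop):
    """Return `init` rewritten as a list comprehension, or None if no match."""
    # VARIABLE1 = []
    if not (isinstance(init, ast.Assign) and len(init.targets) == 1
            and isinstance(init.targets[0], ast.Name)
            and isinstance(init.value, ast.List) and not init.value.elts):
        return None
    # for VARIABLE2 in EXPRESSION1:  (one statement in the body, no else clause)
    if not (isinstance(loop, ast.For) and len(loop.body) == 1 and not loop.orelse):
        return None
    test = loop.body[0]
    # if EXPRESSION2:  (one statement in the body, no else clause)
    if not (isinstance(test, ast.If) and len(test.body) == 1 and not test.orelse):
        return None
    stmt = test.body[0]
    # VARIABLE1.append(EXPRESSION3)
    if not (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)):
        return None
    call = stmt.value
    func = call.func
    if not (isinstance(func, ast.Attribute) and func.attr == "append"
            and isinstance(func.value, ast.Name)
            and func.value.id == init.targets[0].id
            and len(call.args) == 1 and not call.keywords):
        return None
    # VARIABLE1 = [EXPRESSION3 for VARIABLE2 in EXPRESSION1 if EXPRESSION2]
    init.value = ast.ListComp(
        elt=call.args[0],
        generators=[ast.comprehension(target=loop.target, iter=loop.iter,
                                      ifs=[test.test], is_async=0)],
    )
    return init


def loop_to_comprehension(source):
    """Rewrite matching top-level `X = []` + `for` pairs in `source`."""
    tree = ast.parse(source)
    body, rewritten, i = tree.body, [], 0
    while i < len(body):
        following = body[i + 1] if i + 1 < len(body) else None
        merged = _as_comprehension(body[i], following)
        if merged is not None:
            rewritten.append(merged)  # the pair collapses into one statement
            i += 2
        else:
            rewritten.append(body[i])
            i += 1
    tree.body = rewritten
    return ast.unparse(ast.fix_missing_locations(tree))


if __name__ == "__main__":
    example = (
        "myList = [1, 5, 11, 20, 30, 35]\n"
        "bigNumbers = []\n"
        "for i in xrange(0, len(myList)):\n"
        "    if myList[i] > 10:\n"
        "        bigNumbers.append(myList[i])\n"
    )
    print(loop_to_comprehension(example))
```

The source is only parsed, never executed, so the xrange call from the question is fine as input even under Python 3. A real tool would also want to preserve comments and layout, which is one reason to build on a concrete-syntax-tree framework such as 2to3's instead.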
Then, you can have another refactoring that replaces xrange(0, N) with xrange(N), and another one that replaces
[VARIABLE1[VARIABLE2] for VARIABLE2 in xrange(len(VARIABLE1)) if EXPRESSION1]
with
[VARIABLE3 for VARIABLE3 in VARIABLE1 if EXPRESSION1PRIME]
There are several problems with this refactoring:
EXPRESSION1PRIME must be EXPRESSION1 with all occurrences of VARIABLE1[VARIABLE2] replaced by VARIABLE3. This is possible with 2to3, but requires explicit code to do the traversal and replacement.
EXPRESSION1PRIME must then contain no further occurrences of VARIABLE1. This can also be checked with explicit code.
A name has to be chosen for VARIABLE3. You picked x; there is no reasonable way to have this done automatically. You could choose to recycle VARIABLE2 (i.e. i) for that, but that may be confusing as it suggests that i is still an index. It might work to pick a synthetic name, such as VARIABLE1_VARIABLE2 (i.e. myList_i), and check that it's not used otherwise.
VARIABLE1[VARIABLE2] has to yield the same elements as iterating over iter(VARIABLE1). It's not possible to check this automatically.
If you want to learn how to write 2to3 fixers, take a look at Lennart Regebro's book.
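The first three problems can be sketched in code. The following is an illustrative attempt using the standard library ast module (again not a 2to3 fixer), assuming Python 3.9 or later. The names index_to_element, _ReplaceIndexing and _names are hypothetical. It applies the substitution to the element expression as well as to the condition, builds the synthetic name VARIABLE1_VARIABLE2, and gives up (returns None) whenever one of the checks fails. It only checks the synthetic name against the comprehension itself, not the surrounding code, and it makes no attempt at the fourth problem.

```python
import ast


class _ReplaceIndexing(ast.NodeTransformer):
    """Replace every `seq[idx]` read with the plain name `elem`."""

    def __init__(self, seq, idx, elem):
        self.seq, self.idx, self.elem = seq, idx, elem

    def visit_Subscript(self, node):
        self.generic_visit(node)
        if (isinstance(node.value, ast.Name) and node.value.id == self.seq
                and isinstance(node.slice, ast.Name) and node.slice.id == self.idx
                and isinstance(node.ctx, ast.Load)):
            return ast.copy_location(ast.Name(id=self.elem, ctx=ast.Load()), node)
        return node


def _names(node):
    """Collect all identifiers used anywhere inside `node`."""
    return {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}


def index_to_element(expr_source):
    """Rewrite `[... seq[i] ... for i in xrange(len(seq)) if ...]`, or return None."""
    tree = ast.parse(expr_source, mode="eval")
    comp = tree.body
    if not (isinstance(comp, ast.ListComp) and len(comp.generators) == 1):
        return None
    gen = comp.generators[0]
    call = gen.iter
    # for VARIABLE2 in xrange(len(VARIABLE1))  (range is accepted as well)
    if not (isinstance(gen.target, ast.Name) and isinstance(call, ast.Call)
            and isinstance(call.func, ast.Name)
            and call.func.id in ("range", "xrange")
            and len(call.args) == 1 and not call.keywords):
        return None
    inner = call.args[0]
    if not (isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name)
            and inner.func.id == "len" and len(inner.args) == 1
            and not inner.keywords and isinstance(inner.args[0], ast.Name)):
        return None
    idx, seq = gen.target.id, inner.args[0].id
    elem = f"{seq}_{idx}"  # synthetic name, e.g. myList_i
    if elem in _names(comp):
        return None  # the synthetic name is already in use here
    swap = _ReplaceIndexing(seq, idx, elem)
    comp.elt = swap.visit(comp.elt)
    gen.ifs = [swap.visit(cond) for cond in gen.ifs]
    # Neither the sequence nor the old index may survive the replacement.
    leftovers = _names(comp.elt).union(*(_names(cond) for cond in gen.ifs))
    if seq in leftovers or idx in leftovers:
        return None
    gen.target = ast.Name(id=elem, ctx=ast.Store())
    gen.iter = ast.Name(id=seq, ctx=ast.Load())
    ast.fix_missing_locations(tree)
    return ast.unparse(comp)


if __name__ == "__main__":
    print(index_to_element(
        "[myList[i] for i in xrange(len(myList)) if myList[i] > 10]"))
```

The fourth problem is why the result still needs a human eye: if myList were a dict with keys 0 through len(myList) - 1, then myList[i] would yield the values while iterating over myList would yield the keys, and nothing in the syntax tree reveals that.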