Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python: Iterate through object executing code both at certain places and also at end

Here is some samplecode to explain:

outputText=""
counter=0
for obj in specialObjects:
    if (obj.id < 400) or (obj.name.startswith("he")) or (obj.deliberateBreak==True):
        print "The object %s is causing a section break."%obj.details
        outputText = outputText.rjust(80)
        open("file%d.txt"%counter,"w").write(outputText)
        outputText=""
    outputText+=obj.shortValue()
# THIS CODE IS DUPLICATED
outputText = outputText.rjust(80)
open("file%d.txt"%counter,"w").write(outputText)

What I need to do is iterate over a list of these special objects and check a few different conditions each time. If any of the conditions are met (as seen here) then I need to take the current output buffer, write it to a file, then start a new output buffer and keep processing.

The problem here is the code duplication. Notice how the two lines (outputText= and open) are duplicated. If I fail to put in the second set of lines, the last set of objects will be processed but their output will never be written.

I can think of two possible solutions to prevent the code duplication. Both of them seem slightly inelegant, so I was wondering if there was an even better way.

1) Wrap the code that would be repeated in a function.

outputText=""
counter=0
for obj in specialObjects:
    if (obj.id < 400) or (obj.name.startswith("he")) or (obj.deliberateBreak==True):
        print "The object %s is causing a section break."%obj.details
        counter = writeData(outputText)
        outputText=""
    outputText+=obj.shortValue()
writeData(outputText,counter)

def writeData(outputText,counter):
    outputText = outputText.rjust(80)
    open("file%d.txt"%counter,"w").write(outputText)
    return counter+1

2) Use a numeric for loop instead, and count to one higher than the length of the object list; use that value as a flag to mean "write, but now exit":

outputText=""
counter=0
for obj in range(len(specialObjects))+1:
    if (obj = len(specialObjects)) or (specialObjects[obj].id < 400) or (specialObjects[obj].name.startswith("he")) or (specialOejcts[obj].deliberateBreak==True):
        print "The object %s is causing a section break."%specialObjects[obj].details
        outputText = outputText.rjust(80)
        open("file%d.txt"%counter,"w").write(outputText)
        outputText=""
        if (obj==len(specialObjects)):
            break 
    outputText+=specialObjects[obj].shortValue()

If I had to choose one, I'd probably pick #2, but this could end up creating some weird edge cases with the 'if' statement if any more complex boolean logic ever needs to be used.

Is there an even cleaner or more "Pythonic" way to accomplish this without code duplication?

Thanks!

like image 828
fdmillion Avatar asked Dec 16 '14 18:12

fdmillion


People also ask

Can you iterate through an object Python?

Iterator in Python is an object that is used to iterate over iterable objects like lists, tuples, dicts, and sets. The iterator object is initialized using the iter() method. It uses the next() method for iteration. __next__(): The next method returns the next value for the iterable.

How do you iterate through an array of objects in Python?

You can use the for in loop to loop through all the elements of an array.

How does Python iterate over a list?

You can loop through the list items by using a while loop. Use the len() function to determine the length of the list, then start at 0 and loop your way through the list items by referring to their indexes. Remember to increase the index by 1 after each iteration.


1 Answers

When I find myself writing code like this, where I'm iterating over a collection and repeating code after the end of the loop, I usually take it as a sign that I'm not iterating over the right thing.

In this case, you're iterating over a list of objects. But what you really want to iterate over, I think, is a list of groups of objects. That's what itertools.groupby is useful for.

Your code has a lot going on, so I'm going to use a simplified example to illustrate how you can get rid of that duplicate code. Say, for (a very contrived) example, that I have a list of things like this:

things = ["apples", "oranges", "pears", None, 
          "potatoes", "tomatoes", None,
          "oatmeal", "eggs"]

This is a list of objects. Looking carefully, there are several groups of objects delimited by None (note that you'd typically represent things as a nested list, but let's ignore that for the purpose of the example). My goal is to print out each group on a separate line:

apples, oranges, pears
potatoes, tomatoes
oatmeal, eggs

Here's the "ugly" way of doing this:

current_things = []
for thing in things:
    if thing is None:
        print ", ".join(current_things)
        current_things = []
    else:
        current_things.append(thing)

print ", ".join(current_things)

As you can see, we have that duplicated print after the loop. Nasty!

Here's the solution using groupby:

from itertools import groupby

for key, group in groupby(things, key=lambda x: x is not None):
    if key:
        print ", ".join(group)

groupby takes an iterable (things) and a key function. It looks at each element of the iterable and applies the key function. When the key changes value, a new group is formed. The result is an iterator that returns (key, group) pairs.

In this case, we'll use the check for None to be our key function. That's why we need the if key:, since there will be groups of size one corresponding to the None elements of our list. We'll just skip those.

As you can see, groupby allows us to iterate over the things we really want to iterate over: groups of objects. This is more natural for our problem, and the code simplifies as a result. It looks like your code is very similar to the above example, except that your key function will check the various properties of the object (obj.id < 400 ...). I'll leave the implementation details up to you...

like image 117
jme Avatar answered Sep 18 '22 04:09

jme