Python: Iterate through object executing code both at certain places and also at end

Tags: python, list, code-duplication

Here is some sample code to explain:

outputText = ""
counter = 0
for obj in specialObjects:
    if (obj.id < 400) or (obj.name.startswith("he")) or obj.deliberateBreak:
        print("The object %s is causing a section break." % obj.details)
        outputText = outputText.rjust(80)
        open("file%d.txt" % counter, "w").write(outputText)
        counter += 1  # move on to the next file so sections don't overwrite each other
        outputText = ""
    outputText += obj.shortValue()
# THIS CODE IS DUPLICATED
outputText = outputText.rjust(80)
open("file%d.txt" % counter, "w").write(outputText)

What I need to do is iterate over a list of these special objects and check a few different conditions each time. If any of the conditions are met (as seen here) then I need to take the current output buffer, write it to a file, then start a new output buffer and keep processing.

The problem here is the code duplication. Notice how the two lines (outputText= and open) are duplicated. If I fail to put in the second set of lines, the last set of objects will be processed but their output will never be written.
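To make the failure mode concrete, here is a minimal, file-free sketch in Python 3 syntax. The helper name split_sections, the flush_at_end flag, and the sample data are made up for illustration; sections are collected in a list instead of being written to files.

```python
def split_sections(values, is_break, flush_at_end):
    """Collect values into sections; is_break(v) starts a new section at v."""
    written = []
    buffer = ""
    for v in values:
        if is_break(v):
            written.append(buffer)  # flush what has accumulated so far
            buffer = ""
        buffer += v
    if flush_at_end:
        written.append(buffer)  # the step that is easy to forget
    return written


def is_break(v):
    # Hypothetical break test: a leading "|" marks the start of a section.
    return v.startswith("|")


values = ["a", "b", "|c", "d", "|e"]
print(split_sections(values, is_break, flush_at_end=True))   # ['ab', '|cd', '|e']
print(split_sections(values, is_break, flush_at_end=False))  # ['ab', '|cd']
```

Without the final flush, the last section ("|e") is built up in the buffer but never emitted.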

I can think of two possible solutions to prevent the code duplication. Both of them seem slightly inelegant, so I was wondering if there was an even better way.

1) Wrap the code that would be repeated in a function.

def writeData(outputText, counter):
    # Write one section to its own file and return the next file number.
    outputText = outputText.rjust(80)
    open("file%d.txt" % counter, "w").write(outputText)
    return counter + 1

outputText = ""
counter = 0
for obj in specialObjects:
    if (obj.id < 400) or (obj.name.startswith("he")) or obj.deliberateBreak:
        print("The object %s is causing a section break." % obj.details)
        counter = writeData(outputText, counter)
        outputText = ""
    outputText += obj.shortValue()
writeData(outputText, counter)

2) Use a numeric for loop instead, and count to one higher than the length of the object list; use that value as a flag to mean "write, but now exit":

outputText = ""
counter = 0
for i in range(len(specialObjects) + 1):
    atEnd = (i == len(specialObjects))  # one past the last index: write, then exit
    if atEnd or (specialObjects[i].id < 400) or (specialObjects[i].name.startswith("he")) or specialObjects[i].deliberateBreak:
        if not atEnd:
            print("The object %s is causing a section break." % specialObjects[i].details)
        outputText = outputText.rjust(80)
        open("file%d.txt" % counter, "w").write(outputText)
        counter += 1
        outputText = ""
        if atEnd:
            break
    outputText += specialObjects[i].shortValue()

If I had to choose one, I'd probably pick #2, but this could end up creating some weird edge cases with the 'if' statement if any more complex boolean logic ever needs to be used.

Is there an even cleaner or more "Pythonic" way to accomplish this without code duplication?

Thanks!


asked Dec 16 '14 18:12

fdmillion

1 Answer

When I find myself writing code like this, where I'm iterating over a collection and repeating code after the end of the loop, I usually take it as a sign that I'm not iterating over the right thing.

In this case, you're iterating over a list of objects. But what you really want to iterate over, I think, is a list of groups of objects. That's what itertools.groupby is useful for.

Your code has a lot going on, so I'm going to use a simplified example to illustrate how you can get rid of that duplicate code. Say, for (a very contrived) example, that I have a list of things like this:

things = ["apples", "oranges", "pears", None, 
          "potatoes", "tomatoes", None,
          "oatmeal", "eggs"]

This is a list of objects. Looking carefully, there are several groups of objects delimited by None (note that you'd typically represent things as a nested list, but let's ignore that for the purpose of the example). My goal is to print out each group on a separate line:

apples, oranges, pears
potatoes, tomatoes
oatmeal, eggs

Here's the "ugly" way of doing this:

current_things = []
for thing in things:
    if thing is None:
        print(", ".join(current_things))
        current_things = []
    else:
        current_things.append(thing)

print(", ".join(current_things))

As you can see, we have that duplicated print after the loop. Nasty!

Here's the solution using groupby:

from itertools import groupby

for key, group in groupby(things, key=lambda x: x is not None):
    if key:
        print(", ".join(group))

groupby takes an iterable (things) and a key function. It looks at each element of the iterable and applies the key function. When the key changes value, a new group is formed. The result is an iterator that returns (key, group) pairs.

In this case, we use the check for None as our key function. That's why we need the if key: test, since there will be groups of size one corresponding to the None elements of our list. We just skip those.
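To see what groupby actually yields for the list above, here is a small sketch in Python 3 syntax that collects the (key, group) pairs into a list. The name pairs is only for illustration.

```python
from itertools import groupby

things = ["apples", "oranges", "pears", None,
          "potatoes", "tomatoes", None,
          "oatmeal", "eggs"]

# Each group is a lazy iterator tied to the underlying iterable,
# so copy it into a list before advancing to the next group.
pairs = [(key, list(group))
         for key, group in groupby(things, key=lambda x: x is not None)]

for key, group in pairs:
    print(key, group)
# True ['apples', 'oranges', 'pears']
# False [None]
# True ['potatoes', 'tomatoes']
# False [None]
# True ['oatmeal', 'eggs']
```

Note that two adjacent None values would land in a single False group, so the delimiter groups have size one only when the delimiters are isolated, as they are here.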

As you can see, groupby allows us to iterate over the things we really want to iterate over: groups of objects. This is more natural for our problem, and the code simplifies as a result. It looks like your code is very similar to the above example, except that your key function will check the various properties of the object (obj.id < 400 ...). I'll leave the implementation details up to you...
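One wrinkle when carrying this over to the question's code: there, the object that triggers a break also starts the next section instead of acting as a pure delimiter, and two breaking objects in a row should land in separate sections. A plain True/False key would merge them. One possible sketch, in Python 3 syntax, is a stateful key that returns a section number. Everything named here (Obj, its short field standing in for shortValue(), starts_section, SectionKey, sections) is hypothetical, and the approach relies on groupby calling the key function once per element, in order.

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical stand-in for the question's objects; only the fields the
# break test needs are modeled, and `short` replaces shortValue().
Obj = namedtuple("Obj", "id name deliberateBreak short")


def starts_section(obj):
    # Same three conditions as the question's if statement.
    return obj.id < 400 or obj.name.startswith("he") or obj.deliberateBreak


class SectionKey(object):
    """Stateful key: returns a section number that increases at each break."""

    def __init__(self, is_break):
        self.is_break = is_break
        self.section = 0

    def __call__(self, obj):
        if self.is_break(obj):
            self.section += 1
        return self.section


def sections(objs):
    # One output string per section, right-justified as in the question.
    return ["".join(o.short for o in group).rjust(80)
            for _, group in groupby(objs, key=SectionKey(starts_section))]


objs = [Obj(500, "alpha", False, "A"), Obj(600, "beta", False, "B"),
        Obj(100, "gamma", False, "C"), Obj(700, "delta", False, "D"),
        Obj(800, "hello", False, "E")]

for counter, text in enumerate(sections(objs)):
    # In the real program this is where file%d.txt would be written.
    print(counter, text.strip())
# 0 AB
# 1 CD
# 2 E
```

Unlike the question's loop, this sketch does not emit an empty leading section when the very first object triggers a break; whether that difference matters depends on what the output files are used for.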


answered Sep 18 '22 04:09

jme
