Here is some samplecode to explain:
outputText=""
counter=0
for obj in specialObjects:
if (obj.id < 400) or (obj.name.startswith("he")) or (obj.deliberateBreak==True):
print "The object %s is causing a section break."%obj.details
outputText = outputText.rjust(80)
open("file%d.txt"%counter,"w").write(outputText)
outputText=""
outputText+=obj.shortValue()
# THIS CODE IS DUPLICATED
outputText = outputText.rjust(80)
open("file%d.txt"%counter,"w").write(outputText)
What I need to do is iterate over a list of these special objects and check a few different conditions each time. If any of the conditions are met (as seen here) then I need to take the current output buffer, write it to a file, then start a new output buffer and keep processing.
The problem here is the code duplication. Notice how the two lines (outputText= and open) are duplicated. If I fail to put in the second set of lines, the last set of objects will be processed but their output will never be written.
I can think of two possible solutions to prevent the code duplication. Both of them seem slightly inelegant, so I was wondering if there was an even better way.
1) Wrap the code that would be repeated in a function.
outputText=""
counter=0
for obj in specialObjects:
if (obj.id < 400) or (obj.name.startswith("he")) or (obj.deliberateBreak==True):
print "The object %s is causing a section break."%obj.details
counter = writeData(outputText)
outputText=""
outputText+=obj.shortValue()
writeData(outputText,counter)
def writeData(outputText,counter):
outputText = outputText.rjust(80)
open("file%d.txt"%counter,"w").write(outputText)
return counter+1
2) Use a numeric for loop instead, and count to one higher than the length of the object list; use that value as a flag to mean "write, but now exit":
outputText=""
counter=0
for obj in range(len(specialObjects))+1:
if (obj = len(specialObjects)) or (specialObjects[obj].id < 400) or (specialObjects[obj].name.startswith("he")) or (specialOejcts[obj].deliberateBreak==True):
print "The object %s is causing a section break."%specialObjects[obj].details
outputText = outputText.rjust(80)
open("file%d.txt"%counter,"w").write(outputText)
outputText=""
if (obj==len(specialObjects)):
break
outputText+=specialObjects[obj].shortValue()
If I had to choose one, I'd probably pick #2, but this could end up creating some weird edge cases with the 'if' statement if any more complex boolean logic ever needs to be used.
Is there an even cleaner or more "Pythonic" way to accomplish this without code duplication?
Thanks!
Iterator in Python is an object that is used to iterate over iterable objects like lists, tuples, dicts, and sets. The iterator object is initialized using the iter() method. It uses the next() method for iteration. __next__(): The next method returns the next value for the iterable.
You can use the for in loop to loop through all the elements of an array.
You can loop through the list items by using a while loop. Use the len() function to determine the length of the list, then start at 0 and loop your way through the list items by referring to their indexes. Remember to increase the index by 1 after each iteration.
When I find myself writing code like this, where I'm iterating over a collection and repeating code after the end of the loop, I usually take it as a sign that I'm not iterating over the right thing.
In this case, you're iterating over a list of objects. But what you really want to iterate over, I think, is a list of groups of objects. That's what itertools.groupby
is useful for.
Your code has a lot going on, so I'm going to use a simplified example to illustrate how you can get rid of that duplicate code. Say, for (a very contrived) example, that I have a list of things like this:
things = ["apples", "oranges", "pears", None,
"potatoes", "tomatoes", None,
"oatmeal", "eggs"]
This is a list of objects. Looking carefully, there are several groups of objects delimited by None
(note that you'd typically represent things
as a nested list, but let's ignore that for the purpose of the example). My goal is to print out each group on a separate line:
apples, oranges, pears
potatoes, tomatoes
oatmeal, eggs
Here's the "ugly" way of doing this:
current_things = []
for thing in things:
if thing is None:
print ", ".join(current_things)
current_things = []
else:
current_things.append(thing)
print ", ".join(current_things)
As you can see, we have that duplicated print
after the loop. Nasty!
Here's the solution using groupby
:
from itertools import groupby
for key, group in groupby(things, key=lambda x: x is not None):
if key:
print ", ".join(group)
groupby
takes an iterable (things
) and a key function. It looks at each element of the iterable and applies the key function. When the key changes value, a new group is formed. The result is an iterator that returns (key, group)
pairs.
In this case, we'll use the check for None
to be our key function. That's why we need the if key:
, since there will be groups of size one corresponding to the None
elements of our list. We'll just skip those.
As you can see, groupby
allows us to iterate over the things we really want to iterate over: groups of objects. This is more natural for our problem, and the code simplifies as a result. It looks like your code is very similar to the above example, except that your key function will check the various properties of the object (obj.id < 400 ...
). I'll leave the implementation details up to you...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With