I just read a bunch of posts on how to handle the StopIteration error in Python, but I had trouble solving my particular example. Basically, I have a CSV file with a lot of prefixes. This file has two columns with headers: Word and Count. Count is the frequency with which that prefix occurs. I also have another file with a list of company names. The prefixes in the prefix file were taken from the first word of each company name in the company file. I'm trying to remove duplicates, and what I want to do right now is:
Ignore the StopIteration error every time it would occur.
In other words, instead of having to write all the commented-out "if" statements below, I just want one line that says: if a StopIteration error is generated, simply ignore the error in some way by treating the problematic "prefix" as if it were a prefix which occurs more than twice in the prefix file, such that we should return the value of the company name without the prefix included. I realize that this ignores the fact that the prefix value in the prefix file differs from the actual prefix of the company name, but usually it has to do with non-American English letters stored differently between Python and Excel, and a few other causes that don't seem particularly systematic, so I'll just remove them manually later.
My code is:
def remove_prefix(prefix, first_name):
    #try:
    #EXCEPTIONS:
    #if '(' in prefix:
    #    prefix = prefix[1:]
    #if ')' in prefix:
    #    prefix = prefix[:-1]
    """
    if prefix == "2-10":
        prefix = "2"
    if prefix == "4:2:2":
        prefix = "4"
    if prefix == "5/0" or prefix == "5/7" or prefix == "58921-":
        prefix = "5"
    """
    #except StopIteration:
    #    pass
    print(first_name, prefix)
    input_fields = ('Word', 'Count')
    reader = csv.DictReader(infile1, fieldnames = input_fields)
    #if the prefix has a frequency of x >=2 in the prefix file, then return first_name without prefix
    #else, return first_Name
    infile1.seek(0)
    #print(infile1.seek(0))
    next(reader)
    first_row = next(reader)
    while prefix != first_row['Word'] and prefix[1:]!= first_row['Word']:
        first_row = next(reader)
        #print(first_name, prefix)
        #print(first_row, first_name, prefix, '\t' + first_row['Word'], prefix[1:])
    if first_row['Count'] >= 2:
        length = len(prefix)
        first_name = first_name[length+1:]
    #print("first name is ", first_name)
    return first_name
I don't think this is caused by what you think it is caused by. A StopIteration exception is raised when the iterator (here, reader) runs out of lines to read.
For example:
def g():
    "generates 1 (once)"
    yield 1

a = g()
next(a)  # is 1
next(a)  # StopIteration exception (nothing left to yield)
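As a side note to the example above, the built-in next also accepts a second argument, which is returned instead of raising StopIteration once the iterator is exhausted. A minimal sketch reusing the same generator g (the None sentinel is an arbitrary choice for illustration):

```python
def g():
    "generates 1 (once)"
    yield 1


a = g()
first = next(a, None)   # the generator yields 1
second = next(a, None)  # exhausted: the default None is returned, no exception
```

Checking the result against the sentinel is then an alternative to catching the exception.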
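To fix this you can wrap the next in a try/except and break out of the loop once the reader is exhausted (a bare pass would leave the while condition true and loop forever):
while prefix != first_row['Word'] and prefix[1:] != first_row['Word']:
    try:
        first_row = next(reader)
    except StopIteration:
        # reader is exhausted: the prefix is not in the file, so stop searching
        break
Note that after the break, first_row still holds the last row that was read, not a row matching the prefix.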
However, as David points out, this is probably not the way you ought to be going about this.
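One possible alternative, sketched under assumptions rather than taken from the question: read the prefix file once into a dictionary and look each prefix up, treating a prefix that is missing from the file as a frequent one, which is the behaviour the question asks for. The helper load_prefix_counts, the counts and threshold parameters, and the sample data are hypothetical; the Word and Count headers, the prefix[1:] fallback, and the length + 1 slice come from the question's code.

```python
import csv
import io


def load_prefix_counts(infile):
    """Read the Word/Count file once into a {word: count} dict."""
    # Assumes the first row of the file holds the headers Word and Count.
    reader = csv.DictReader(infile)
    return {row['Word']: int(row['Count']) for row in reader}


def remove_prefix(prefix, first_name, counts, threshold=2):
    """Strip the prefix when it is frequent or missing from the file."""
    # A missing prefix falls back to the threshold, so it is treated as
    # frequent; prefix[1:] is tried as well, as in the original loop.
    count = counts.get(prefix, counts.get(prefix[1:], threshold))
    if count >= threshold:
        return first_name[len(prefix) + 1:]
    return first_name


# Small in-memory stand-in for the prefix file (made-up data).
sample = io.StringIO("Word,Count\nAcme,3\nZeta,1\n")
counts = load_prefix_counts(sample)

stripped = remove_prefix("Acme", "Acme Widgets", counts)  # frequent prefix
kept = remove_prefix("Zeta", "Zeta Labs", counts)         # rare prefix
missing = remove_prefix("5/0", "5/0 Holdings", counts)    # not in the file
```

Because the file is read only once, there is no reader to exhaust on each call, so StopIteration never comes up. Converting Count with int also avoids comparing a string with the number 2.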