The following snippet, taken from the documentation, shows how the regex method findall works and confirms that it returns a list.
re.findall(r"\w+ly", text)
['carefully', 'quickly']
However, the following code fragment raises an out-of-bounds error (IndexError: list index out of range) when trying to access the zeroth element of the list returned by findall.
Relevant Code Fragment:
population = re.findall(",([0-9]*),",line)
x = population[0]
thelist.append([city,x])
Why does this happen?
For some more background, here's how that fragment fits into my entire script:
import re
thelist = list()
with open('Raw.txt','r') as f:
    for line in f:
        if line[1].isdigit():
            city = re.findall("\"(.*?)\s*\(",line)
            population = re.findall(",([0-9]*),",line)
            x = population[0]
            thelist.append([city,x])
with open('Sorted.txt','w') as g:
    for item in thelist:
        string = item[0], ', '.join(map(str, item[1:]))
        print string
EDIT: Read comment below for some background on why this happened. My quick fix was:
if population:
    x = population[0]
    thelist.append([city,x])
re.findall will return an empty list if there are no matches:
>>> re.findall(r'\w+ly', 'this does not work')
[]
re.findall can return an empty list when there is no match. If you try to access [][0], you will see that IndexError.
To account for the no-match case, you should use something along the lines of:
match = re.findall(...)
if match:
    # potato potato
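As a rough sketch of that guard applied to the pattern from the question: the sample lines and the helper name first_population below are made up for illustration, since the actual contents of Raw.txt aren't shown.

```python
import re

# Hypothetical sample lines; the real Raw.txt is not shown in the question.
sample_lines = [
    '"Springfield (IL)",116250,other',
    '"Nowhere (XX)",n/a,other',
]

def first_population(line):
    """Return the first comma-delimited digit run, or None if findall finds nothing."""
    matches = re.findall(r",([0-9]*),", line)
    # An empty list is falsy, so this never indexes into [].
    return matches[0] if matches else None

results = [first_population(line) for line in sample_lines]
print(results)
```

The second line has no run of digits between commas, so findall gives back [] and the helper returns None instead of raising IndexError. Whether to skip such lines or record a placeholder depends on what you want in the output.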