
Python: Regex findall returns a list, why does trying to access the list element [0] return an error?

Tags: python, regex

Taken from the documentation, the following is a snippet showing how the regex method findall works, and confirms that it does return a list.

re.findall(r"\w+ly", text)
['carefully', 'quickly']

However, the following code fragment raises an out-of-bounds error (IndexError: list index out of range) when trying to access the zeroth element of the list returned by findall.

Relevant Code Fragment:

population = re.findall(",([0-9]*),",line)
x = population[0]
thelist.append([city,x])

Why does this happen?

For some more background, here's how that fragment fits into my entire script:

import re

thelist = list()
with open('Raw.txt','r') as f:
    for line in f:
        if line[1].isdigit():
            city = re.findall("\"(.*?)\s*\(",line)
            population = re.findall(",([0-9]*),",line)
            x = population[0]
            thelist.append([city,x])

with open('Sorted.txt','w') as g:
    for item in thelist:
        string = item[0], ', '.join(map(str, item[1:]))
        print string

EDIT: Read comment below for some background on why this happened. My quick fix was:

if population:
    x = population[0]
    thelist.append([city,x])
Louis93 asked Feb 21 '13 00:02



2 Answers

re.findall will return an empty list if there are no matches:

>>> re.findall(r'\w+ly', 'this does not work')
[]
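The same thing happens with the pattern from the question whenever a line contains no run of digits enclosed by two commas. The following is a minimal sketch of that situation. The sample lines are made up for illustration, since the actual contents of Raw.txt are not shown in the question.

```python
import re

# Hypothetical sample lines; the real Raw.txt format is an assumption here.
lines = [
    '"1 Springfield (USA)",116250,2010',   # digits between two commas: one match
    '"2 Shelbyville (USA)",unknown,2010',  # no digits between two commas: no match
]

pattern = r",([0-9]*),"  # same pattern as in the question
results = [re.findall(pattern, line) for line in lines]
print(results)  # [['116250'], []]

# Indexing the empty result reproduces the error from the question.
try:
    results[1][0]
except IndexError as exc:
    print("IndexError:", exc)
```

The second line still passes a check like line[1].isdigit(), so it reaches findall and produces the empty list that population[0] then fails on.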
nneonneo answered Sep 26 '22 05:09



re.findall returns an empty list when there is no match. Indexing an empty list, as in [][0], raises that IndexError.

To handle the no-match case, use something along the lines of:

match = re.findall(...)
if match:
    first = match[0]  # safe: the list has at least one element
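When only the first match is needed, re.search is an alternative to indexing the findall result, because it returns None instead of an empty list when nothing matches. The following is a minimal sketch under a few assumptions: first_population is a hypothetical helper name, the input lines are made up, and the pattern uses + where the question uses *, so that an empty field such as ,, is skipped rather than matched as an empty string.

```python
import re

def first_population(line):
    """Return the first comma-delimited run of digits in line, or None."""
    # first_population is a hypothetical helper, not part of the original script.
    # The pattern uses + (one or more digits) instead of the question's *.
    match = re.search(r",([0-9]+),", line)
    return match.group(1) if match else None

print(first_population('"1 Springfield (USA)",116250,2010'))   # 116250
print(first_population('"2 Shelbyville (USA)",unknown,2010'))  # None
```

The caller can then test the result against None before appending, which plays the same role as the if match: check above.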
wim answered Sep 26 '22 05:09
