I wrote a crawler to fetch information out of an Q&A website. Since not all the fields are presented in a page all the time, I used multiple try-excepts to handle the situation.
def answerContentExtractor( loginSession, questionLinkQueue , answerContentList) :
while True:
URL = questionLinkQueue.get()
try:
response = loginSession.get(URL,timeout = MAX_WAIT_TIME)
raw_data = response.text
#These fields must exist, or something went wrong...
questionId = re.findall(REGEX,raw_data)[0]
answerId = re.findall(REGEX,raw_data)[0]
title = re.findall(REGEX,raw_data)[0]
except requests.exceptions.Timeout ,IndexError:
print >> sys.stderr, URL + " extraction error..."
questionLinkQueue.task_done()
continue
try:
questionInfo = re.findall(REGEX,raw_data)[0]
except IndexError:
questionInfo = ""
try:
answerContent = re.findall(REGEX,raw_data)[0]
except IndexError:
answerContent = ""
result = {
'questionId' : questionId,
'answerId' : answerId,
'title' : title,
'questionInfo' : questionInfo,
'answerContent': answerContent
}
answerContentList.append(result)
questionLinkQueue.task_done()
And this code, sometimes, may or may not, gives the following exception during runtime:
UnboundLocalError: local variable 'IndexError' referenced before assignment
The line number indicates the error occurs at the second except IndexError:
Thanks everyone for your suggestions, Would love to give the marks that you deserve, too bad I can only mark one as the correct answer...
The Python "UnboundLocalError: Local variable referenced before assignment" occurs when we reference a local variable before assigning a value to it in a function. To solve the error, mark the variable as global in the function definition, e.g. global my_var . Here is an example of how the error occurs.
UnboundLocalError can be solved by changing the scope of the variable which is complaining. You need to explicitly declare the variable global. Variable x's scope in function printx is global. You can verify the same by printing the value of x in terminal and it will be 6.
The UnboundLocalError: local variable referenced before assignment error is raised when you try to assign a value to a local variable before it has been declared. You can solve this error by ensuring that a local variable is declared before you assign it a value.
To catch and print an exception that occurred in a code snippet, wrap it in an indented try block, followed by the command "except Exception as e" that catches the exception and saves its error message in string variable e . You can now print the error message with "print(e)" or use it for further processing.
I think the problem is this line:
except requests.exceptions.Timeout ,IndexError
This is equivalent to:
except requests.exceptions.Timeout as IndexError:
So, you're assigning IndexError
to the exception caught by requests.exceptions.Timeout
. Error can be reproduced by this code:
try:
true
except NameError, IndexError:
print IndexError
#name 'true' is not defined
To catch multiple exceptions use a tuple:
except (requests.exceptions.Timeout, IndexError):
And UnboundLocalError
is coming because IndexError
is treated as a local variable by your function, so trying to access its value before actual definition will raise UnboundLocalError
error.
>>> 'IndexError' in answerContentExtractor.func_code.co_varnames
True
So, if this line is not executed at runtime (requests.exceptions.Timeout ,IndexError
) then the IndexError
variable used below it will raise the UnboundLocalError
. A sample code to reproduce the error:
def func():
try:
print
except NameError, IndexError:
pass
try:
[][1]
except IndexError:
pass
func()
#UnboundLocalError: local variable 'IndexError' referenced before assignment
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With