
Catching ConnectionResetError with Python

I'm building a Python script that searches my database for all URLs and then follows each one to find broken links. The script uses exception handling to log any error it encounters when opening a link. However, it has started hitting an error that I've been completely unable to write an except statement for:

Traceback (most recent call last):
  File "exceptionerror.py", line 97, in <module>
    raw_response = response.read().decode('utf8', errors='ignore')
  File "/usr/lib/python3.4/http/client.py", line 512, in read
    s = self._safe_read(self.length)
  File "/usr/lib/python3.4/http/client.py", line 662, in _safe_read
    chunk = self.fp.read(min(amt, MAXAMOUNT))
  File "/usr/lib/python3.4/socket.py", line 371, in readinto
    return self._sock.recv_into(b)
ConnectionResetError: [Errno 104] Connection reset by peer

I've tried the following:

except SocketError as inst:
    brokenlinksflag = 1
    brokenlinks = articlelinks[j] + ' ' + sys.exc_info()[0] + ', ' + brokenlinks
    continue

And:

except ConnectionResetError as inst:
    brokenlinksflag = 1
    brokenlinks = articlelinks[j] + ' ' + sys.exc_info()[0] + ', ' + brokenlinks
    continue

And even a full generic exception to attempt to catch all errors just so it doesn't kill the whole script:

except:
    print("This link was not caught by defined exceptions: " + articlelinks[j])
    continue

I'm at a complete loss for how to have my script catch this error so that it can continue checking for broken links rather than hard failing. The error is intermittent, so I don't believe the link is actually broken. Even though I've identified the URL, simply skipping it beforehand feels like cheating, since my goal is to handle exceptions properly. Could someone advise me on how to handle this exception?

For reference, here is my full loop:

for j in range(0, len(articlelinks)):
    try:
        req=urllib.request.Request(articlelinks[j], None, {'User-agent' : 'Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0'})
        response = urllib.request.urlopen(req)
    except urllib.request.HTTPError as inst:
        brokenlinksflag = 1
        brokenlinks = articlelinks[j] + ' ' + format(inst) + ', ' + brokenlinks
        continue
    except TimeoutError:
        brokenlinksflag = 1
        brokenlinks = articlelinks[j] + ' Timeout Error, ' + brokenlinks
        continue
    except urllib.error.URLError as inst:
        brokenlinksflag = 1
        brokenlinks = articlelinks[j] + ' ' + format(inst) + ', ' + brokenlinks
        continue
    except SocketError as inst:
        brokenlinksflag = 1
        brokenlinks = articlelinks[j] + ' ' + sys.exc_info()[0] + ', ' + brokenlinks
        continue
    except:
        print("This article killed everything: " + articlelinks[j])
        exit()
David Scott asked Sep 02 '15 23:09



1 Answer

Solved! The issue was that I had been troubleshooting the code that opens the connection in order to handle the ConnectionResetError. However, a more careful examination of the full traceback showed that the error was thrown while processing the response rather than while opening the URL:

  File "exceptionerror.py", line 97, in <module>
    raw_response = response.read().decode('utf8', errors='ignore')

Because the connection was reset only after the URL had already been opened successfully, the error was raised later, inside response.read() (before decode() was ever called). This means my try/except blocks were wrapped around the wrong lines.

The following resolved the issue:

try:
    raw_response = response.read().decode('utf8', errors='ignore')
except ConnectionResetError:
    brokenlinksflag = 1
    brokenlinks = articlelinks[j] + ' ConnectionResetError, ' + brokenlinks
    continue
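
For anyone who wants the open and the read covered by a single try block, here is a minimal sketch. It is not the original script: the names check_link, HEADERS, opener and timeout are hypothetical, and the shortened User-Agent string is a placeholder. It relies on HTTPError being a subclass of URLError, and on both URLError and ConnectionResetError being subclasses of OSError, so the more specific handlers have to come first.

```python
import urllib.error
import urllib.request

# Placeholder header; the original script sends a full Firefox User-agent string.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def check_link(url, opener=urllib.request.urlopen, timeout=10):
    """Return None if the page was opened and fully read, else a short error label.

    `opener` is a hypothetical hook that defaults to urllib.request.urlopen;
    it exists only so the function can be exercised without network access.
    """
    try:
        req = urllib.request.Request(url, None, HEADERS)
        with opener(req, timeout=timeout) as response:
            # The body is only transferred here, so a reset can surface on read().
            response.read().decode("utf8", errors="ignore")
    except urllib.error.HTTPError as inst:
        # Must precede URLError, because HTTPError is a subclass of it.
        return "HTTPError " + str(inst.code)
    except urllib.error.URLError as inst:
        return "URLError " + str(inst.reason)
    except ConnectionResetError:
        return "ConnectionResetError"
    except OSError as inst:
        # Remaining socket-level failures, such as timeouts.
        return type(inst).__name__
    return None
```

In the loop from the question, this could be called as error = check_link(articlelinks[j]), appending to brokenlinks whenever the result is not None. Returning a label instead of using continue inside the helper keeps the logging decision in the loop itself.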
David Scott answered Oct 29 '22 03:10