
Getting HTTP header with Python (getting 405)

I'm trying to create a basic link checker in Python.

When using the following code:

import requests

def get_link_response_code(link_to_check):
    # GET downloads the full response body, which is why this is slow
    resp = requests.get(link_to_check)
    return resp.status_code

I always get the right response code, but it takes a considerable amount of time.

But when using this code (requests.get replaced with requests.head):

import requests

def get_link_response_code(link_to_check):
    # HEAD asks for the status line and headers only, so no body is transferred
    resp = requests.head(link_to_check)
    return resp.status_code

It usually works and is very fast, but it sometimes returns HTTP 405 (for a link that is not actually broken).

Why am I getting 405 (method not allowed) errors? What can I do to check for broken links quickly? Thanks.

tomermes asked Jan 04 '15 07:01


2 Answers

According to the specification, 405 means Method Not Allowed: the server does not permit HEAD for this particular resource.

Handle that case and fall back to get():

import requests

def get_link_response_code(link_to_check):
    resp = requests.head(link_to_check)
    # 405 means the server rejects HEAD for this resource, so retry with GET
    if resp.status_code == 405:
        resp = requests.get(link_to_check)
    return resp.status_code
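
The fallback above can be tightened a little for a link checker. The following is only a sketch under a few assumptions: the `session` and `timeout` parameters are additions that are not in the original code, and the 10-second default is arbitrary. It relies on two documented behaviours of requests: HEAD requests do not follow redirects unless `allow_redirects=True` is passed (GET does by default), and `stream=True` postpones downloading the body until it is read.

```python
def get_link_response_code(link_to_check, session=None, timeout=10):
    """Return the status code for a URL, preferring HEAD and falling back to GET."""
    if session is None:
        # Imported here so a stand-in session can be passed in for testing.
        import requests
        session = requests.Session()

    # requests does not follow redirects for HEAD by default, but it does
    # for GET. Enabling it keeps the two code paths comparable.
    resp = session.head(link_to_check, allow_redirects=True, timeout=timeout)
    if resp.status_code == 405:
        # stream=True defers the body download; close() releases the
        # connection without reading the body.
        resp = session.get(link_to_check, stream=True, timeout=timeout)
        resp.close()
    return resp.status_code
```

When checking many links, passing one shared `requests.Session()` lets connections to the same host be reused, which is usually faster than opening a new one per link.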

As a side note, you may not need the additional get() at all, since 405 is a kind of "good" error: the resource exists but is not available via HEAD. You can also check the Allow response header, which the specification requires in a 405 response:

The Allow entity-header field lists the set of methods supported by the resource identified by the Request-URI. The purpose of this field is strictly to inform the recipient of valid methods associated with the resource. An Allow header field MUST be present in a 405 (Method Not Allowed) response.
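
As a rough illustration of reading that header, here is a small sketch. The helper names `allowed_methods` and `exists_but_rejects_head` are hypothetical and not part of requests. Servers do not always send Allow despite the requirement, so an empty result is best treated as "unknown" rather than "nothing allowed".

```python
def allowed_methods(headers):
    """Return the set of HTTP methods listed in an Allow header.

    `headers` is any mapping with .get(); resp.headers from requests works
    and is case-insensitive, while a plain dict must use the key "Allow".
    """
    raw = headers.get("Allow", "")
    return {m.strip().upper() for m in raw.split(",") if m.strip()}


def exists_but_rejects_head(status_code, headers):
    """True when a 405 response advertises GET, i.e. the link is likely fine."""
    return status_code == 405 and "GET" in allowed_methods(headers)
```

With a real response this would be called as `exists_but_rejects_head(resp.status_code, resp.headers)` after the HEAD request.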

alecxe answered Oct 18 '22 01:10


With requests.get you get the information correctly because the GET method means retrieve whatever information (in the form of an entity) is identified by the Request-URI, whereas with requests.head the server does not return a message body in the response.

Please note that the HEAD method is identical to GET except that the server MUST NOT return a message-body in the response.
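
To see that difference without touching a real site, the sketch below starts a throwaway local server using only the standard library and sends it both methods. Everything here (`Handler`, `fetch`, `demo`, the `BODY` constant) is illustrative scaffolding assumed for the example, not code from the question.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"hello"


class Handler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        # GET: headers followed by the message body.
        self._send_headers()
        self.wfile.write(BODY)

    def do_HEAD(self):
        # HEAD: the same headers, but no message body.
        self._send_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(method, port):
    """Send one request and return (status, Content-Length header, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Content-Length"), resp.read()
    finally:
        conn.close()


def demo():
    """Run GET and HEAD against a local server bound to a free port."""
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return fetch("GET", port), fetch("HEAD", port)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the two results side by side: both carry the same status and Content-Length, but only the GET result has a non-empty body. A handler that defines `do_GET` without `do_HEAD` would reject HEAD requests, which is one way a working link ends up failing a HEAD-only check.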

Seroney answered Oct 18 '22 03:10