
Check if a large file exists without downloading it

Not sure if this is possible, but I would like to check the status code of an HTTP request to a large file without downloading it; I just want to check if it's present on the server.

Is it possible to do this with Python's requests? I already know how to check the status code but I can only do that after the file has been downloaded.

I guess what I'm asking is: can you issue a GET request and stop it as soon as you've received the response headers?

Juicy asked Jan 09 '17 10:01


2 Answers

Use requests.head(). It sends a HEAD request, so the server returns only the response headers and no message body; the status code and all of the header information are still available.

The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.

For example:

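import requests

url = 'http://lmsotfy.com/so.png'
r = requests.head(url)  # HEAD request: the response has headers but no body
r.headers               # in an interactive session, this displays the headers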

Output:

{'Content-Type': 'image/png', 'Content-Length': '6347', 'ETag': '"18cb-4f7c2f94011da"', 'Accept-Ranges': 'bytes', 'Date': 'Mon, 09 Jan 2017 11:23:53 GMT', 'Last-Modified': 'Thu, 24 Apr 2014 05:18:04 GMT', 'Server': 'Apache', 'Keep-Alive': 'timeout=2, max=100', 'Connection': 'Keep-Alive'}

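This code does not download the picture; it returns only the response headers, which include the size, type and modification date. If the picture does not exist, the server answers with an error status such as 404 (available as r.status_code), and the headers describe the error response rather than the file.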
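To turn the header check into a yes/no answer, the status code can be wrapped in a small helper. This is only a minimal sketch: the function name remote_file_exists and the 10-second timeout are illustrative assumptions, not part of the original answer. Note that requests.head() does not follow redirects unless asked to, so the sketch opts in.

```python
import requests


def remote_file_exists(url, timeout=10):
    """Return True if a HEAD request for `url` ends with a status below 400.

    Hypothetical helper for illustration; the name and the default
    timeout are assumptions.
    """
    try:
        # requests.head() does not follow redirects by default, so opt in.
        response = requests.head(url, allow_redirects=True, timeout=timeout)
    except requests.RequestException:
        # DNS failure, refused connection, timeout, and similar errors.
        return False
    # Response.ok is True when the status code is below 400.
    return response.ok
```

Calling remote_file_exists(url) with the URL from the example above would then give a boolean without transferring the image body. Treating network errors as "not present" is a design choice; re-raising may be preferable when the caller needs to tell "missing" apart from "unreachable".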

宏杰李 answered Oct 11 '22 22:10


Normally you use the HEAD method instead of GET for this sort of thing. If you are querying an arbitrary server on the web, be prepared for it to return inconsistent results for HEAD (this is typical of servers that require registration). In such cases you may want to send a GET request with a Range header so that only a small number of bytes is downloaded.
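The Range fallback could look roughly like the sketch below. The function name probe_with_range and the timeout are assumptions made for illustration. The request asks for the first byte only and uses stream=True, so that requests does not read the body up front even if the server ignores the Range header and answers with a plain 200.

```python
import requests


def probe_with_range(url, timeout=10):
    """Return the status code of a GET that asks for only the first byte.

    Hypothetical helper for illustration; the name and the default
    timeout are assumptions.
    """
    headers = {"Range": "bytes=0-0"}
    # stream=True defers reading the body until it is accessed, so a server
    # that ignores Range and sends the whole file is not downloaded in full.
    with requests.get(url, headers=headers, stream=True, timeout=timeout) as response:
        # Leaving the `with` block closes the connection without reading the body.
        return response.status_code
```

A server that honours the header normally replies with 206 (Partial Content), one that ignores it replies with 200, and a missing file still produces an error status such as 404.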

gudok answered Oct 11 '22 23:10