Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I get the file size from a link without downloading it in python?

Tags:

python

get

I have a list of links that I am trying to get the size of to determine how much computational resources each file need. Is it possible to just get the file size with a get request or something similar?

Here is an example of one of the links: https://sra-download.ncbi.nlm.nih.gov/traces/sra46/SRR/005150/SRR5273887

Thanks

like image 471
Joe B Avatar asked Mar 18 '19 16:03

Joe B


People also ask

How do I get the size of a file in Python?

The python os module has stat() function where we can pass the file name as argument. This function returns a tuple structure that contains the file information. We can then get its st_size property to get the file size in bytes.

What is my file size?

Microsoft Windows usersRight-click the file and click Properties. The image below shows that you can determine the size of the file or files you have highlighted from in the file properties window. In this example, the chrome. jpg file is 18.5 KB (19,032 bytes), and that the size on disk is 20.0 KB (20,480 bytes).

How do I change the size of a file in Python?

To create a file of a particular size, just seek to the byte number(size) you want to create the file of and write a byte there.


1 Answers

To do this use the HTTP HEAD method which just grabs the header information for the URL and doesn't download the content like an HTTP GET request does.

$curl -I https://sra-download.ncbi.nlm.nih.gov/traces/sra46/SRR/005150/SRR5273887
HTTP/1.1 200 OK
Server: nginx
Date: Mon, 18 Mar 2019 16:56:35 GMT
Content-Type: application/octet-stream
Content-Length: 578220087
Last-Modified: Tue, 21 Feb 2017 12:13:19 GMT
Connection: keep-alive
Accept-Ranges: bytes

The file size is in the 'Content-Length' header. In Python 3.6:

>>> import urllib
>>> req = urllib.request.Request('https://sra-download.ncbi.nlm.nih.gov/traces/sra46/SRR/005150/SRR5273887', 
                                 method='HEAD')
>>> f = urllib.request.urlopen(req)
>>> f.status
200
>>> f.headers['Content-Length']
'578220087'
like image 58
Steven Graham Avatar answered Sep 28 '22 16:09

Steven Graham