According to this answer I can use the Range header to download only a part of an html page, but with this code:
import requests
url = "http://stackoverflow.com"
headers = {"Range": "bytes=0-100"} # first 100 bytes
r = requests.get(url, headers=headers)
print r.text
I get the whole html page. Why isn't it working?
An HTTP range request asks the server to send only a portion of an HTTP message back to a client. Range requests are useful for clients like media players that support random access, data tools that know they need only part of a large file, and download managers that let the user pause and resume the download.
The Content-Range response HTTP header indicates where in a full body message a partial message belongs.
The Accept-Ranges HTTP response header is a marker used by the server to advertise its support for partial requests from the client for file downloads. The value of this field indicates the unit that can be used to define a range.
Byte-range requests occur when a client asks the server for only a portion of the requested file. The purpose of this is essentially to conserve bandwidth usage by avoiding the need to download a complete file when all that is required is a small section.
If the webserver does not support Range
header, it will be ignored.
Try with other server that support the header, for example tools.ietf.org
:
import requests
url = "http://tools.ietf.org/rfc/rfc2822.txt"
headers = {"Range": "bytes=0-100"}
r = requests.get(url, headers=headers)
assert len(r.text) <= 101 # not exactly 101, because r.text does not include header
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With