I am downloading a file using requests:
import requests

req = requests.get(url, stream=True)
with open(local_filename, 'wb') as f:
    for chunk in req.iter_content(chunk_size=1024):
        if chunk:
            f.write(chunk)
            f.flush()
The problem with gzip files is that requests decodes them automatically, so I get the unpacked file on disk, while I need the original file.
Is there a way to tell requests not to do this?
Just use zcat to see the content without extraction. From the manual: zcat is identical to gunzip -c. (On some systems, zcat may be installed as gzcat to preserve the original link to compress.)
import requests

r = requests.get(url, stream=True)
with open(local_filename, 'wb') as f:
    for chunk in r.raw.stream(1024, decode_content=False):
        if chunk:
            f.write(chunk)
This way, you avoid the automatic decompression of the gzip-encoded response and save it to the file exactly as it is received from the web server, chunk by chunk.
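To confirm that the file on disk is still compressed, one option is to look at its first two bytes, since gzip streams start with the magic number 1f 8b. This is only a minimal sketch: the helper name is_gzip_file is made up for illustration and is not part of requests.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzip_file(path):
    """Return True if the file at `path` starts with the gzip magic number.

    Hypothetical helper for illustration; it checks only the header,
    not whether the whole file is a valid gzip archive.
    """
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC
```

Calling is_gzip_file(local_filename) after the download should return True if the raw bytes were written, and False if the content was decoded on the way.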
As discussed in the comments above, this seems to have solved the issue:
From the docs for the requests module:
Requests automatically decompresses gzip-encoded responses ... You can get direct access to the raw response (and even the socket), if needed as well.
Searching the docs for "raw responses" yields requests.Response.raw, which gives a file-like representation of the raw response stream.
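Because Response.raw is file-like, it can also be copied to disk with shutil.copyfileobj instead of a manual chunk loop. The sketch below assumes the request is made with stream=True and that the raw stream is left undecoded, which is how requests normally sets it up. The function names save_stream and download_raw are invented for this example.

```python
import shutil


def save_stream(raw, path, chunk_size=64 * 1024):
    """Copy a file-like object to `path` without altering its bytes."""
    with open(path, "wb") as f:
        shutil.copyfileobj(raw, f, chunk_size)


def download_raw(url, path):
    """Download `url` and store the body as sent by the server (sketch)."""
    import requests  # imported here so save_stream works without requests

    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        # r.raw is the underlying urllib3 response; reading from it
        # bypasses the content decoding done by r.content / iter_content.
        save_stream(r.raw, path)
```

With this, download_raw(url, local_filename) would be the whole download step. Keeping the copy in its own function also makes it easy to try out with any in-memory stream.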