I'm downloading a tarfile from a REST API, writing it to a local file, then extracting the contents locally. Here's my code:
import tarfile

with open('output.tar.gz', 'wb') as f:
    f.write(o._retrieve_data_stream(p).read())
with open('output.tar.gz', 'rb') as f:
    t = tarfile.open(fileobj=f)
    t.extractall()
o._retrieve_data_stream(p) retrieves the data stream for the file.
This code works fine, but it seems unnecessarily complicated to me. I think I should be able to read the byte stream directly into the file object read by tarfile. Something like this:
with open(o._retrieve_data_stream(p).read(), 'rb') as f:
    t = tarfile.open(fileobj=f)
    t.extractall()
I realize that my syntax may be a little shaky there, but I think it communicates what I'm trying to do.
But when I do this, I get an encoding error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
What's going on?
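Posting because I solved it while I was writing this. It turns out I needed to wrap the downloaded bytes in a BytesIO object: open() expects a path to a file, not the file's contents, so the raw gzip bytes were being treated as a file name.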
This code works as expected:
import tarfile
from io import BytesIO

# Wrap the downloaded bytes in an in-memory file object that tarfile can read
t = tarfile.open(fileobj=BytesIO(o._retrieve_data_stream(p).read()))
t.extractall()
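As a possible variation, tarfile can also read straight from a stream without buffering the whole download, using the sequential mode 'r|gz'. Below is a minimal, self-contained sketch. It assumes the object returned by o._retrieve_data_stream(p) is a file-like object with a read() method (the code above suggests so, but I haven't verified whether it is seekable). The helpers make_tar_gz_bytes and extract_from_stream are hypothetical names made up for this illustration, and an in-memory archive stands in for the API response.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def make_tar_gz_bytes(files):
    """Build a small .tar.gz archive in memory (stand-in for the API response)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def extract_from_stream(stream, dest):
    """Extract a gzip-compressed tar from any object that has a read() method."""
    # "r|gz" reads the archive sequentially, so the stream does not need seek()
    with tarfile.open(fileobj=stream, mode="r|gz") as tar:
        tar.extractall(dest)


# Hypothetical usage: BytesIO plays the role of o._retrieve_data_stream(p)
payload = make_tar_gz_bytes({"hello.txt": b"hello world\n"})
dest = tempfile.mkdtemp()
extract_from_stream(io.BytesIO(payload), dest)
extracted_text = (Path(dest) / "hello.txt").read_text()
```

With the real API object, the call would presumably look like extract_from_stream(o._retrieve_data_stream(p), '.'). The BytesIO approach is simpler and fine for small archives; the streaming mode mainly helps when the archive is too large to hold in memory.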