Why can't I create a file object from a network datastream?

I'm downloading a tarfile from a REST API, writing it to a local file, then extracting the contents locally. Here's my code:

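import tarfile

# Save the downloaded bytes to disk first...
with open('output.tar.gz', 'wb') as f:
    f.write(o._retrieve_data_stream(p).read())
# ...then reopen the file and extract it.
with open('output.tar.gz', 'rb') as f:
    t = tarfile.open(fileobj=f)
    t.extractall()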

o._retrieve_data_stream(p) retrieves the datastream for the file.

This code works fine, but it seems unnecessarily complicated to me. I think I should be able to read the bytestream directly into the file object that tarfile reads. Something like this:

with open(o._retrieve_data_stream(p).read(), 'rb') as f:
    t = tarfile.open(fileobj=f)
    t.extractall()

I realize that my syntax may be a little shaky there, but I think it communicates what I'm trying to do.

But when I do this, I get an encoding error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

What's going on?

Canadian_Marine asked Sep 16 '25 20:09


1 Answer

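Posting because I solved it while I was writing this. open() expects a file path as its first argument, not the file's contents, so Python tried to interpret the downloaded gzip bytes as a file name and failed to decode them (0x8b is the second byte of the gzip magic number). To get a file object from bytes that are already in memory, I needed to wrap them in a BytesIO object.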

This code works as expected:

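import tarfile
from io import BytesIO

# Wrap the downloaded bytes in an in-memory file object; no temp file needed.
data = o._retrieve_data_stream(p).read()
with tarfile.open(fileobj=BytesIO(data)) as t:
    t.extractall()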
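
For anyone who wants to try this without the REST API, here is a minimal self-contained sketch. It builds a small .tar.gz in memory to stand in for the downloaded data, then extracts it two ways: through BytesIO as above, and through tarfile's stream mode ('r|gz'), which is an alternative not used in my original code and which reads from any object with a read() method without buffering the whole archive first. The names make_tar_gz_bytes, extract_buffered and extract_streaming are hypothetical helpers made up for this example.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def make_tar_gz_bytes() -> bytes:
    """Build a small .tar.gz in memory (a stand-in for the API response)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        payload = b"hello from the archive\n"
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def extract_buffered(data: bytes, dest: str) -> None:
    """Extract from bytes already in memory via a seekable BytesIO."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        tar.extractall(dest)


def extract_streaming(stream, dest: str) -> None:
    """Extract from any object with read(); 'r|gz' reads it sequentially."""
    with tarfile.open(fileobj=stream, mode="r|gz") as tar:
        tar.extractall(dest)


if __name__ == "__main__":
    data = make_tar_gz_bytes()
    with tempfile.TemporaryDirectory() as dest:
        extract_buffered(data, dest)
        print(sorted(p.name for p in Path(dest).iterdir()))
    with tempfile.TemporaryDirectory() as dest:
        extract_streaming(io.BytesIO(data), dest)
        print(sorted(p.name for p in Path(dest).iterdir()))
```

In the original code, the buffered variant corresponds to passing o._retrieve_data_stream(p).read() as data. If the object returned by o._retrieve_data_stream(p) can be handed to tarfile directly, the streaming variant may avoid holding the whole archive in memory, but I have not tested that against this particular API.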
Canadian_Marine answered Sep 19 '25 12:09