How do I read a CSV file that's Gzipped from URL - Python [duplicate]

Tags: python, csv, gzip

I am requesting a csv file that's gzipped.

How do I uncompress that file and convert it to a csv object?

csv_gz_file = get("example.com/filename.csv.gz", headers=csv_headers, timeout=30, stream=True)

reader = csv.reader(csv_gz_file)
for row in reader:
   print row

It throws this error because the data is still compressed:

_csv.Error: line contains NULL byte
Tim Nuwin asked Jun 08 '16 14:06


1 Answer

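import csv
import gzip
import io
import requests

# csv_headers is the header dict from the question.
# The URL is the question's placeholder; a real call needs the scheme (https://...).
web_response = requests.get("example.com/filename.csv.gz", headers=csv_headers,
                            timeout=30, stream=True)
csv_gz_file = web_response.content # Content in bytes from requests.get
                                   # (.text would try to decode the gzip data)

f = io.BytesIO(csv_gz_file)
with gzip.GzipFile(fileobj=f) as fh:
    # Passing a binary file to csv.reader works in PY2; PY3 needs text,
    # so wrap the handle first (UTF-8 is an assumption about the data).
    text_fh = io.TextIOWrapper(fh, encoding="utf-8", newline="")
    reader = csv.reader(text_fh)
    for row in reader:
        print(row)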

This keeps the gzipped data in memory, decompresses it with the gzip module, and hands the decompressed file object to your reader.
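The same steps can be tried without a network call. This is a minimal sketch, assuming Python 3 and UTF-8 data; the helper name read_gzipped_csv and the sample bytes are made up for illustration and stand in for web_response.content:

```python
import csv
import gzip
import io


def read_gzipped_csv(gz_bytes, encoding="utf-8"):
    """Decompress gzipped CSV bytes in memory and return the rows as lists.

    Hypothetical helper; the UTF-8 default is an assumption about the data.
    """
    buffer = io.BytesIO(gz_bytes)  # in-memory container for the .gz data
    with gzip.GzipFile(fileobj=buffer) as binary_fh:
        # csv.reader needs text in Python 3, so decode the binary stream.
        # newline="" lets the csv module handle line endings itself.
        with io.TextIOWrapper(binary_fh, encoding=encoding, newline="") as text_fh:
            return list(csv.reader(text_fh))


# Stand-in for web_response.content: compress a small CSV ourselves.
sample = gzip.compress(b"id,name\n1,alice\n2,bob\n")
rows = read_gzipped_csv(sample)
for row in rows:
    print(row)
```

If the data is not UTF-8, pass the right encoding instead of relying on the default.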

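csv.reader accepts any iterable that yields lines, whether a file handle or a list, so I'd expect this to work. If it doesn't, decompress everything up front and split it into lines (PY3; UTF-8 is an assumption):

csv_content = gzip.decompress(csv_gz_file).decode("utf-8")
reader = csv.reader(csv_content.splitlines())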

And that should do the trick.
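Since the request already passes stream=True, one possible variation is to decompress from a file-like object instead of the full bytes. This is a minimal sketch, assuming Python 3 and UTF-8 data; iter_gzipped_csv_rows is a hypothetical helper, and io.BytesIO stands in for the response body here. With requests, the object to pass would presumably be web_response.raw, which I have not tested against a real server:

```python
import csv
import gzip
import io


def iter_gzipped_csv_rows(fileobj, encoding="utf-8"):
    """Yield CSV rows from a binary file-like object holding gzip data.

    Hypothetical helper; the UTF-8 default is an assumption about the data.
    """
    with gzip.GzipFile(fileobj=fileobj) as binary_fh:
        # Decode lazily so rows are produced as the data is decompressed.
        text_fh = io.TextIOWrapper(binary_fh, encoding=encoding, newline="")
        for row in csv.reader(text_fh):
            yield row


# Stand-in for a streamed response body (assumption: a small UTF-8 CSV).
stream = io.BytesIO(gzip.compress(b"id,name\n1,alice\n2,bob\n"))
streamed_rows = list(iter_gzipped_csv_rows(stream))
print(streamed_rows)
```

Because the helper is a generator, rows can be handled one at a time rather than collected into a list first.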

Torxed answered Sep 22 '22 08:09