Reading a gzipped CSV file in Python 3

Tags: python, csv, gzip

I'm having problems reading a gzipped CSV file with the gzip and csv modules. Here's what I have:

import gzip
import csv
import json

f = gzip.open(filename)
csvobj = csv.reader(f, delimiter=',', quotechar="'")
for line in csvobj:
    ts = line[0]
    data_json = json.loads(line[1])

but this throws an exception:

  File "C:\Users\yaronol\workspace\raw_data_from_s3\s3_data_parser.py", line 64, in download_from_S3
    self.parse_dump_file(filename)
  File "C:\Users\yaronol\workspace\raw_data_from_s3\s3_data_parser.py", line 30, in parse_dump_file
    for line in csvobj:
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

Gunzipping the file first and opening the result with csv works fine. I've also tried decoding the file text to convert from bytes to str...

What am I missing here?

WeaselFox asked May 19 '15 11:05

1 Answer

The default mode for gzip.open is rb (binary), so the file object yields bytes. If you want to work with str, you have to ask for text mode explicitly:

f = gzip.open(filename, mode="rt")
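To make the difference between the two modes visible, here is a minimal sketch. The file name sample.csv.gz, its contents, and the utf-8 encoding are assumptions made up for the demonstration; they are not from the question.

```python
import gzip
import os
import tempfile

# Hypothetical sample file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "sample.csv.gz")
with gzip.open(path, mode="wt", encoding="utf-8", newline="") as out:
    out.write("1,'a'\n")

# Default mode ("rb"): the file object yields bytes, which csv.reader rejects.
with gzip.open(path) as f:
    binary_line = f.readline()

# Text mode ("rt"): the decompressed bytes are decoded, so we get str.
with gzip.open(path, mode="rt", encoding="utf-8") as f:
    text_line = f.readline()

print(type(binary_line).__name__, type(text_line).__name__)  # prints: bytes str
```

The bytes object from the first read is what triggers the "iterator should return strings, not bytes" error in csv.reader.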

Off topic: it is good practice to do I/O inside a with block, so the file is closed even if an exception is raised:

with gzip.open(filename, mode="rt") as f:
    csvobj = csv.reader(f, delimiter=',', quotechar="'")  # iterate over csvobj inside this block
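Putting the pieces together, the loop from the question might look like the sketch below. The helper name read_gzipped_csv, the sample file dump.csv.gz, its single row, and the utf-8 encoding are all hypothetical and only there so the example runs on its own. Passing newline="" follows the csv module's recommendation for file objects handed to csv.reader.

```python
import csv
import gzip
import json
import os
import tempfile


def read_gzipped_csv(filename):
    """Return (timestamp, parsed_json) pairs from a gzipped CSV file.

    Hypothetical helper; assumes two columns and utf-8 text.
    """
    rows = []
    # Text mode gives csv.reader the str lines it expects.
    with gzip.open(filename, mode="rt", encoding="utf-8", newline="") as f:
        for line in csv.reader(f, delimiter=",", quotechar="'"):
            rows.append((line[0], json.loads(line[1])))
    return rows


# Build a small sample file (made-up data) to read back.
path = os.path.join(tempfile.mkdtemp(), "dump.csv.gz")
with gzip.open(path, mode="wt", encoding="utf-8", newline="") as out:
    writer = csv.writer(out, delimiter=",", quotechar="'")
    writer.writerow(["2015-05-19T11:05:00", json.dumps({"a": 1, "b": 2})])

rows = read_gzipped_csv(path)
print(rows)
```

The reading loop stays inside the with block on purpose: csv.reader pulls lines lazily, so the file has to remain open while you iterate.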
pacholik answered Oct 24 '22 13:10