
How to open a Unicode text file inside a zip?

I tried

with zipfile.ZipFile("5.csv.zip", "r") as zfile:
    for name in zfile.namelist():
        with zfile.open(name, 'rU') as readFile:
            line = readFile.readline()
            print(line)
            split = line.split('\t')

It prints:

b'$0.0\t1822\t1\t1\t1\n'
Traceback (most recent call last)
File "zip.py", line 6
    split = line.split('\t')
TypeError: Type str doesn't support the buffer API

How can I open the text file as Unicode text (str) instead of as bytes (b'...')?

Jader Dias asked Dec 16 '13 00:12


1 Answer

To convert a byte stream into a Unicode text stream, you can use io.TextIOWrapper():

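import io
import zipfile

encoding = 'utf-8'  # the encoding of the text files inside the archive
with zipfile.ZipFile("5.csv.zip") as zfile:
    for name in zfile.namelist():
        with zfile.open(name) as readfile:
            # wrap the binary stream so that iteration yields str lines
            for line in io.TextIOWrapper(readfile, encoding):
                print(repr(line))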

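Note: TextIOWrapper() uses universal newlines mode by default, so '\n', '\r' and '\r\n' line endings are all returned as '\n'. The 'rU' mode of zfile.open() has been deprecated since Python 3.4 and was removed in Python 3.6.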
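
The newline handling mentioned in the note can be checked with a small sketch. The sample bytes and the helper name read_lines are made up for illustration, and an io.BytesIO object stands in for the stream that zfile.open() would return:

```python
import io

# Hypothetical sample data with three different line endings: \r\n, \r and \n.
raw = b"a\tb\r\nc\td\re\tf\n"

def read_lines(data, newline=None):
    """Decode bytes through TextIOWrapper and return the lines as a list of str."""
    with io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", newline=newline) as text:
        return list(text)

# Default (newline=None): every line ending is translated to '\n'.
translated = read_lines(raw)
# newline='': lines are still split on all three endings, but nothing is translated.
untranslated = read_lines(raw, newline="")

print(translated)
print(untranslated)
```

Passing newline='' is the usual choice when the lines are handed on to the csv module, which does its own newline handling.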

Wrapping the whole stream, rather than decoding it piece by piece, avoids the issues with multibyte encodings described in @Peter DeGlopper's answer.
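
For something closer to the question's goal of splitting each line on tabs, here is a minimal self-contained sketch. It assumes the members are UTF-8 encoded; the helper name read_tsv_rows, the in-memory archive, and the second sample row (which contains a multibyte character) are illustrative and not part of the original question:

```python
import io
import zipfile

def read_tsv_rows(zip_source, encoding="utf-8"):
    """Yield (member_name, fields) for each line of each member in the archive."""
    with zipfile.ZipFile(zip_source) as zfile:
        for name in zfile.namelist():
            with zfile.open(name) as raw:
                # TextIOWrapper decodes the bytes, so each line is a str
                # and str.split('\t') works without a TypeError.
                for line in io.TextIOWrapper(raw, encoding=encoding):
                    yield name, line.rstrip("\n").split("\t")

# Build a small archive in memory so the example does not need 5.csv.zip on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zfile:
    zfile.writestr("5.csv", "$0.0\t1822\t1\t1\t1\nprix €\t7\n".encode("utf-8"))
buffer.seek(0)

rows = list(read_tsv_rows(buffer))
print(rows)
```

To read a real archive, pass its path instead of the buffer, for example read_tsv_rows("5.csv.zip").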

jfs answered Sep 19 '22 16:09
