How to use python csv.DictReader with a binary file? (For a babel custom extraction method)

I'm trying to write a custom extraction method for Babel that extracts strings from a specific column of a CSV file. I followed the documentation here.

Here is my extraction method code:

def extract_csv(fileobj, keywords, comment_tags, options):
    import csv
    reader = csv.DictReader(fileobj, delimiter=',')
    for row in reader:
        if row and row['caption'] != '':
            yield (reader.line_num, '', row['caption'], '')

When I try to run the extraction, I get this error:

  File "/Users/tiagosilva/repos/naltio/csv_extractor.py", line 18, in extract_csv
    for row in reader:
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/csv.py", line 111, in __next__
    self.fieldnames
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/csv.py", line 98, in fieldnames
    self._fieldnames = next(self.reader)
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

It seems the fileobj that is passed to the function was opened in binary mode.
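
The mismatch can be reproduced without Babel. This is a minimal sketch, assuming an in-memory io.BytesIO stands in for the binary file object that gets passed to the function; the sample data is made up for illustration:

```python
import csv
import io

# Hypothetical sample data; BytesIO behaves like a file opened in 'rb' mode.
binary_file = io.BytesIO(b"id,caption\n1,Hello\n")

reader = csv.DictReader(binary_file, delimiter=',')
try:
    # Reading the header row already fails, because iterating the
    # binary file yields bytes lines instead of str lines.
    next(reader)
    error_message = None
except csv.Error as exc:
    error_message = str(exc)

print(error_message)
```

The exact wording of the message may differ between Python versions, but it should report that the iterator returned bytes rather than strings.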

How can I make this work? I can think of two possible solutions, but I don't know how to code either of them:

1) Is there a way to use a binary file with DictReader?

2) Is there a way to tell Babel to open the file in text mode?

I'm open to other solutions not listed here.

asked Jul 03 '18 10:07 by tiagosilva


1 Answer

I actually found a way to do it!

It's solution 1: a way to handle the binary file. Wrap the binary file object in an io.TextIOWrapper, which decodes the bytes to text, and pass the wrapper to DictReader.

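import csv
import io


def extract_csv(fileobj, keywords, comment_tags, options):
    # fileobj arrives in binary mode; decode it so csv receives str lines.
    # newline='' lets the csv module handle line endings itself.
    with io.TextIOWrapper(fileobj, encoding='utf-8', newline='') as text_file:
        reader = csv.DictReader(text_file, delimiter=',')

        for row in reader:
            # Skip rows whose caption is missing or empty.
            if row.get('caption'):
                yield (reader.line_num, '', row['caption'], '')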
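
To check the approach outside of Babel, the extractor can be called directly on an in-memory binary stream. This is only a sketch: the sample CSV content is hypothetical, and the empty values passed for keywords, comment_tags and options are placeholders rather than what Babel would actually supply.

```python
import csv
import io


def extract_csv(fileobj, keywords, comment_tags, options):
    """Yield (lineno, funcname, message, comments) for each non-empty caption."""
    # Decode the binary stream; newline='' leaves line endings to the csv module.
    text_file = io.TextIOWrapper(fileobj, encoding='utf-8', newline='')
    reader = csv.DictReader(text_file, delimiter=',')
    for row in reader:
        caption = row.get('caption')
        if caption:
            yield (reader.line_num, '', caption, '')


# Hypothetical sample data standing in for a file opened in binary mode.
sample = io.BytesIO("id,caption\n1,Hello\n2,\n3,Olá\n".encode('utf-8'))

# Placeholder arguments; this extractor ignores all three of them.
results = list(extract_csv(sample, [], [], {}))
for lineno, _funcname, message, _comments in results:
    print(lineno, message)
```

The row with an empty caption is skipped, and the reported line numbers count the header as line 1, so the two captions come out with line numbers 2 and 4.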

answered Sep 29 '22 17:09 by tiagosilva