Python CSV DictReader with UTF-8 data

AFAIK, the Python (v2.6) csv module can't handle unicode data by default, correct? The Python docs have an example of how to read from a UTF-8 encoded file, but that example only returns the CSV rows as lists. I'd like to access the row columns by name, as csv.DictReader does, but with a UTF-8 encoded CSV input file.

Can anyone tell me how to do this efficiently? I will have to process CSV files that are hundreds of megabytes in size.

528

asked Feb 15 '11 14:02

LMatter

1 Answer

I came up with an answer myself:

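    import csv

    def UnicodeDictReader(utf8_data, **kwargs):
        # Python 2 only: csv yields byte strings, so decode every key and value from UTF-8.
        csv_reader = csv.DictReader(utf8_data, **kwargs)
        for row in csv_reader:
            # dict() over a generator also runs on Python 2.6, which has no dict comprehensions.
            yield dict((unicode(key, 'utf-8'), unicode(value, 'utf-8'))
                       for key, value in row.iteritems())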

Note: This has been updated so that keys are decoded too, per the suggestion in the comments.
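For anyone reading this on Python 3: the csv module there works on text rather than bytes, so the decoding can be left to the file object. Below is a minimal sketch under that assumption, not part of the original answer. The helper name iter_utf8_dict_rows and the path argument are made up for illustration.

```python
import csv


def iter_utf8_dict_rows(path, **kwargs):
    """Yield each CSV row as a dict of str values, one row at a time.

    Hypothetical helper; assumes Python 3 and a UTF-8 encoded input file.
    """
    # newline='' leaves line-ending handling to the csv module,
    # which is what the csv documentation recommends for file objects.
    with open(path, mode='r', encoding='utf-8', newline='') as handle:
        # Extra keyword arguments (delimiter, fieldnames, ...) go to DictReader.
        for row in csv.DictReader(handle, **kwargs):
            yield row
```

Because this is a generator over csv.DictReader, rows are read lazily and only one row is held in memory at a time, which should matter for inputs in the hundreds of megabytes. Typical use would be a plain loop such as `for row in iter_utf8_dict_rows('data.csv'): ...`, where 'data.csv' is a placeholder file name.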

101

answered Sep 23 '22 09:09

LMatter

Tags: python, csv, unicode
