Unicode Encode Error when writing pandas df to csv

Tags:

python

pandas

python-unicode

export-to-csv

I cleaned 400 Excel files, read them into Python using pandas, and appended all the raw data into one big DataFrame.

Then, when I try to export it to a CSV file:

df.to_csv("path", header=True, index=False)

I get this error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xc7' in position 20: ordinal not in range(128)

Can someone explain what this means and suggest a way to fix it?

Thanks

222

asked Jul 10 '15 02:07

collarblind

2 Answers

You have unicode values in your DataFrame. Files store bytes, which means every unicode value has to be encoded into bytes before it can be stored in a file. You have to specify an encoding, such as utf-8. For example,

df.to_csv('path', header=True, index=False, encoding='utf-8')

If you don't specify an encoding, the encoding used by df.to_csv defaults to ascii in Python 2 and to utf-8 in Python 3.
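To see what the error message itself means, here is a minimal sketch in plain Python 3 with no pandas involved. The sample string is a made-up cell value, not data from the question; it just contains the same character (u'\xc7', i.e. Ç) that the traceback complains about.

```python
# Hypothetical cell value containing U+00C7 (the character from the traceback).
text = "\u00c7a va"

# ASCII only covers code points 0-127, so U+00C7 (199) cannot be encoded.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as exc:
    ascii_ok = False
    reason = exc.reason  # the same message that appears in the traceback

# UTF-8 can represent every code point; U+00C7 becomes two bytes.
utf8_bytes = text.encode("utf-8")

# Decoding with the same encoding recovers the original text unchanged.
round_trip = utf8_bytes.decode("utf-8")
```

Passing encoding='utf-8' to df.to_csv asks pandas to do the second kind of encoding when it writes the file, which is why the error goes away.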

182

answered Sep 30 '22 16:09

unutbu

Adding an answer to help myself google it later:

One trick that helped me is to encode a problematic Series with unicode-escape first, then decode the resulting bytes as utf-8. Like:

df['crumbs'] = df['crumbs'].map(lambda x: x.encode('unicode-escape').decode('utf-8'))

This gets the DataFrame to print correctly too.
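A caveat worth knowing, sketched on a single made-up string rather than a real column: the unicode-escape round trip does not keep the original characters. It swaps each non-ASCII character for a literal backslash escape, so the result is ASCII-safe but the stored text is different.

```python
# Hypothetical value standing in for one entry of the column.
value = "\u00c7a va"

# Same transformation as the lambda above, applied to a single string.
escaped = value.encode("unicode-escape").decode("utf-8")

# The result contains only ASCII characters: a literal backslash, 'x', 'c', '7', ...
only_ascii = all(ord(ch) < 128 for ch in escaped)

# The original text can be recovered later by reversing the two steps.
restored = escaped.encode("ascii").decode("unicode-escape")
```

So a CSV written this way will contain the escape sequence instead of Ç, and whoever reads it back has to undo the escaping. If the goal is to keep the real characters, specifying an encoding when writing is probably the simpler route.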

answered Sep 30 '22 16:09

tangfucius

