I received a CSV file exported from a MySQL database (I think the encoding is latin1, since the language is Spanish). Unfortunately the encoding is wrong and I cannot process it at all. If I run file on it, I get:
$ file -I file.csv
file.csv: text/plain; charset=unknown-8bit
I have tried to read the file in Python and convert it to UTF-8 like this:
r.decode('latin-1').encode("utf-8")
or using mysql_latin1_codec:
r.decode('mysql_latin1').encode('UTF-8')
I am trying to transform the data into JSON objects. The error comes when I save the file:
UnicodeEncodeError: 'ascii' codec can't encode characters in position
Do you know how I can convert it to normal UTF-8 characters? Or how I can convert the data to valid JSON? Thanks!!
I got really good results by using a pandas DataFrame (pandas is included in the Anaconda distribution from Continuum Analytics).
You could do something like:
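import pandas as pd
# con must be a SQLAlchemy engine, a database URI string, or a DBAPI connection,
# built from your own user, password, host and database.
con = 'Your database connection credentials user, password, host, database to use'
# The keyword argument is `con`, not `conn`.
data = pd.read_sql_query('SELECT * FROM YOUR_TABLE', con=con)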
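Note that read_sql_query needs an actual connection (or a SQLAlchemy engine or URI) rather than a free-form credentials string. Here is a minimal sketch of the call, assuming an in-memory SQLite database as a stand-in for the MySQL server so that it runs without any setup. The table name personas and its contents are made up for illustration; for MySQL you would pass your own connection or engine instead.

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for the MySQL server (assumption for this sketch).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE personas (nombre TEXT, ciudad TEXT)")  # hypothetical table
con.execute("INSERT INTO personas VALUES ('José', 'Logroño')")
con.commit()

# The connection is passed through the `con` parameter.
data = pd.read_sql_query("SELECT * FROM personas", con=con)
con.close()
```

The resulting DataFrame holds the text as Unicode strings, so the accented characters survive the later export.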
Then you could do:
data.to_csv('path_with_file_name')
or to convert to JSON:
data.to_json(orient='records')
or, if you prefer to customize your JSON format, see the documentation here: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_json.html
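If you only have the exported CSV and no access to the database, something along these lines might also work with just the standard library. This is a sketch that assumes the file really is Latin-1; the function name latin1_csv_to_json and the sample rows are made up for illustration. Keep in mind that decoding as Latin-1 never raises an error, so a wrong guess about the encoding shows up as garbled characters rather than an exception.

```python
import csv
import io
import json


def latin1_csv_to_json(raw_bytes):
    """Decode Latin-1 CSV bytes and return a UTF-8 encoded JSON array of row objects."""
    text = raw_bytes.decode("latin-1")
    rows = list(csv.DictReader(io.StringIO(text, newline="")))
    # ensure_ascii=False keeps characters such as ñ as-is instead of \uXXXX escapes.
    return json.dumps(rows, ensure_ascii=False).encode("utf-8")


# Hypothetical sample standing in for the bytes read from file.csv.
sample = "nombre,ciudad\nJosé,Logroño\n".encode("latin-1")
utf8_json = latin1_csv_to_json(sample)
```

Writing the returned bytes to a file opened in binary mode ('wb') avoids any implicit ASCII encoding at save time, which is where the UnicodeEncodeError in the question appears.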