Pandas convert dataframe to Utf-8

I have a df that consists of 100 rows and 24 columns, all of string type. When I try to append the DataFrame to KDB, I get the following error:

UnicodeEncodeError: 'ascii' codec can't encode character '\xd3' in position 9: ordinal not in range(128)

Here is an example of the first row in my df.

                        AnnouncementDate AuctionDate    BBT  \
_id
00000067   2012-12-11T00:00:00.000+00:00         NaN   FHLB

           CouponDividendRate DaysToSettle  \
_id
00000067                 0.61            1

                                        Description  \
_id
00000067                         FHLB 0.61 12/28/16

                     FirstSettlementDate           ISN IsAgency IsWhenIssued  \
_id
00000067   2012-12-28T00:00:00.000+00:00  US313381K796     True        False


           ...  OnTheRunTreasury OperationalIndicator  \
_id        ...
00000067   ...               NaN                False


          OriginalAmountOfPrincipal OriginalMaturityDate  \
_id
00000067                 13000000.0                  NaN


          PrincipalAmountOutstanding       SCSP       SMCP  \
_id
00000067                         0.0  313381K79   76000000

           SecurityTypeLevel1 SecurityTypeLevel2   TCK
_id
00000067          US-DOMESTIC                NaN   NaN

Is there an easy way to convert my df to UTF-8?

Possibly something like df = df.encode('utf-8')

Thanks

Chris Johnson asked Mar 08 '23 11:03


1 Answer

It depends on how you're outputting the data. If you're simply writing CSV files that you then import into KDB, you can specify the encoding when you write the file:

df.to_csv('df_output.csv', encoding='utf-8')

Alternatively, you can set the encoding when you first read the data into pandas, using the same encoding='utf-8' argument (for example, in pd.read_csv).
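
As a rough illustration of the CSV route, here is a minimal round-trip sketch. The one-row frame and the temporary file location are made up for the example (the 'Ó' is simply the '\xd3' character from the error message), so treat it as a sketch rather than a drop-in fix:

```python
import os
import tempfile

import pandas as pd

# Hypothetical one-row frame; '\xd3' (the character 'Ó') is the one named in the error.
df = pd.DataFrame(
    {"_id": ["00000067"], "Description": ["FHLB \xd3 0.61 12/28/16"]}
).set_index("_id")

# Temporary location chosen only for this example.
path = os.path.join(tempfile.mkdtemp(), "df_output.csv")

# Write the file as UTF-8.
df.to_csv(path, encoding="utf-8")

# Inspect the raw bytes: in UTF-8, 'Ó' is stored as the two bytes C3 93.
with open(path, "rb") as f:
    raw = f.read()

# Read the file back with the same encoding; dtype=str keeps the
# zero-padded _id as text instead of parsing it as a number.
df_back = pd.read_csv(path, encoding="utf-8", dtype=str).set_index("_id")
```

Passing the encoding explicitly on both the write and the read side avoids depending on whatever default the platform or the consuming tool happens to use.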

If you're connecting directly to KDB using SQLAlchemy or something similar, try specifying the encoding in the connection itself; see this question: Another UnicodeEncodeError when using pandas method to_sql with MySQL
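
For the connection route, the sketch below shows one way the charset can be set in a SQLAlchemy URL. It assumes a MySQL backend with the PyMySQL driver, as in the linked question, not KDB itself; the credentials, host, database, and table name are all placeholders:

```python
from sqlalchemy.engine import URL

# Placeholder connection details (assumptions for illustration only).
url = URL.create(
    "mysql+pymysql",
    username="user",
    password="secret",
    host="localhost",
    database="mydb",
    # Ask the driver to talk to the server in UTF-8.
    query={"charset": "utf8mb4"},
)

# With the driver installed, the engine and the append would look like:
# from sqlalchemy import create_engine
# engine = create_engine(url)
# df.to_sql("bonds", engine, if_exists="append")  # "bonds" is a hypothetical table
```

Whether an equivalent option exists for a KDB connection depends on the client library in use, so check its documentation for the matching setting.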

Ricky McMaster answered Mar 20 '23 00:03