I have a df that consists of 100 rows and 24 columns, all of type string. I get the following error when I try to append the DataFrame to KDB:
UnicodeEncodeError: 'ascii' codec can't encode character '\xd3' in position 9: ordinal not in range(128)
Here is an example of the first row in my df.
(row _id = 00000067, shown one column per line; the ... marks columns elided from the original output)

AnnouncementDate              2012-12-11T00:00:00.000+00:00
AuctionDate                   NaN
BBT                           FHLB
CouponDividendRate            0.61
DaysToSettle                  1
Description                   FHLB 0.61 12/28/16
FirstSettlementDate           2012-12-28T00:00:00.000+00:00
ISN                           US313381K796
IsAgency                      True
IsWhenIssued                  False
...
OnTheRunTreasury              NaN
OperationalIndicator          False
OriginalAmountOfPrincipal     13000000.0
OriginalMaturityDate          NaN
PrincipalAmountOutstanding    0.0
SCSP                          313381K79
SMCP                          76000000
SecurityTypeLevel1            US-DOMESTIC
SecurityTypeLevel2            NaN
TCK                           NaN
My question is: is there an easy way to convert my df to UTF-8? Possibly something like df = df.encode('utf-8')?
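(For what it's worth, a DataFrame itself has no .encode() method, but individual string columns can be encoded through the .str accessor. A minimal sketch, using a made-up column containing a '\xd3' character like the one in the error:)

```python
import pandas as pd

# Hypothetical frame; the second value contains '\xd3' (Ó), the kind of
# character the ascii codec chokes on.
df = pd.DataFrame({"Description": ["FHLB 0.61 12/28/16", "Banco Soci\xd3n"]})

# Encode one string column to UTF-8 bytes via the .str accessor:
encoded = df["Description"].str.encode("utf-8")
print(encoded.iloc[1])  # b'Banco Soci\xc3\x93n'
```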
Thanks
It depends on how you're outputting the data. If you're simply writing CSV files, which you then import into KDB, then you can specify the encoding easily:
df.to_csv('df_output.csv', encoding='utf-8')
Or, you can set the encoding when you import the data to Pandas originally, using the same syntax.
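Here's a minimal round-trip sketch of both directions, using a made-up one-column frame and a temporary file (the filename and data are just placeholders):

```python
import os
import tempfile

import pandas as pd

# Hypothetical frame with a non-ASCII character ('\xd3' is Ó)
df = pd.DataFrame({"Description": ["FHLB 0.61 12/28/16", "Soci\xd3n"]})

path = os.path.join(tempfile.mkdtemp(), "df_output.csv")

# Write the CSV as UTF-8...
df.to_csv(path, encoding="utf-8", index=False)

# ...and the same encoding= keyword applies when reading it back in:
df2 = pd.read_csv(path, encoding="utf-8")
```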
If you're connecting directly to KDB using SQLAlchemy or something similar, you should try specifying the encoding in the connection itself - see this question: Another UnicodeEncodeError when using pandas method to_sql with MySQL
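In the MySQL case from that linked question, the fix is a charset query parameter in the connection URL. A sketch with placeholder credentials (no connection is actually made here, the URL is only parsed):

```python
from sqlalchemy.engine import make_url

# Hypothetical MySQL URL -- the charset query parameter is what sets the
# encoding at the connection level:
url = make_url("mysql+pymysql://user:pass@localhost/db?charset=utf8")

# engine = create_engine(url)   # then: df.to_sql("my_table", con=engine)
print(url.query["charset"])
```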