I am writing a program for which it is important to compare and rank values in a date series. However, I am running into problems with the imprecision of floats.
I am pulling data from my SQL server, and two of the values are both supposed to be 1.6. However, they turn out to be slightly different (see below). As a result, dataframe.rank() doesn't give these two dates the same rank, but instead ranks 01/02/2004 above 02/01/2005.
Does anyone have an idea how to deal with this so that these two end up with the same rank?
modelInputData.loc['01/02/2004',('Level','inflationCore','EUR')]
Out[126]: 1.6000000000000003
modelInputData.loc['02/01/2005',('Level','inflationCore','EUR')]
Out[127]: 1.6000000000000001
You can use pd.Series.round() on the columns with floats before ranking, so that values differing only by floating-point noise become equal.
precision = 2
df['col'] = df['col'].round(decimals = precision)
See: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.round.html
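As a minimal sketch of the effect on ranking, assuming a small stand-in Series (the third value and its date are made up for illustration, and the real data in the question is a multi-indexed DataFrame):

```python
import pandas as pd

# Hypothetical stand-in for the question's data: the first two values
# should both be 1.6 but differ in the last bit.
s = pd.Series(
    [1.6000000000000003, 1.6000000000000001, 1.7],
    index=["01/02/2004", "02/01/2005", "03/01/2006"],
)

# Ranking the raw floats treats the two near-equal values as different.
raw_ranks = s.rank()

# Rounding first makes them exactly equal, so rank() assigns a tie.
rounded_ranks = s.round(decimals=2).rank()

print(raw_ranks)
print(rounded_ranks)
```

With the default method='average', tied values share the mean of the ranks they span. Pick a precision that matches how many decimals the data is really meant to carry.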
I would recommend doing it the way bankers do: use cents and integers instead of EUR/USD and float/decimal variables.
Either convert it to cents on the MySQL side or do it in pandas (this assumes the column has no missing values, since NaN cannot be cast to int):
df['amount'] = (df['amount'] * 100).round().astype(int)
You'll have far fewer problems that way.
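To illustrate how the integer approach carries through to ranking, here is a rough sketch. The column names amount and amount_cents and the sample values are assumptions for the example, not taken from the question's data:

```python
import pandas as pd

# Hypothetical sample: the first two amounts are both meant to be 1.60.
df = pd.DataFrame({"amount": [1.6000000000000003, 1.6000000000000001, 2.35]})

# Scale to cents, round to the nearest whole cent, then store as integers.
# Note: astype("int64") raises if the column contains NaN.
df["amount_cents"] = (df["amount"] * 100).round().astype("int64")

# Integer comparison is exact, so the two equal amounts tie in the ranking.
df["rank"] = df["amount_cents"].rank()

print(df)
```

Dividing amount_cents by 100 only when displaying values keeps all comparisons on the exact integer column.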