
pandas rounding when converting float to integer

I've got a pandas DataFrame with a float (one decimal) index which I use to look up values (similar to a dictionary). As floats are not exactly the value they are supposed to be, I multiplied everything by 10 and converted it to integers with .astype(int) before setting it as the index. However, this seems to do a floor instead of rounding. Thus 1.999999999999999992 is converted to 1 instead of 2. Rounding with the pandas.DataFrame.round() method beforehand does not avoid this problem, as the values are still stored as floats.

The original idea (which obviously raises a KeyError) was this:

import numpy as np
import pandas as pd

idx = np.arange(1,3,0.001)
s = pd.Series(range(2000))
s.index=idx
print(s[2.022])  # lookup by float label; this is the line that fails

Trying again after converting to integers:

idx_int = idx*1000
idx_int = idx_int.astype(int)
s.index = idx_int
for i in range(1000,3000):
    print(s[i])

The output is always a bit random, as the 'real' value behind an integer can be slightly above or below the wanted value. In this case the index contains the value 1000 twice and does not contain the value 2999.

NicoH asked Mar 07 '18 13:03



4 Answers

You are right, astype(int) does a conversion toward zero:

‘integer’ or ‘signed’: smallest signed int dtype

from pandas.to_numeric documentation (which is linked from astype() for numeric conversions).

If you want to round, you need to do a float round, and then convert to int:

df.round(0).astype(int)

Use other rounding functions according to your needs.
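Applied to the index from the question, the difference between truncating and rounding first can be sketched roughly as follows. This is only an illustration: `np.rint` is used here as one possible rounding function, and the names `truncated` and `idx_int` are chosen for the example.

```python
import numpy as np
import pandas as pd

idx = np.arange(1, 3, 0.001)  # floats that are often slightly off the intended value

# astype(int) alone truncates toward zero: 1.9999999 -> 1 and -1.9999999 -> -1
truncated = np.array([1.9999999, -1.9999999]).astype(int)

# Round to the nearest whole number first, then cast to int
idx_int = np.rint(idx * 1000).astype(int)

s = pd.Series(range(len(idx)), index=idx_int)
print(truncated)          # [ 1 -1]
print(s.index.is_unique)  # True
print(s[2022])            # 1022
```

With the rounded index, every integer key appears exactly once, so lookups such as `s[2022]` behave like dictionary access.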

Giacomo Catenazzi answered Sep 24 '22 14:09


If I understand correctly, you could just perform the rounding operation and then convert the result to an integer:

s1 = pd.Series([1.2,2.9])
s1 = s1.round().astype(int)

Which gives the output (the integer dtype may show as int64 instead, depending on the platform):

0    1
1    3
dtype: int32
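
One detail worth knowing when picking a rounding function: `Series.round()` follows NumPy and rounds exact halves to the nearest even number. A small sketch, where `half_even` and `half_up` are names chosen for this example and the floor-based formula is just one common way to round halves upward:

```python
import numpy as np
import pandas as pd

s2 = pd.Series([0.5, 1.5, 2.5, 2.9])

half_even = s2.round().astype(int)        # exact halves go to the nearest even integer
half_up = np.floor(s2 + 0.5).astype(int)  # exact halves always go up

print(half_even.tolist())  # [0, 2, 2, 3]
print(half_up.tolist())    # [1, 2, 3, 3]
```

For the index problem in the question this makes no difference, since the values are never exact halves, but it can matter for other data.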

Matt answered Sep 20 '22 14:09


If the DataFrame contains both numeric and non-numeric values and you only want to touch the numeric fields:

df = df.applymap(lambda x: int(round(x, 0)) if isinstance(x, (int, float)) else x)
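
A column-wise variant of the same idea, sketched under the assumption that the values to convert live in float columns without missing values. The frame `df` and its column names are invented for the example; `select_dtypes` picks the float columns so that strings and existing integers are left alone:

```python
import pandas as pd

# Hypothetical mixed frame: one text column, one float column, one int column
df = pd.DataFrame({"name": ["a", "b"], "x": [1.2, 2.9], "n": [1, 2]})

# Round and cast only the float columns
float_cols = df.select_dtypes(include="float").columns
df[float_cols] = df[float_cols].round().astype(int)

print(df["x"].tolist())     # [1, 3]
print(df["name"].tolist())  # ['a', 'b']
```

This avoids calling a Python function on every cell, which may be noticeably faster on large frames.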

momo answered Sep 22 '22 14:09


The DataFrame may contain NA values stored as floats, and these cannot be converted to int directly. An alternative solution is to fill them first: df.fillna(0).astype('int')
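
Keep in mind that `fillna(0)` turns missing values into real zeros and that `astype('int')` on its own still truncates. If the missing values should survive the conversion, one option is pandas' nullable integer dtype `'Int64'` (capital I). A short sketch with made-up data:

```python
import numpy as np
import pandas as pd

s3 = pd.Series([1.9999999, np.nan, 2.6])

filled = s3.fillna(0).round().astype(int)  # NaN is replaced by 0
nullable = s3.round().astype("Int64")      # NaN is kept as <NA>

print(filled.tolist())           # [2, 0, 3]
print(nullable.isna().tolist())  # [False, True, False]
```

Rounding before the cast matters in both variants: the first would otherwise truncate, and the cast to `'Int64'` expects values that are already whole numbers.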

Yuchao Jiang answered Sep 24 '22 14:09