
Pandas Shift Converts Ints to Float AND Rounds

When shifting a column of integers, I know how to fix the column when Pandas automatically converts the integers to floats because of the presence of a NaN. I basically use the method described here.

However, when the shift introduces a NaN and thereby converts all the integers to floats, some rounding happens (e.g. on epoch timestamps), so even casting the column back to integer doesn't reproduce the original values.

Any way to fix this?

Example Data:

import pandas as pd

df = pd.DataFrame({'epoch': [1495571400259317500, 1495571400260585120, 1495571400260757200, 1495571400260866800]})
df
Out[19]: 
                 epoch
0  1495571400259317500
1  1495571400260585120
2  1495571400260757200
3  1495571400260866800

Example Code:

df['prior_epoch'] = df['epoch'].shift(1)  # introduces a NaN, so the column becomes float64
df.dropna(axis=0, how='any', inplace=True)
df['prior_epoch'] = df['prior_epoch'].astype(int)  # cast back; the values are already rounded

Resulting output:

Out[22]: 
                 epoch          prior_epoch
1  1495571400260585120  1495571400259317504
2  1495571400260757200  1495571400260585216
3  1495571400260866800  1495571400260757248
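
The rounding can be reproduced without pandas. A float64 has a 53-bit significand, so integers of this magnitude can only be represented in steps of 256. The snippet below is a minimal sketch using the first timestamp from the example data; the variable names are only illustrative, and `math.ulp` assumes Python 3.9 or later.

```python
import math

ts = 1495571400259317500  # first timestamp from the example data

# A float64 has a 53-bit significand, so above 2**53 not every integer
# can be represented; the int -> float conversion picks the nearest one.
as_float = float(ts)
round_trip = int(as_float)

print(round_trip)          # 1495571400259317504
print(round_trip - ts)     # 4
print(math.ulp(as_float))  # 256.0, the gap between adjacent floats here
```

Casting back to int is exact, so the damage is already done at the int-to-float step.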
guy asked Nov 07 '22 19:11


1 Answer

Because you know what happens when an int column is cast to float due to np.nan, and you know that you don't want the np.nan rows anyway, you can do the shift yourself with numpy:

df[1:].assign(prior_epoch=df.epoch.values[:-1])

                 epoch          prior_epoch
1  1495571400260585120  1495571400259317500
2  1495571400260757200  1495571400260585120
3  1495571400260866800  1495571400260757200
piRSquared answered Nov 15 '22 08:11