First and foremost, I apologize that the examples below are clearly not minimal. I am fully aware this does not meet SO's minimal reproducible example requirement; however, after hours of experimenting to recreate the issue, it really seems to arise only when the calculation is performed on at least a couple of hundred values.
I have a dataframe with millions of values where I want to calculate kurtosis in each column on a rolling basis. Initially I used pd.rolling.kurt:
df.rolling(20, min_periods=3).kurt(bias=False)
but noticed two serious issues with that approach:
I created three series, s1,s2, and s3 with 300, 600, and 900 values respectively. (Assignments with the exact values are added at the end of this post so as not to cause much trouble following my post.) These three series are slices from one column of the dataframe. The slices are created in such a way that the last position is fixed, i.e. s1 has values from N-299 to N, s2 from N-599 to N and s3 from N-899 to N. Running pd.rolling.kurt on these three series and printing the tail of the dataframe (where the issue I want to talk about appears) gives the following:
>>> s1.rolling(20,min_periods=3).kurt().tail(10)
290 9.591067
291 9.591067
292 9.591067
293 9.591067
294 19.663666
295 14.872262
296 14.147157
297 16.716964
298 7.032522
299 19.983796
>>> s2.rolling(20,min_periods=3).kurt().tail(10)
590 9.591067
591 9.591067
592 9.591067
593 9.591067
594 19.663666
595 14.872262
596 14.147157
597 16.716964
598 7.032522
599 19.983796
>>> s3.rolling(20,min_periods=3).kurt().tail(10)
890 9.591071
891 9.591071
892 9.591071
893 9.591071
894 19.663685
895 15.248361
896 40.444894
897 1368.233241
898 251407.375343
899 902540.031652
I performed the same computation in Excel and for the last ten indices, the kurtosis values should be the following (I used the notation 290 / 590 / 890 to save some space: the three output series have the same values for index values 290-299, 590-599, and 890-899):
290 / 590 / 890 9.591067361
291 / 591 / 891 9.591067361
292 / 592 / 892 9.591067361
293 / 593 / 893 9.591067361
294 / 594 / 894 19.66366573
295 / 595 / 895 14.87226197
296 / 596 / 896 14.14715754
297 / 597 / 897 16.7169886
298 / 598 / 898 7.037037037
299 / 599 / 899 20
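As a cross-check on the reference values, here is a minimal pure-Python sketch of the bias-corrected sample excess kurtosis, which is the quantity I understand Excel's KURT and scipy.stats.kurtosis(..., bias=False) to compute. The function name sample_excess_kurtosis is my own and is not part of any library. A 20-value window with a single nonzero entry gives 20, and one with two equal nonzero entries gives 190/27 (about 7.037037), which appears consistent with the last two rows above.

```python
import math


def sample_excess_kurtosis(xs):
    """Bias-corrected sample excess kurtosis (G2), computed with centred moments."""
    n = len(xs)
    if n < 4:
        raise ValueError("need at least 4 observations")
    mean = math.fsum(xs) / n
    # Centre first, then raise to powers: avoids cancellation between large raw sums.
    m2 = math.fsum((x - mean) ** 2 for x in xs) / n
    m4 = math.fsum((x - mean) ** 4 for x in xs) / n
    if m2 == 0.0:
        return math.nan  # constant window: kurtosis is undefined
    g2 = m4 / (m2 * m2) - 3.0  # biased (population) excess kurtosis
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)


a = 1e-6
print(sample_excess_kurtosis([a] + [0.0] * 19))     # approximately 20
print(sample_excess_kurtosis([a, a] + [0.0] * 18))  # approximately 7.037037
```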
Looking at the outputs of pd.rolling.kurt, the first two are identical, although they do not exactly match the reference values I computed in Excel (compare indices 297-299). The larger problem, however, is the third output, where the values explode as if the total number of values in the series somehow influenced the kurtosis, even though in all three cases I used a rolling window of 20 with min_periods=3. If my understanding is correct, nothing besides the current row and the previous 19 rows should affect the kurtosis output. I'm puzzled how these "exploding" values can appear.
I then recomputed the kurtosis values for the same series using scipy.stats.kurtosis. This gave me the following output:
>>> s1.rolling(20,min_periods=3).apply(lambda x: kurtosis(x, bias=False)).tail(10)
290 9.591067
291 9.591067
292 9.591067
293 9.591067
294 19.663666
295 14.872262
296 14.147158
297 16.716989
298 7.037037
299 20.000000
>>> s2.rolling(20,min_periods=3).apply(lambda x: kurtosis(x, bias=False)).tail(10)
590 9.591067
591 9.591067
592 9.591067
593 9.591067
594 19.663666
595 14.872262
596 14.147158
597 16.716989
598 7.037037
599 20.000000
>>> s3.rolling(20,min_periods=3).apply(lambda x: kurtosis(x, bias=False)).tail(10)
890 9.591067
891 9.591067
892 9.591067
893 9.591067
894 19.663666
895 14.872262
896 14.147158
897 16.716989
898 7.037037
899 20.000000
This computes the kurtosis perfectly. However, the .apply(lambda x: kurtosis(x,...) construct is shockingly inefficient compared to the vectorized pandas approach, pushing the total processing time for the entire dataframe from a couple of minutes all the way to more than an hour! I am fully aware that in many cases an inbuilt vectorized solution tends to prefer speed over numerical accuracy which would explain the first issue I listed above; however, as for the second issue (i.e. "exploding" values) I simply don't see a justification.
Is there any way to compute the kurtosis efficiently without values diverging and invalidating my whole output?
Series definitions
Here come the exact values I used to compute the aforementioned outputs:
s1 = pd.Series([0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0001499887511247459,-7.499156348433101e-05,-3.699790962233055e-05,-1.899945851585629e-05,-8.999869502079515e-06,-4.999962500264377e-06,-1.999992000039351e-06,-9.999974999814318e-07,-9.999984999603102e-07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.699983850190338e-05,-8.999878501628346e-06,-3.999972000122605e-06,-1.999992000039351e-06,-9.999974999814318e-07,0.0003669319382432873,-0.0001849488621671012,-9.198730581664589e-05,-4.499687272496313e-05,0.0009075453820856781,0.0004854184782060238,-0.000720221831477389,-0.000359805708801156,-0.0001799514136040646,-8.998785170075082e-05,-5.999640023402946e-05,-1.9999600008734e-05,-6.999954500263924e-06,-1.999995999958864e-06,-9.999994999391884e-07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.001201278176363365,-0.0008013581550363867,-0.0002669288650428971,-8.89921242557729e-05,-2.899914452727788e-05,-9.99990000099588e-06,-2.999989500049026e-06,-9.999984999603102e-07,-9.999994999391884e-07,0.0,0.0,0.0,0.0,0.0,0.0,0.0005218638053935734,-0.0004638654873286288,-3.799851806232993e-05,-1.299982450270071e-05,-4.999977500118572e-06,-9.999984999603102e-07,-9.999994999391884e-07,0.0,0.0,0.0,0.0,0.0
,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0])
s2 = pd.Series([0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.00
01499887511247459,-7.499156348433101e-05,-3.699790962233055e-05,-1.899945851585629e-05,-8.999869502079515e-06,-4.999962500264377e-06,-1.999992000039351e-06,-9.999974999814318e-07,-9.999984999603102e-07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.699983850190338e-05,-8.999878501628346e-06,-3.999972000122605e-06,-1.999992000039351e-06,-9.999974999814318e-07,0.0003669319382432873,-0.0001849488621671012,-9.198730581664589e-05,-4.499687272496313e-05,0.0009075453820856781,0.0004854184782060238,-0.000720221831477389,-0.000359805708801156,-0.0001799514136040646,-8.998785170075082e-05,-5.999640023402946e-05,-1.9999600008734e-05,-6.999954500263924e-06,-1.999995999958864e-06,-9.999994999391884e-07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.001201278176363365,-0.0008013581550363867,-0.0002669288650428971,-8.89921242557729e-05,-2.899914452727788e-05,-9.99990000099588e-06,-2.999989500049026e-06,-9.999984999603102e-07,-9.999994999391884e-07,0.0,0.0,0.0,0.0,0.0,0.0,0.0005218638053935734,-0.0004638654873286288,-3.799851806232993e-05,-1.299982450270071e-05,-4.999977500118572e-06,-9.999984999603102e-07,-9.999994999391884e-07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0])
s3 = pd.Series([0.0006613932897393013,0.0002659978876289742,0.000658737582405648,0.0005623339888467145,0.0008417590777197284,0.000542090011101782,0.0007813756301534222,0.0003713395103963933,0.0001847566192768637,0.0005892778635844672,-0.0001955367110279687,0.0004436264576506058,0.000302660947173135,0.0007556577955957223,0.0004099113835531532,0.0002143017625986564,1.052211101549051e-05,6.481751166152551e-05,6.615670911548045e-05,-2.169766854576383e-05,-1.302819997635433e-05,-7.303052044212008e-06,-0.1163297855507419,-0.06335289603465369,-0.03314811069814094,-0.01697505737063765,-0.008591697883893402,-0.004342398361182662,-0.002157940126839023,-0.001100682037128825,-0.0005507856703497119,-0.0002554269710891206,-0.0001277329565522002,-8.395111298446951e-05,-2.189884089509773e-05,-1.094960028496637e-05,-5.479844975342307e-06,-2.739933748392279e-06,-1.369969689294177e-06,-6.799856523827107e-07,-3.399929995978179e-07,-1.79996340600251e-07,-7.999838400850306e-08,-3.999919442393075e-08,-2.999939675042158e-08,-2.007979819879551e-05,-1.004005030070562e-05,-5.52007060169889e-06,-2.760046727695654e-06,9.150125677134498e-06,4.580031464668292e-06,2.2900078662783e-06,1.150001972312828e-06,5.700004873407606e-07,2.80000120302654e-07,1.50000032247295e-07,7.000000733862829e-08,3.000000181016647e-08,2.000000056662899e-08,1.00000003333145e-08,1.000000011126989e-08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0001499887511247459,-7.499156348433101e-05,-3.699790962233055e-05,-1.899945851585629e-05,-8.999869502079515e-06,-4.999962500264377e-06,-1.999992000039351e-06,-9.999974999814318e-07,-9.999984999603102e-07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.699983850190338e-05,-8.999878501628346e-06,-3.999972000122605e-06,-1.999992000039351e-06,-9.999974999814318e-07,0.0003669319382432873,-0.0001849488621671012,-9.198730581664589e-05,-4.499687272496313e-05,0.0009075453820856781,0.0004854184782060238,-0.000720221831477389,-0.000359805708801156,-0.0001799514136040646,-8.998785170075082e-05,-5.999640023402946e-05,-1.9999600008734e-05,-6.999954500263924e-06,-1.999995999958864e-06,-9.999994999391884e-07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.001201278176363365,-0.0008013581550363867,-0.0002669288650428971,-8.89921242557729e-05,-2.899914452727788e-05,-9.99990000099588e-06,-2.999989500049026e-06,-9.999984999603102e-07,-9.999994999391884e-07,0.0,0.0,0.0,0.0,0.0,0.0,0.0005218638053935734,-0.0004638654873286288,-3.799851806232993e-05,-1.299982450270071e-05,-4.999977500118572e-06,-9.999984999603102e-07,-9.999994999391884e-07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0])
It looks like a bug in older Pandas versions. I could reproduce it on an old installation (Python 3.6.2 64-bit on win32, Pandas 1.0.3, numpy 1.15.4):
>>> s3.rolling(20,min_periods=3).kurt().tail(10)
890 9.591071
891 9.591071
892 9.591071
893 9.591071
894 19.663685
895 15.248361
896 40.444894
897 1368.233241
898 251407.375343
899 902540.031652
dtype: float64
It seems to be fixed on my newer version, Python 3.8.4 64 bit, Pandas 1.2.2, numpy 1.20.1:
>>> s3.rolling(20,min_periods=3).kurt().tail(10)
890 9.591067
891 9.591067
892 9.591067
893 9.591067
894 19.663666
895 14.872262
896 14.147158
897 16.716989
898 7.037037
899 20.000000
dtype: float64
Both installations are on the same Windows 10 machine.
I cannot say which component (Pandas or numpy) is the cause. As your tests using scipy.stats.kurtosis give the correct result, I would suspect Pandas, but without further analysis by Pandas experts (and I am not one) I cannot be affirmative.
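One plausible mechanism, offered as an assumption rather than a confirmed diagnosis: fast rolling moments are typically computed from running sums that add the entering value and subtract the leaving one at each step. s3 contains values around -0.116 early on, while the final windows hold values around 1e-6, so rounding residue left by the large values could swamp the tiny fourth-power sums. The toy sketch below (pure Python, hypothetical function names, not the Pandas code) shows how such a running sum can lose small terms entirely:

```python
def rolling_sum_incremental(values, window):
    """Sliding sum that adds the entering value and subtracts the leaving one."""
    out, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        out.append(running)
    return out


def rolling_sum_recomputed(values, window):
    """Sliding sum recomputed from scratch for every window."""
    return [sum(values[max(0, i - window + 1): i + 1]) for i in range(len(values))]


# Toy data: one large value followed by tiny ones (stand-ins for x**4 terms).
data = [1.0, 1e-20, 1e-20]
incremental = rolling_sum_incremental(data, window=2)
recomputed = rolling_sum_recomputed(data, window=2)

print(incremental[-1])  # 0.0: the tiny terms were absorbed by 1.0 and then lost
print(recomputed[-1])   # 2e-20: the true sum of the last window
```

This would also explain why s1 and s2 behave better than s3: their history contains no comparably large values before the final windows.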
IMHO, the most reasonable solution is either to upgrade your system or to add a fresh, independent Python installation with the latest possible Pandas version.
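If upgrading is not an option, one possible workaround is to recompute the kurtosis independently for every window in a vectorized way, so that no state carries over between windows. The following is only a sketch under stated assumptions: it needs numpy 1.20 or later for sliding_window_view, the function name rolling_kurt_full_windows is hypothetical, and it only handles full windows (it does not reproduce min_periods). I have not benchmarked it against your dataframe.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # numpy >= 1.20


def rolling_kurt_full_windows(values, window):
    """Bias-corrected excess kurtosis over each full trailing window.

    Positions with fewer than `window` observations stay NaN
    (a simplification: min_periods is not reproduced).
    """
    x = np.asarray(values, dtype=float)
    n = window
    out = np.full(x.shape[0], np.nan)
    if n < 4 or x.shape[0] < n:
        return out
    w = sliding_window_view(x, n)          # shape (len(x) - n + 1, n), a view
    d = w - w.mean(axis=1, keepdims=True)  # centre each window on its own mean
    m2 = (d ** 2).mean(axis=1)
    m4 = (d ** 4).mean(axis=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        g2 = m4 / m2 ** 2 - 3.0            # NaN for constant windows
    out[n - 1:] = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6.0)
    return out


# A large early value must not contaminate later windows.
x = np.zeros(60)
x[0] = -0.1163
x[45] = 1e-6
print(rolling_kurt_full_windows(x, 20)[-1])  # approximately 20
```

With a Series this could be used as pd.Series(rolling_kurt_full_windows(s3.to_numpy(), 20), index=s3.index). Note that the centred array has one row per window, so memory use grows with the series length times the window size; very long columns may need to be processed in chunks.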