I need to store Python Decimal values in a pandas TimeSeries/DataFrame object. Pandas gives me an error when using "groupby" and "mean" on the TimeSeries/DataFrame. The following code, based on floats, works well:
[0]: by = lambda x: lambda y: getattr(y, x)
[1]: rng = date_range('1/1/2000', periods=40, freq='4h')
[2]: rnd = np.random.randn(len(rng))
[3]: ts = TimeSeries(rnd, index=rng)
[4]: ts.groupby([by('year'), by('month'), by('day')]).mean()
2000 1 1 0.512422
2 0.447235
3 0.290151
4 -0.227240
5 0.078815
6 0.396150
7 -0.507316
But I get an error if I do the same using Decimal values instead of floats:
[5]: rnd = [Decimal(x) for x in rnd]
[6]: ts = TimeSeries(rnd, index=rng, dtype=Decimal)
[7]: ts.groupby([by('year'), by('month'), by('day')]).mean() #Crash!
Traceback (most recent call last):
File "C:\Users\TM\Documents\Python\tm.py", line 100, in <module>
print ts.groupby([by('year'), by('month'), by('day')]).mean()
File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 293, in mean
return self._cython_agg_general('mean')
File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 365, in _cython_agg_general
raise GroupByError('No numeric types to aggregate')
pandas.core.groupby.GroupByError: No numeric types to aggregate
The error message is "GroupByError('No numeric types to aggregate')". Is there any way to use the standard aggregations like sum, mean, and quantile on a TimeSeries or DataFrame containing Decimal values?
Why doesn't it work, and is there an equally fast alternative if it is not possible?
EDIT: I just realized that most of the other functions (min, max, median, etc.) work fine, but not the mean function that I desperately need :-(.
To find the mean of a DataFrame, use the pandas DataFrame.mean() function. DataFrame.mean() returns the mean of the values along the requested axis.
Pandas Series: round() function
The round() function rounds each value in a Series to the given number of decimals. The decimals argument is the number of decimal places to round to (default: 0). If decimals is negative, it specifies the number of positions to the left of the decimal point.
One workaround is to apply np.mean to each group instead of calling the built-in mean aggregation:
import numpy as np
ts.groupby([by('year'), by('month'), by('day')]).apply(np.mean)
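The same idea can be sketched against a more recent pandas API, where TimeSeries no longer exists and a plain Series with object dtype holds the Decimal values. This is only an illustration under some assumptions: the helper name decimal_mean is made up for this example, the sample data is a simple range rather than random numbers, and the grouping keys are taken from the DatetimeIndex attributes instead of the by() helper from the question. Because the aggregation runs as a Python function per group, it will likely be slower than the float-based mean.

```python
from decimal import Decimal

import pandas as pd


def decimal_mean(values):
    """Average a group of Decimal values without converting to float.

    Hypothetical helper for this example; not part of pandas.
    """
    values = list(values)
    return sum(values, Decimal(0)) / len(values)


# Twelve samples at 4-hour spacing: six fall on Jan 1, six on Jan 2.
rng = pd.date_range("2000-01-01", periods=12, freq="4h")
ts = pd.Series([Decimal(i) for i in range(12)], index=rng, dtype=object)

# Group by (year, month, day) taken from the index, then aggregate
# each group with the pure-Python Decimal mean.
keys = [ts.index.year, ts.index.month, ts.index.day]
daily_mean = ts.groupby(keys).agg(decimal_mean)

print(daily_mean)
```

Passing the function through agg keeps the values as Decimal objects in each group's result, so precision is not lost to a float conversion along the way.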