
How do I use the mean method on a pandas TimeSeries with Decimal type values?

I need to store Python Decimal values in a pandas TimeSeries/DataFrame object. pandas raises an error when I call "groupby" followed by "mean" on the TimeSeries/DataFrame. The following float-based code works well:

[0]: by = lambda x: lambda y: getattr(y, x)

[1]: rng = date_range('1/1/2000', periods=40, freq='4h')

[2]: rnd = np.random.randn(len(rng))

[3]: ts = TimeSeries(rnd, index=rng)

[4]: ts.groupby([by('year'), by('month'), by('day')]).mean()
2000  1  1    0.512422
         2    0.447235
         3    0.290151
         4   -0.227240
         5    0.078815
         6    0.396150
         7   -0.507316

But I get an error if I do the same using Decimal values instead of floats:

[5]: rnd = [Decimal(x) for x in rnd]       

[6]: ts = TimeSeries(rnd, index=rng, dtype=Decimal)

[7]: ts.groupby([by('year'), by('month'), by('day')]).mean()  #Crash!

Traceback (most recent call last):
File "C:\Users\TM\Documents\Python\tm.py", line 100, in <module>
print ts.groupby([by('year'), by('month'), by('day')]).mean()
File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 293, in mean
return self._cython_agg_general('mean')
File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 365, in _cython_agg_general
raise GroupByError('No numeric types to aggregate')
pandas.core.groupby.GroupByError: No numeric types to aggregate

The error message is "GroupByError('No numeric types to aggregate')". Is there any way to use the standard aggregations like sum, mean, and quantile on a TimeSeries or DataFrame containing Decimal values?

Why doesn't it work, and is there an equally fast alternative if it is not possible?

EDIT: I just realized that most of the other functions (min, max, median, etc.) work fine, but not the mean function that I desperately need :-(.

asked Jul 12 '12 19:07 by THM



1 Answer

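import numpy as np
# apply() calls np.mean on each group from Python instead of using the Cython 'mean' path that raised the error
ts.groupby([by('year'), by('month'), by('day')]).apply(np.mean)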
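For anyone trying this on a current pandas release, here is a minimal self-contained sketch of the same apply-based idea. It is not the exact code from the question. It assumes pd.Series in place of the old TimeSeries class, groups on the index's year/month/day attributes instead of the by() helper, and uses a hypothetical decimal_mean helper written in plain Python so that the result does not depend on how np.mean treats object arrays. The sample values are made up for illustration. Decimal values end up in an object-dtype Series, which is most likely why the Cython 'mean' aggregation reports "No numeric types to aggregate". Expect this route to be slower than the float version because every group is averaged in Python.

```python
from decimal import Decimal

import pandas as pd

# 40 timestamps spaced 4 hours apart, as in the question
rng = pd.date_range("2000-01-01", periods=40, freq=pd.Timedelta(hours=4))

# Illustrative Decimal data (0, 0.1, 0.2, ...); the question used random values
values = [Decimal(i) / Decimal(10) for i in range(len(rng))]

# Decimal objects are stored with dtype=object, not a numeric dtype
ts = pd.Series(values, index=rng, dtype=object)


def decimal_mean(group):
    """Hypothetical helper: mean of a group of Decimal values, kept as Decimal."""
    return sum(group, Decimal(0)) / len(group)


# Group by calendar day and average each group in Python via apply()
daily = ts.groupby([ts.index.year, ts.index.month, ts.index.day]).apply(decimal_mean)
print(daily)
```

The result is a Series indexed by (year, month, day) whose entries are still Decimal objects, so no precision is lost to float conversion along the way.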
answered Oct 08 '22 05:10 by ely