I am unable to do a groupby on a pandas Series object. DataFrames are fine, but I cannot seem to do groupby with a Series. Has anyone been able to get this to work?
>>> import pandas as pd
>>> a = pd.Series([1,2,3,4], index=[4,3,2,1])
>>> a
4    1
3    2
2    3
1    4
dtype: int64
>>> a.groupby()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/generic.py", line 153, in groupby
    sort=sort, group_keys=group_keys)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 537, in groupby
    return klass(obj, by, **kwds)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 195, in __init__
    level=level, sort=sort)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 1326, in _get_grouper
    ping = Grouping(group_axis, gpr, name=name, level=level, sort=sort)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 1203, in __init__
    self.grouper = self.index.map(self.grouper)
  File "/share/apps/install/anaconda/lib/python2.7/site-packages/pandas/core/index.py", line 878, in map
    return self._arrmap(self.values, mapper)
  File "generated.pyx", line 2200, in pandas.algos.arrmap_int64 (pandas/algos.c:61221)
TypeError: 'NoneType' object is not callable
Series.groupby groups a Series using a mapper or by a Series of columns. A groupby operation combines three steps: splitting the object into groups, applying a function to each group, and combining the results. This makes it possible to partition large amounts of data and compute an operation on each group.
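The split, apply, and combine steps can be spelled out by hand to see what groupby does in one call. This is only a sketch: the Series `s` and its string index are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical data: two groups, "a" and "b", encoded in the index.
s = pd.Series([1, 2, 3, 4], index=["a", "a", "b", "b"])

# Split: iterating a GroupBy yields (group name, sub-Series) pairs.
pieces = {name: group for name, group in s.groupby(level=0)}

# Apply: reduce each piece to a single number.
applied = {name: group.sum() for name, group in pieces.items()}

# Combine: reassemble the per-group results into one Series.
manual = pd.Series(applied)

# groupby(...).sum() performs the same three steps in one call.
totals = s.groupby(level=0).sum()
```

Both `manual` and `totals` hold one sum per index label.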
Another simple aggregation is the size of each group, available on a GroupBy object as the size method. It returns a Series whose index holds the group names and whose values are the group sizes. A related aggregation is the number of unique values in each group, available as the nunique method.
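A short sketch of both aggregations on a Series, assuming made-up data where the group key is the index label:

```python
import pandas as pd

# Hypothetical data: three rows in group "x", three in group "y".
s = pd.Series([10, 10, 20, 30, 30, 30],
              index=["x", "x", "x", "y", "y", "y"])

grouped = s.groupby(level=0)

# Number of rows in each group.
sizes = grouped.size()

# Number of distinct values in each group.
uniques = grouped.nunique()
```

Here `sizes` counts rows per label, while `uniques` counts distinct values per label, so the two differ whenever a group contains repeats.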
To calculate mean values grouped on another column in pandas, call groupby on that column and then apply the mean() method, which computes the average of the values in each group.
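For example, with a small DataFrame (the column names `team` and `score` are hypothetical, chosen only for this sketch):

```python
import pandas as pd

# Hypothetical data: scores for two teams.
df = pd.DataFrame({
    "team": ["a", "a", "b", "b"],
    "score": [1.0, 3.0, 10.0, 20.0],
})

# Group rows by "team", select the "score" column, average each group.
means = df.groupby("team")["score"].mean()
```

The result is a Series indexed by team name with one mean per team.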
Sort the Series in ascending order: by default, the pandas Series sort_values() function sorts in ascending order. You can also pass ascending=True to make this explicit. If the Series contains NaN values, they are placed at the end.
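A minimal sketch of that sorting behavior, using a made-up Series with one missing value. The `na_position` parameter is shown as one way to move NaN to the front instead:

```python
import numpy as np
import pandas as pd

# Hypothetical data with a NaN at position 1.
s = pd.Series([3.0, np.nan, 1.0, 2.0])

# Ascending is the default; NaN is placed last.
asc = s.sort_values()

# Descending order; NaN is still placed last.
desc = s.sort_values(ascending=False)

# na_position="first" puts NaN at the start instead.
nan_first = s.sort_values(na_position="first")
```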
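You need to pass groupby a mapping of some kind: a dict, a function, or an index. Calling a.groupby() with no arguments leaves the grouper as None, and the traceback shows pandas then trying to call that None on each index value.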
In [6]: a
Out[6]:
4    1
3    2
2    3
1    4
dtype: int64
In [7]: a.groupby(a.index).sum()
Out[7]:
1    4
2    3
3    2
4    1
dtype: int64
In [3]: a.groupby(lambda x: x % 2 == 0).sum()
Out[3]:
False    6
True     4
dtype: int64
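The dict form of the mapping works the same way: each index label is looked up in the dict and the result becomes the group key. A sketch using the Series from the question (the `parity` dict and its "even"/"odd" labels are made up for illustration):

```python
import pandas as pd

a = pd.Series([1, 2, 3, 4], index=[4, 3, 2, 1])

# Hypothetical mapping from index label to group name.
parity = {4: "even", 3: "odd", 2: "even", 1: "odd"}

# Each index label is translated through the dict before grouping.
by_parity = a.groupby(parity).sum()
```

This produces the same two sums as the lambda version, keyed by "even" and "odd" rather than True and False.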
If you need to group by the Series' values:
grouped = a.groupby(a)
or
grouped = a.groupby(lambda x: a[x])
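Grouping by values is more interesting when values repeat. A small sketch, assuming a made-up Series `s` with duplicate values and a string index:

```python
import pandas as pd

# Hypothetical data: the value 1 appears twice, the value 2 three times.
s = pd.Series([1, 1, 2, 2, 2], index=list("abcde"))

# Passing the Series itself uses its values as the group keys.
counts = s.groupby(s).size()

# The function form receives each index label and looks up its value.
counts_via_function = s.groupby(lambda label: s[label]).size()
```

Both forms give one row per distinct value, holding the number of times it occurs.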