
Pandas group by will not work

Tags: python, pandas

What am I missing here? I am trying to do a group by.

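import numpy as np
import pandas as pd

# Build a two-column DataFrame
asp = np.array([0, 0, 1])
asq = np.array([10, 10, 20])
columns = ['asp']
df = pd.DataFrame(asp, index=None, columns=columns)
df['asq'] = asq
print(df)
# Attempt the group by, then print the frame again
df.groupby(by=['asp']).sum()
print(df)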
   asp  asq
0    0   10
1    0   10
2    1   20
   asp  asq
0    0   10
1    0   10
2    1   20

The result should be:

    asp  asq
0    0   20
1    1   20
asked Dec 31 '13 03:12 by Tampa




1 Answer

df.groupby doesn't change df; it returns a new object. In this case you perform an aggregation operation, so you get a new DataFrame. You have to give a name to the result if you want to use it later:

>>> df_summed = df.groupby('asp').sum()
>>> df_summed
     asq
asp     
0     20
1     20

[2 rows x 1 columns]
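
If the goal is the exact layout from the question, with asp as an ordinary column and a fresh 0..n-1 index, one possible sketch is to pass as_index=False to groupby. This assumes a reasonably recent pandas; the name df_flat is only illustrative and not part of the original code.

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame({'asp': [0, 0, 1], 'asq': [10, 10, 20]})

# groupby returns a new object, so the result has to be assigned to a name.
# as_index=False keeps 'asp' as a regular column instead of the index.
df_flat = df.groupby('asp', as_index=False).sum()

print(df_flat)   # aggregated frame: one row per value of 'asp'
print(df)        # the original frame, still three rows
```

Calling .reset_index() on the summed result should give the same layout; which form to use is mostly a matter of taste.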
answered Oct 21 '22 05:10 by DSM