What am I missing here? I am trying to do a group by.
import numpy as np
import pandas as pd

asp = np.array([0, 0, 1])
asq = np.array([10, 10, 20])
columns = ['asp']
df = pd.DataFrame(asp, index=None, columns=columns)
df['asq'] = asq
print(df)
df.groupby(by=['asp']).sum()
print(df)
   asp  asq
0    0   10
1    0   10
2    1   20
   asp  asq
0    0   10
1    0   10
2    1   20
The result should be:
   asp  asq
0    0   20
1    1   20
The Hello, World! of pandas GroupBy: you call .groupby() and pass the name of the column that you want to group on, which is "state". Then, you use ["last_name"] to specify the columns on which you want to perform the actual aggregation. You can pass a lot more than just a single column name to ...
From the docs: "NA groups in GroupBy are automatically excluded".
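A minimal sketch of that behavior, using a hypothetical frame (`df_na`, `key`, and `value` are illustrative names, not from the question) in which one row has a missing group key:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the middle row has a missing (NaN) group key.
df_na = pd.DataFrame({"key": ["a", np.nan, "a"], "value": [1, 2, 3]})

# With the default settings, the NaN-keyed row does not form a group,
# so only the "a" rows are summed.
totals = df_na.groupby("key")["value"].sum()
print(totals)
```

In pandas 1.1 and later, passing `dropna=False` to `groupby` should keep the NaN key as its own group instead.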
You can group DataFrame rows into a list by calling pandas.DataFrame.groupby() on the column of interest, selecting the column you want as a list from each group, and then using Series.apply(list) to get the list for every group.
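As a rough illustration of the list-per-group idea, here is a small sketch that reuses the `asp` and `asq` columns from the question; the names `df_items` and `lists` are only for illustration:

```python
import pandas as pd

# Same values as in the question, built directly from a dict.
df_items = pd.DataFrame({"asp": [0, 0, 1], "asq": [10, 10, 20]})

# Group on "asp", select "asq", and collect each group's values into a list.
lists = df_items.groupby("asp")["asq"].apply(list)
print(lists)
```

The result is a Series indexed by `asp`, with one Python list per group.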
Pandas groupby is used for grouping data by category and applying a function to each category. It also helps to aggregate data efficiently. The DataFrame.groupby() function splits the data into groups based on some criteria.
df.groupby doesn't change df; it returns a new object. In this case you perform an aggregation operation, so you get a new DataFrame. You have to give a name to the result if you want to use it later:
>>> df_summed = df.groupby('asp').sum()
>>> df_summed
     asq
asp
0     20
1     20

[2 rows x 1 columns]
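If you want the layout shown in the question, with asp as a regular column rather than the index, one option is to pass as_index=False. This is a sketch that rebuilds the question's frame from a dict, so it may differ slightly from your setup:

```python
import pandas as pd

# Rebuild the frame from the question.
df = pd.DataFrame({"asp": [0, 0, 1], "asq": [10, 10, 20]})

# as_index=False keeps the group key as an ordinary column
# instead of moving it into the index.
df_summed = df.groupby("asp", as_index=False).sum()
print(df_summed)

# The original frame is untouched: it still has three rows.
print(df)
```

Calling .reset_index() on the result of df.groupby('asp').sum() should give the same layout.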