I'm starting with input data like this:
import pandas

df1 = pandas.DataFrame({"Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
                        "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"]})
which, when printed, appears as:
       City     Name
0   Seattle    Alice
1   Seattle      Bob
2  Portland  Mallory
3   Seattle  Mallory
4   Seattle      Bob
5  Portland  Mallory
Grouping is simple enough:
g1 = df1.groupby(["Name", "City"]).count()
and printing yields a GroupBy object:
                  City  Name
Name    City
Alice   Seattle      1     1
Bob     Seattle      2     2
Mallory Portland     2     2
        Seattle      1     1
But what I want eventually is another DataFrame object that contains all the rows in the GroupBy object. In other words I want to get the following result:
                  City  Name
Name    City
Alice   Seattle      1     1
Bob     Seattle      2     2
Mallory Portland     2     2
Mallory Seattle      1     1
I can't quite see how to accomplish this from the pandas documentation. Any hints would be welcome.
g1 here is a DataFrame. It has a hierarchical index, though:
In [19]: type(g1)
Out[19]: pandas.core.frame.DataFrame

In [20]: g1.index
Out[20]:
MultiIndex([('Alice', 'Seattle'), ('Bob', 'Seattle'), ('Mallory', 'Portland'),
            ('Mallory', 'Seattle')], dtype=object)
Perhaps you want something like this?
In [21]: g1.add_suffix('_Count').reset_index()
Out[21]:
      Name      City  City_Count  Name_Count
0    Alice   Seattle           1           1
1      Bob   Seattle           2           2
2  Mallory  Portland           2           2
3  Mallory   Seattle           1           1
Or something like:
In [36]: DataFrame({'count': df1.groupby(["Name", "City"]).size()}).reset_index()
Out[36]:
      Name      City  count
0    Alice   Seattle      1
1      Bob   Seattle      2
2  Mallory  Portland      2
3  Mallory   Seattle      1
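On recent pandas (1.1 or later), DataFrame.value_counts collapses those two steps into one. A minimal sketch, assuming the same df1 as in the question:

import pandas as pd

df1 = pd.DataFrame({"Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
                    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"]})

# value_counts counts unique (Name, City) rows and returns a Series with a
# MultiIndex; reset_index turns the index levels back into ordinary columns.
# Note: value_counts sorts by count descending by default, unlike groupby.
counts = df1.value_counts(["Name", "City"]).reset_index(name="count")
print(counts)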
I want to slightly change the answer given by Wes, because version 0.16.2 requires as_index=False. If you don't set it, you get an empty DataFrame.
Source (from the pandas groupby documentation):

Aggregation functions will not return the groups that you are aggregating over if they are named columns, when as_index=True, the default. The grouped columns will be the indices of the returned object. Passing as_index=False will return the groups that you are aggregating over, if they are named columns. Aggregating functions are ones that reduce the dimension of the returned objects, for example: mean, sum, size, count, std, var, sem, describe, first, last, nth, min, max. This is what happens when you do, for example, DataFrame.sum() and get back a Series. nth can act as a reducer or a filter, see here.
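A minimal sketch of that as_index distinction, using the question's df1 (the as_index=False form with size() assumes a reasonably recent pandas, 1.1 or later):

import pandas as pd

df1 = pd.DataFrame({"Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
                    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"]})

# as_index=True (the default): the grouped columns become the MultiIndex
# of the result, so Name and City disappear as regular columns.
s = df1.groupby(["Name", "City"]).size()

# as_index=False: the grouped columns stay as ordinary columns of the
# returned DataFrame (with size() this works in pandas >= 1.1).
flat = df1.groupby(["Name", "City"], as_index=False).size()

print(s)
print(flat)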
import pandas as pd

df1 = pd.DataFrame({"Name": ["Alice", "Bob", "Mallory", "Mallory", "Bob", "Mallory"],
                    "City": ["Seattle", "Seattle", "Portland", "Seattle", "Seattle", "Portland"]})
print(df1)
#        City     Name
# 0   Seattle    Alice
# 1   Seattle      Bob
# 2  Portland  Mallory
# 3   Seattle  Mallory
# 4   Seattle      Bob
# 5  Portland  Mallory

g1 = df1.groupby(["Name", "City"], as_index=False).count()
print(g1)
#                   City  Name
# Name    City
# Alice   Seattle      1     1
# Bob     Seattle      2     2
# Mallory Portland     2     2
#         Seattle      1     1
EDIT: In version 0.17.1 and later you can use subset in count, and reset_index with parameter name in size:
print(df1.groupby(["Name", "City"], as_index=False).count())
# IndexError: list index out of range

print(df1.groupby(["Name", "City"]).count())
# Empty DataFrame
# Columns: []
# Index: [(Alice, Seattle), (Bob, Seattle), (Mallory, Portland), (Mallory, Seattle)]

print(df1.groupby(["Name", "City"])[["Name", "City"]].count())
#                   Name  City
# Name    City
# Alice   Seattle      1     1
# Bob     Seattle      2     2
# Mallory Portland     2     2
#         Seattle      1     1

print(df1.groupby(["Name", "City"]).size().reset_index(name="count"))
#       Name      City  count
# 0    Alice   Seattle      1
# 1      Bob   Seattle      2
# 2  Mallory  Portland      2
# 3  Mallory   Seattle      1
The difference between count and size is that size counts NaN values while count does not.
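A minimal sketch of that difference, using a hypothetical frame with one missing value:

import numpy as np
import pandas as pd

# Hypothetical data: Alice has one row with a missing City.
df = pd.DataFrame({"Name": ["Alice", "Alice", "Bob"],
                   "City": ["Seattle", np.nan, "Seattle"]})

print(df.groupby("Name").size())           # counts all rows, NaN included
# Name
# Alice    2
# Bob      1
# dtype: int64

print(df.groupby("Name")["City"].count())  # counts non-NaN values only
# Name
# Alice    1
# Bob      1
# Name: City, dtype: int64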