There are two questions that look similar to this one but are not the same: here and here. Both call a method of GroupBy, such as count() or aggregate(), which I know returns a DataFrame. What I'm asking is how to convert the GroupBy object itself (class pandas.core.groupby.DataFrameGroupBy) into a DataFrame. I'll illustrate below.
Construct an example DataFrame as follows.
import numpy
import pandas

data_list = []
for name in ["sasha", "asa"]:
    for take in ["one", "two"]:
        row = {"name": name, "take": take, "score": numpy.random.rand(), "ping": numpy.random.randint(10, 100)}
        data_list.append(row)
data = pandas.DataFrame(data_list)
The above DataFrame should look like the following (with different numbers, obviously).
    name  ping     score take
0  sasha    72  0.923263  one
1  sasha    14  0.724720  two
2    asa    76  0.774320  one
3    asa    71  0.128721  two
What I want to do is group by the columns "name" and "take" (in that order), so that I get a DataFrame indexed by the MultiIndex constructed from the columns "name" and "take", like below.
               score  ping
name  take
sasha one   0.923263    72
      two   0.724720    14
asa   one   0.774320    76
      two   0.128721    71
How do I achieve that? If I do grouped = data.groupby(["name", "take"]), then grouped is a pandas.core.groupby.DataFrameGroupBy instance. What is the correct way of doing this?
You need set_index:
data = data.set_index(['name','take'])
print(data)
            ping     score
name  take
sasha one     46  0.509177
      two     77  0.828984
asa   one     51  0.637451
      two     51  0.658616
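To make the relationship between the two approaches concrete, here is a minimal sketch that rebuilds the example data and compares set_index with a groupby followed by an aggregation. It assumes, as in the example, that every ("name", "take") pair occurs exactly once; the names indexed and via_groupby are only illustrative. With duplicate pairs, first() would keep only one row per group, whereas set_index would keep them all.

```python
import numpy
import pandas

# Rebuild the example frame from the question (values are random).
data_list = []
for name in ["sasha", "asa"]:
    for take in ["one", "two"]:
        row = {"name": name, "take": take,
               "score": numpy.random.rand(),
               "ping": numpy.random.randint(10, 100)}
        data_list.append(row)
data = pandas.DataFrame(data_list)

# set_index moves the two columns into a MultiIndex; no grouping is involved.
indexed = data.set_index(["name", "take"])

# A GroupBy object only becomes a DataFrame once a method is applied to it.
# Assumption: each (name, take) pair is unique, so first() keeps every row.
# sort=False keeps the groups in order of first appearance.
via_groupby = data.groupby(["name", "take"], sort=False).first()

print(indexed)
print(via_groupby)
```

If the goal is only to re-index existing rows, set_index says so directly and does not depend on the uniqueness assumption; calling indexed.reset_index() moves "name" and "take" back into ordinary columns.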