
Repeating values in a "group by" pandas dataframe

I have the following pandas DataFrame:

     email   cat  class_price
0   [email protected]  cat1            1
1   [email protected]  cat2            2
2   [email protected]  cat2            4
3   [email protected]  cat2            4
4   [email protected]  cat2            1
5   [email protected]  cat1            3
6   [email protected]  cat1            2
7   [email protected]  cat2            1
8   [email protected]  cat2            4
9   [email protected]  cat2            2
10  [email protected]  cat3            1
11  [email protected]  cat1            1

I want to group by email and cat, and for each group take the max of class_price.

I'm using:

test_df2 = test_df.groupby(['email','cat'])['class_price'].max()

The output is:

email             cat 
[email protected]  cat1    2
                  cat2    4
[email protected]  cat2    2
                  cat3    1
[email protected]  cat1    3
                  cat2    4

But how can I get a result where the grouped columns keep their repeated values, so that it can be written as a proper table with every value filled in:

email             cat      maxvalue 
[email protected]    cat2     2
[email protected]    cat1     2
[email protected]    cat3     3

Note: the example output doesn't match the example input; it is only there to illustrate the idea.

stackit asked Apr 17 '16 12:04




1 Answer

You can call reset_index() on the result, as in the other answer, or pass as_index=False to groupby so that the grouping keys stay as regular columns:


test_df2 = test_df.groupby(['email','cat'], as_index=False)['class_price'].max()
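
For anyone who wants to try both variants side by side, here is a minimal sketch. The sample data and the example.com addresses are made up for illustration (the addresses in the question are obfuscated), and the rename to maxvalue is only there to match the column name in the desired output.

```python
import pandas as pd

# Hypothetical sample data; not the frame from the question.
test_df = pd.DataFrame({
    "email": ["a@example.com", "a@example.com", "a@example.com",
              "b@example.com", "b@example.com"],
    "cat": ["cat1", "cat1", "cat2", "cat2", "cat3"],
    "class_price": [1, 2, 4, 2, 1],
})

# Variant 1: as_index=False keeps the grouping keys as ordinary columns,
# so every row carries its own email and cat values.
flat = test_df.groupby(["email", "cat"], as_index=False)["class_price"].max()

# Variant 2: aggregate into a Series with a MultiIndex, then move the
# index levels back into columns with reset_index().
flat_reset = test_df.groupby(["email", "cat"])["class_price"].max().reset_index()

# Optional: rename the aggregated column to match the desired output.
result = flat.rename(columns={"class_price": "maxvalue"})
print(result)
```

Both variants should give the same flat table. The blank cells in the original output are only how pandas displays repeated MultiIndex labels; the values are still there in the index.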

Blue Bird answered Sep 22 '22 12:09