How to change only the maximum value of a group in pandas dataframe

Tags:

python

pandas

I have the following dataset:

Item Count
A    60
A    20
A    21
B    33
B    33
B    32

Code to reproduce:

import pandas as pd
df = pd.DataFrame([
    ['A', 60],
    ['A', 20],
    ['A', 21],
    ['B', 33],
    ['B', 33],
    ['B', 32],
], 
    columns=['Item', 'Count'])

Suppose I want to change only the maximum value within each group of the "Item" column by adding 1.

The output should look like this:

Item Count New_Count
A    60    61
A    20    20
A    21    21
B    33    34
B    33    34
B    32    32

I tried df['New_Count'] = df.groupby(['Item'])['Count'].transform(lambda x: max(x) + 1), but every value in "New_Count" was set to the maximum of its group plus 1:

Item Count New_Count
A    60    61
A    20    61
A    21    61
B    33    34
B    33    34
B    32    34
Yubraj Bhusal asked Jun 24 '20 12:06


4 Answers

Use idxmax:

idx = df.groupby("Item")["Count"].idxmax()
df["New_Count"] = df["Count"]
df.loc[idx, "New_Count"] += 1

This will only increment the first occurrence of the maximum in each group. If you want to increment all the maximum values in the case of a tie, you can use transform instead. Just replace the first line above with:

idx = df.groupby("Item")["Count"].transform("max") == df["Count"]
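
To make the difference in tie handling concrete, here is a minimal sketch that runs both variants side by side on the data from the question. The column names First_Only and All_Ties are made up for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    [["A", 60], ["A", 20], ["A", 21], ["B", 33], ["B", 33], ["B", 32]],
    columns=["Item", "Count"],
)

# Variant 1: idxmax returns one row label per group (the first maximum).
first_max = df.groupby("Item")["Count"].idxmax()
df["First_Only"] = df["Count"]
df.loc[first_max, "First_Only"] += 1

# Variant 2: a boolean mask marks every row equal to its group maximum.
is_max = df.groupby("Item")["Count"].transform("max") == df["Count"]
df["All_Ties"] = df["Count"]
df.loc[is_max, "All_Ties"] += 1

print(df)
```

For group B, First_Only increments only one of the two 33s, while All_Ties increments both, which is what the expected output in the question asks for.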
Dan answered Oct 22 '22 11:10


You can use idxmax() to get the index of the maximum for each group, and increment only those rows, like this:

max_idxs = df.groupby(['Item'])['Count'].idxmax()
df['New_Count'] = df['Count']  # copy entire column
df.loc[max_idxs, 'New_Count'] += 1  # increment only the maximum item of each group by 1 (.loc avoids chained assignment)
Adam.Er8 answered Oct 22 '22 11:10


Here's another way, using duplicated instead of groupby (note that it modifies Count in place and increments only one row per group):

df.loc[~df.sort_values('Count', ascending=False).duplicated('Item'), 'Count'] += 1

Output:

  Item  Count
0    A     61
1    A     20
2    A     21
3    B     34
4    B     33
5    B     32
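
If the original Count column should stay untouched, the same idea can be written out step by step. This is only a sketch under a few assumptions: the New_Count column name is borrowed from the question, the intermediate names ordered and is_first_max are made up here, and kind="stable" is added so that tied rows keep their original relative order during the sort.

```python
import pandas as pd

df = pd.DataFrame(
    [["A", 60], ["A", 20], ["A", 21], ["B", 33], ["B", 33], ["B", 32]],
    columns=["Item", "Count"],
)

# Sort descending with a stable algorithm so tied rows keep their relative order.
ordered = df.sort_values("Count", ascending=False, kind="stable")

# After sorting, the first row seen for each Item holds that group's maximum.
is_first_max = ~ordered.duplicated("Item")

# Put the mask back into the original row order before combining it with df.
is_first_max = is_first_max.reindex(df.index)

# Adding the mask as 0/1 leaves Count itself unchanged.
df["New_Count"] = df["Count"] + is_first_max.astype(int)

print(df)
```

As in the one-liner, only one row per group is incremented, so this does not cover the tie case from the question.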
Scott Boston answered Oct 22 '22 11:10


To change every maximum value when the maximum is repeated, you will need .groupby(), .join() and np.where():

import numpy as np
import pandas as pd

df = pd.DataFrame([
['A', 60],
['A', 60],
['A', 20],
['A', 21],
['B', 21],
['B', 33],
['B', 34],
], columns=['Item', 'Count'])

s = df.groupby('Item')['Count'].max().rename('newCount')
df = df.set_index('Item').join(s).reset_index()
df['newCount'] = np.where(df['Count'] != df['newCount'], df['Count'], (df['newCount'] + 1))
df.head(10)
#output
    Item    Count   newCount
0   A       60      61
1   A       60      61
2   A       20      20
3   A       21      21
4   B       21      21
5   B       33      33
6   B       34      35

Edit

We can replace the .join() with a .transform(), as suggested by @Dan:

df['newCount'] = df.groupby('Item')['Count'].transform('max')
df['newCount'] = np.where(df['Count'] != df['newCount'], df['Count'], (df['newCount'] + 1))
#output
    Item    Count   newCount
0   A       60      61
1   A       60      61
2   A       20      20
3   A       21      21
4   B       21      21
5   B       33      33
6   B       34      35
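
The transform plus np.where pattern above could also be wrapped in a small reusable helper. The function name bump_group_max and its step parameter are hypothetical, introduced only for this sketch; it returns a new Series rather than modifying the frame.

```python
import numpy as np
import pandas as pd


def bump_group_max(df, group_col, value_col, step=1):
    """Return value_col with every group-wise maximum increased by step."""
    # Broadcast each group's maximum back onto the rows of that group.
    group_max = df.groupby(group_col)[value_col].transform("max")
    # Rows equal to their group maximum get the increment; others are unchanged.
    bumped = np.where(df[value_col] == group_max, df[value_col] + step, df[value_col])
    return pd.Series(bumped, index=df.index, name=value_col)


df = pd.DataFrame(
    [["A", 60], ["A", 60], ["A", 20], ["A", 21], ["B", 21], ["B", 33], ["B", 34]],
    columns=["Item", "Count"],
)
df["newCount"] = bump_group_max(df, "Item", "Count")

print(df)
```

Because the comparison is done row by row against the broadcast maximum, tied maxima (the two 60s in group A) are all incremented.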
Terry answered Oct 22 '22 11:10