In Pandas with Groupby: assign a value from a column conditioned on another column

I have a DataFrame like this:

import numpy as np
import pandas as pd

df = pd.DataFrame({'col0': list('aabb'),
                   'col1': np.arange(4),
                   'col2': list('wxyz'),
                   'col3': np.nan})

    col0 col1 col2 col3
0   a    0    w    NaN
1   a    1    x    NaN
2   b    2    y    NaN
3   b    3    z    NaN

I want to assign to 'col3' the value of 'col2' corresponding to the minimum value of 'col1', grouped by 'col0'. Expected output:

    col0 col1 col2 col3
0   a    0    w    w
1   a    1    x    w
2   b    2    y    y
3   b    3    z    y

If grouping by 'col0' were not needed, this would work:

df['col3'] = df[df['col1']==df['col1'].min()]['col2'].iloc[0]

    col0 col1 col2 col3
0   a    0    w    w
1   a    1    x    w
2   b    2    y    w
3   b    3    z    w

Similarly, this is my attempt using groupby/apply, which doesn't work as expected ('col3' stays NaN):

df['col3'] = df.groupby('col0').apply(lambda x: x[x['col1']==x['col1'].min()]['col2'].iloc[0])

    col0 col1 col2 col3
0   a    0    w    NaN
1   a    1    x    NaN
2   b    2    y    NaN
3   b    3    z    NaN

asked Jun 17 '21 17:06

makpalan

3 Answers

Another option is to transform with idxmin and loc:

df["col3"] = df.groupby("col0").col1.transform(lambda x: df.loc[x.idxmin(), "col2"])

to get

  col0  col1 col2 col3
0    a     0    w    w
1    a     1    x    w
2    b     2    y    y
3    b     3    z    y
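
As a hedged illustration of why this works where the question's apply attempt left NaN: a per-group result is indexed by the group keys ('a', 'b'), and column assignment aligns on index labels, so nothing matches the frame's 0..3 index. transform instead broadcasts each group's value back onto its rows. The sketch below rebuilds the question's frame; the names per_group, aligned and broadcast are mine, not from the original posts.

```python
import numpy as np
import pandas as pd

# The question's frame, rebuilt so the sketch is self-contained.
df = pd.DataFrame({'col0': list('aabb'),
                   'col1': np.arange(4),
                   'col2': list('wxyz')})

# One value per group, indexed by the group keys 'a' and 'b'.
per_group = df.groupby('col0')['col1'].idxmin().map(df['col2'])

# Column assignment aligns on index labels; reindexing to df's 0..3 index
# mimics that alignment and finds no matching labels.
aligned = per_group.reindex(df.index)

# transform broadcasts each group's value back onto that group's rows.
broadcast = df.groupby('col0')['col1'].transform(
    lambda s: df.loc[s.idxmin(), 'col2'])

print(aligned.isna().all())
print(broadcast.tolist())
```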

answered Nov 14 '22 06:11

Mustafa Aydın

You can use groupby.apply to get a Series with one value per group and then merge it into the DataFrame (here df starts without 'col3'):

df
  col0  col1 col2
0    a     0    w
1    a     1    x
2    b     2    y
3    b     3    z

col3 = df.groupby("col0").apply(lambda x: x.loc[x["col1"].idxmin(), "col2"])
col3.name = "col3"
df = df.merge(col3, how="left", left_on="col0", right_index=True)

df
  col0  col1 col2 col3
0    a     0    w    w
1    a     1    x    w
2    b     2    y    y
3    b     3    z    y
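
A possible variation on the same idea, offered as a sketch rather than as part of the original answer: build the per-group lookup from the rows selected by idxmin, then use Series.map on 'col0' instead of a merge. The names min_rows and lookup are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col0': list('aabb'),
                   'col1': np.arange(4),
                   'col2': list('wxyz')})

# Rows holding each group's minimum col1, re-keyed by the group label.
min_rows = df.loc[df.groupby('col0')['col1'].idxmin()]
lookup = min_rows.set_index('col0')['col2']

# Map every row's group label to that group's col2 value.
df['col3'] = df['col0'].map(lookup)

print(df)
```

Mapping keeps the original row order and index, and it overwrites 'col3' in place if that column already exists.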

answered Nov 14 '22 06:11

Stryder

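You can use groupby with transform('idxmin') and then Series.map. Because idxmin returns index labels rather than col1 values, map them through df['col2'], which is keyed by the index:

idx = df.groupby("col0")['col1'].transform('idxmin')
df['col3'] = df['col3'].fillna(idx.map(df['col2']))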

print(df)

  col0  col1 col2 col3
0    a     0    w    w
1    a     1    x    w
2    b     2    y    y
3    b     3    z    y
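
One caveat worth sketching: transform('idxmin') yields index labels, so the lookup has to be keyed by the index. A lookup keyed by col1 values would only work when col1 happens to coincide with the index, as it does in the question's data. The frame below is hypothetical, chosen so that the two differ.

```python
import pandas as pd

# Hypothetical data: index labels 10..13 differ from the col1 values.
df = pd.DataFrame({'col0': list('aabb'),
                   'col1': [5, 3, 8, 2],
                   'col2': list('wxyz')},
                  index=[10, 11, 12, 13])

# Index label of each group's minimum col1, repeated for every row.
idx = df.groupby('col0')['col1'].transform('idxmin')

# Looking the labels up in col2 (keyed by index) gives the intended values.
by_index = idx.map(df['col2'])

# A dict keyed by col1 values has no entries for labels 11 and 13.
by_col1 = idx.map(dict(zip(df['col1'], df['col2'])))

print(by_index.tolist())
print(by_col1.isna().all())
```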

answered Nov 14 '22 04:11

anky


Tags: python, pandas, pandas-groupby, apply