Using Pandas to Find Minimum Values of Grouped Rows

Question

This might be a trivial question but I'm still trying to figure out pandas/numpy.

So, suppose I have a table with the following structure:

group_id | col1 | col2 | col3 |  "A"   |  "B"
   x     |   1  |   2  |  3   |  NaN   |   1
   x     |   3  |   2  |  3   |   1    |   1 
   x     |   4  |   2  |  3   |   2    |   1
   y     |   1  |   2  |  3   |  NaN   |   3 
   y     |   3  |   2  |  3   |   3    |   3 
   z     |   3  |   2  |  3   |   10   |   2
   z     |   2  |   2  |  3   |   6    |   2
   z     |   4  |   2  |  3   |   4    |   2
   z     |   4  |   2  |  3   |   2    |   2

Note that there is a group_id that groups elements in each row. So at the beginning, I have the values for columns group_id and col1-col3.

Then for each row, if any of col1, col2, or col3 equals 1, then "A" is NaN; otherwise the value is based on a formula (irrelevant here, so I put some placeholder numbers in).

That, I know how to do using:

df["A"] = np.where((df['col1'] == 1) | (df['col2'] == 1) | (df['col3'] == 1), np.nan, value)

But for column "B", I need to fill it in with the minimum of values from column A for a specific group.

So for example, "B" is equal to 1 for all rows in group "x" because the minimum value in column A across the group "x" rows is 1.

Similarly, for rows in group "y", the minimum value is 3, and for group "z" the minimum value is 2. How exactly do I do that using pandas...? It's confusing me a little more because the number of rows for a specific group can be of varying size.

If they were all the same size I could just say fill it with the minimum of values in a pre-set range.

I hope that made sense; please let me know if I should provide a clearer example or clarify anything!

Ted Petrou · Accepted Answer

To get the minimum of column A for each group, repeated on every row of that group, use transform:

df.groupby('group_id')['A'].transform('min')
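
As a rough illustration of how the result could be assigned to column B, here is a minimal sketch. It assumes the sample table from the question, with column A already filled in. The `min` aggregation skips NaN by default, so the NaN rows still receive their group's minimum.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question's table (column A already computed).
df = pd.DataFrame({
    "group_id": ["x", "x", "x", "y", "y", "z", "z", "z", "z"],
    "col1": [1, 3, 4, 1, 3, 3, 2, 4, 4],
    "col2": [2] * 9,
    "col3": [3] * 9,
    "A": [np.nan, 1, 2, np.nan, 3, 10, 6, 4, 2],
})

# transform('min') returns a Series with the same index as df:
# every row gets the minimum of A within its own group.
df["B"] = df.groupby("group_id")["A"].transform("min")

print(df)
```

Because the result is aligned with the original index, it works regardless of how many rows each group has.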

piRSquared · Answer

Focus on just ['col1', 'col2', 'col3'].
Check whether they are equal to 1 with eq(1), which is equivalent to == 1.
Check whether any are equal to 1 along axis=1 with any(axis=1). (The positional form any(1) is not accepted by pandas 2.0 and later.)
Use loc to make the assignment.

anyone = df[['col1', 'col2', 'col3']].eq(1).any(axis=1)
df.loc[anyone, 'A'] = np.nan

NumPy equivalent:

anyone = (df[['col1', 'col2', 'col3']].values == 1).any(1)
df.A = np.where(anyone, np.nan, df.A)
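
The two variants above can be compared side by side in a small sketch. The values in column A here are made-up placeholders standing in for the question's unspecified formula, and the names `cols`, `mask_pd`, and `mask_np` are introduced only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col1": [1, 3, 4, 1, 3],
    "col2": [2, 2, 2, 2, 2],
    "col3": [3, 3, 3, 3, 3],
    # Placeholder results of the (unspecified) formula, made up for this sketch.
    "A": [5.0, 1.0, 2.0, 7.0, 3.0],
})

cols = ["col1", "col2", "col3"]

# pandas: element-wise comparison, then "any True in this row?"
mask_pd = df[cols].eq(1).any(axis=1)

# NumPy: same test on the underlying array
mask_np = (df[cols].to_numpy() == 1).any(axis=1)

# Blank out A wherever any of the three columns equals 1.
df.loc[mask_pd, "A"] = np.nan

print(df)
```

Both masks flag the same rows; the pandas version keeps the index, while the NumPy version is a plain boolean array.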

Zhaoyun Ma · Answer

df.groupby('group_id')['A'].min()
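
Note that this expression returns one value per group rather than one per row, so it cannot be assigned to column B directly. One possible way to spread the per-group minimum back onto the rows is to map it through group_id, sketched below using the question's sample data. The names `group_min` and the mapping step are additions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "group_id": ["x", "x", "x", "y", "y", "z", "z", "z", "z"],
    "A": [np.nan, 1, 2, np.nan, 3, 10, 6, 4, 2],
})

# One row per group: a Series indexed by group_id (NaN is skipped by min).
group_min = df.groupby("group_id")["A"].min()

# Look up each row's group_id in that Series to get a per-row column.
df["B"] = df["group_id"].map(group_min)

print(group_min)
print(df)
```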

Tags: python, pandas, dataframe, numpy

shishy
