How to find minimum value in a column based on condition in an another column of a dataframe?

Tags: python, pandas

I have a dataframe like below:

Number  Req Response 
0       3    6
1       5    0
2       33   4
3       15   3
4       12   2

I would like to find the minimum 'Response' value among the rows up to the one where 'Req' is 15.

I tried the code below:

min_val=[]
for row in range(len(df)):
# if the next row of 'Req' contains 15, append the current row value of 'Response'
  if(df[row+1].loc[df[row+1]['Req'] == 15]): 
         min_val.append(df['Response'].min())
  else:
         min_val.append(0)

I get an 'invalid type comparison' error.

I expect the following output:

Min value of df['Response'] is: 0
asked Aug 28 '19 at 13:08 by hakuna_code


2 Answers

If the value 15 might not be present in the data, use this more general solution:

df = df.reset_index(drop=True)
out = df.loc[df.Req.eq(15)[::-1].cumsum().ne(0), 'Response'].sort_values()
print (out)
1    0
3    3
2    4
0    6
Name: Response, dtype: int64

print (next(iter(out), 'no match'))
0

Details:

print (df.Req.eq(15))
0    False
1    False
2    False
3     True
4    False
Name: Req, dtype: bool

print (df.Req.eq(15)[::-1])
4    False
3     True
2    False
1    False
0    False
Name: Req, dtype: bool

print (df.Req.eq(15)[::-1].cumsum())
4    0
3    1
2    1
1    1
0    1
Name: Req, dtype: int32

print (df.Req.eq(15)[::-1].cumsum().ne(0))
4    False
3     True
2     True
1     True
0     True
Name: Req, dtype: bool
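
One consequence of the reversed cumulative sum is worth spelling out: if 15 occurs more than once, the mask stays True up to and including the last occurrence, not the first. The sketch below illustrates this on a made-up Series (the values are an assumption for illustration, not the asker's data):

```python
import pandas as pd

# Hypothetical data with two matches, at positions 1 and 3.
req = pd.Series([3, 15, 33, 15, 12], name="Req")

# Reverse, accumulate the matches, then flag rows whose running count is non-zero.
mask = req.eq(15)[::-1].cumsum().ne(0)

# The mask is still labelled by the original index, so it can be put back in order.
mask = mask.sort_index()
print(mask.tolist())  # [True, True, True, True, False]
```

When the mask is passed to df.loc, pandas aligns it on the index labels, so the reversed order does not need to be undone before indexing.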

Test with no matching value:

print (df)
   Number  Req  Response
0       0    3         6
1       1    5         0
2       2   33         4
3       3  150         3
4       4   12         2


df = df.reset_index(drop=True)
out = df.loc[df.Req.eq(15)[::-1].cumsum().ne(0), 'Response'].sort_values()
print (out)
Series([], Name: Response, dtype: int64)

print (next(iter(out), 'no match'))
no match
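
Both cases above can be combined into one self-contained script. This is only a sketch: the helper name min_response_up_to and its target parameter are made up for illustration, and it returns None instead of the 'no match' string.

```python
import pandas as pd

def min_response_up_to(df, target=15):
    """Minimum Response among rows up to and including the last row where
    Req == target, or None if target never occurs (hypothetical helper)."""
    df = df.reset_index(drop=True)
    # True for every row at or before the last match, all False if no match.
    mask = df["Req"].eq(target)[::-1].cumsum().ne(0)
    selected = df.loc[mask, "Response"]
    return None if selected.empty else selected.min()

df = pd.DataFrame({
    "Number": range(5),
    "Req": [3, 5, 33, 15, 12],
    "Response": [6, 0, 4, 3, 2],
})

print(min_response_up_to(df))       # 0
print(min_response_up_to(df, 150))  # None
```

Taking min() directly avoids sorting the whole selection when only the smallest value is needed.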
answered Sep 21 '22 at 05:09 by jezrael


One way is to use idxmax to find the first index where Req equals 15, use that to slice the dataframe, and take the minimum Response:

df.loc[:df.Req.eq(15).idxmax(), 'Response'].min()
# 0

Where:

df.Req.eq(15)

0    False
1    False
2    False
3     True
4    False
Name: Req, dtype: bool

Here idxmax returns the index of the first True value, in this case 3. Because .loc slices by label and includes the end label, the row where Req is 15 is part of the selection.
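
One caveat with this approach: when no row matches, the boolean Series is all False and idxmax returns the first index label, so the one-liner silently returns the first row's Response. A minimal sketch of a guarded version follows; the function name min_response_to_first is hypothetical and the data is rebuilt from the question.

```python
import pandas as pd

def min_response_to_first(df, target=15):
    """Minimum Response from the first row through the first row where
    Req == target, or None if target never occurs (hypothetical helper)."""
    matches = df["Req"].eq(target)
    if not matches.any():
        return None  # without this check, idxmax would point at the first row
    first = matches.idxmax()
    # Label-based slicing with .loc includes the end label.
    return df.loc[:first, "Response"].min()

df = pd.DataFrame({
    "Number": range(5),
    "Req": [3, 5, 33, 15, 12],
    "Response": [6, 0, 4, 3, 2],
})

print(min_response_to_first(df))       # 0
print(min_response_to_first(df, 150))  # None

# Unguarded one-liner with a value that is absent: returns row 0's Response.
unguarded = df.loc[:df.Req.eq(150).idxmax(), "Response"].min()
print(unguarded)  # 6
```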

answered Sep 21 '22 at 05:09 by yatu