Iterating over rows in pandas to check the condition

I have the following DF in pandas.

+-------+-------+
| Col_A | Col_B |
+-------+-------+
|  1234 |       |
|  6267 |       |
|  6364 |       |
|   573 |       |
|     0 |       |
|   838 |       |
|    92 |       |
|  3221 |       |
+-------+-------+

Col_B should contain only True or False values. It is False by default, but from the row where the first 0 appears in Col_A onward, every remaining row should be True. The DF has over 100,000 rows.

What is the fastest way to set Col_B to True from the first 0 in Col_A onward?

+-------+--------+
| Col_A | Col_B  |
+-------+--------+
|  1234 | False  |
|  6267 | False  |
|  6364 | False  |
|   573 | False  |
|     0 | True   |
|   838 | True   |
|    92 | True   |
|  3221 | True   |
+-------+--------+
Pinky the mouse asked Aug 27 '18 15:08


3 Answers

Using idxmax with loc for assignment

idx = df.Col_A.eq(0).idxmax()
df['Col_B'] = False
df.loc[idx:, 'Col_B'] = True

   Col_A  Col_B
0   1234  False
1   6267  False
2   6364  False
3    573  False
4      0   True
5    838   True
6     92   True
7   3221   True
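
One caveat with this approach: if Col_A contains no 0 at all, eq(0).idxmax() still returns the first index label, so the whole column would be set to True. Below is a minimal sketch of a guard against that case, assuming the sample data from the question. The helper name flag_after_first_zero and its parameters are hypothetical, not part of the original answer.

```python
import pandas as pd


def flag_after_first_zero(df, col="Col_A", out="Col_B"):
    """Return a copy of df where `out` is True from the first 0 in `col` onward."""
    is_zero = df[col].eq(0)
    result = df.copy()
    result[out] = False
    # Without this check, idxmax() on an all-False mask returns the first label,
    # which would wrongly mark every row as True.
    if is_zero.any():
        result.loc[is_zero.idxmax():, out] = True
    return result


df = pd.DataFrame({"Col_A": [1234, 6267, 6364, 573, 0, 838, 92, 3221]})
with_zero = flag_after_first_zero(df)
no_zero = flag_after_first_zero(df[df.Col_A != 0])  # same data with the 0 row dropped
```

Here with_zero matches the output above, while no_zero keeps Col_B False everywhere.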

Using assign:

This approach returns a new DataFrame and leaves the original unmodified. It reuses idx from the snippet above.

df.assign(Col_B=(df.index >= idx))
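
The comparison df.index >= idx relies on the index being sorted, as the default RangeIndex is. If that cannot be assumed, a position-based variant may be safer. The following is a sketch only: the unsorted string index is made up for illustration, and np.arange with argmax is swapped in for the label comparison.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with an unsorted, non-numeric index.
df = pd.DataFrame({"Col_A": [1234, 6267, 0, 838]}, index=["d", "a", "c", "b"])

is_zero = df["Col_A"].to_numpy() == 0
# Position of the first 0, or len(df) when there is none (so nothing is flagged).
pos = is_zero.argmax() if is_zero.any() else len(df)

# Compare row positions rather than index labels.
out = df.assign(Col_B=np.arange(len(df)) >= pos)
```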
user3483203 answered Sep 21 '22 00:09


Using eq with cummax

df.Col_A.eq(0).cummax()
Out[5]: 
0    False
1    False
2    False
3    False
4     True
5     True
6     True
7     True
Name: Col_A, dtype: bool
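
This works because the cumulative maximum of a boolean Series stays False until the first True and remains True afterwards. A short sketch of assigning the result to Col_B, assuming the column names from the question; the second frame without a 0 is added here only to show the edge case:

```python
import pandas as pd

df = pd.DataFrame({"Col_A": [1234, 6267, 6364, 573, 0, 838, 92, 3221]})
# eq(0) marks the zeros; cummax() carries the first True forward to the end.
df["Col_B"] = df["Col_A"].eq(0).cummax()

# With no 0 present the mask never turns True, so Col_B stays False.
no_zero = pd.DataFrame({"Col_A": [5, 7, 9]})
no_zero["Col_B"] = no_zero["Col_A"].eq(0).cummax()
```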
BENY answered Sep 21 '22 00:09


You can use the accumulate method of NumPy's logical_or ufunc

import numpy as np

df.assign(Col_B=np.logical_or.accumulate(df.Col_A.values == 0))

   Col_A  Col_B
0   1234  False
1   6267  False
2   6364  False
3    573  False
4      0   True
5    838   True
6     92   True
7   3221   True
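
Since the question asks which way is fastest, the three approaches can be timed on a frame of the size mentioned there. This is a rough sketch, not a benchmark result: the random data, the position of the 0, and the function names are all assumptions, and the timings will vary by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data (an assumption): 100,000 non-zero values with one 0 halfway through.
rng = np.random.default_rng(0)
df = pd.DataFrame({"Col_A": rng.integers(1, 10_000, size=100_000)})
df.loc[50_000, "Col_A"] = 0


def via_idxmax():
    out = pd.Series(False, index=df.index)
    out.loc[df.Col_A.eq(0).idxmax():] = True
    return out


def via_cummax():
    return df.Col_A.eq(0).cummax()


def via_accumulate():
    return np.logical_or.accumulate(df.Col_A.to_numpy() == 0)


for fn in (via_idxmax, via_cummax, via_accumulate):
    seconds = timeit.timeit(fn, number=100)
    print(f"{fn.__name__}: {seconds / 100 * 1e3:.3f} ms per call")
```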
piRSquared answered Sep 22 '22 00:09