I have the following DataFrame in pandas:
+-------+-------+
| Col_A | Col_B |
+-------+-------+
| 1234 | |
| 6267 | |
| 6364 | |
| 573 | |
| 0 | |
| 838 | |
| 92 | |
| 3221 | |
+-------+-------+
Col_B should contain only True or False values. It is False by default, but from the first 0 seen in Col_A onward, every remaining row should be True. The DataFrame has over 100,000 rows.
What is the fastest way to set Col_B to True starting at the row where the first 0 appears in Col_A? The expected result:
+-------+--------+
| Col_A | Col_B |
+-------+--------+
| 1234 | False |
| 6267 | False |
| 6364 | False |
| 573 | False |
| 0 | True |
| 838 | True |
| 92 | True |
| 3221 | True |
+-------+--------+
Use idxmax with loc for assignment:
idx = df.Col_A.eq(0).idxmax()
df['Col_B'] = False
df.loc[idx:, 'Col_B'] = True
Col_A Col_B
0 1234 False
1 6267 False
2 6364 False
3 573 False
4 0 True
5 838 True
6 92 True
7 3221 True
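One caveat with the idxmax approach above: on an all-False boolean Series, idxmax returns the first index label, so a column with no 0 at all would be flagged True everywhere. A minimal sketch of a guarded variant follows; the helper name flag_from_first_zero is hypothetical, and it assumes a default, sorted, unique index.

```python
import pandas as pd

def flag_from_first_zero(df, col="Col_A"):
    """Return a boolean Series that is True from the first 0 in `col` onward.

    Hypothetical helper for illustration; assumes a sorted, unique index.
    """
    mask = df[col].eq(0)
    flags = pd.Series(False, index=df.index)
    # Only slice when a 0 exists; otherwise idxmax would point at the first row.
    if mask.any():
        flags.loc[mask.idxmax():] = True
    return flags

df = pd.DataFrame({"Col_A": [1234, 6267, 6364, 573, 0, 838, 92, 3221]})
df["Col_B"] = flag_from_first_zero(df)

# A frame without any 0 stays False throughout.
no_zero = pd.DataFrame({"Col_A": [5, 7, 9]})
no_zero["Col_B"] = flag_from_first_zero(no_zero)
```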
Alternatively, use assign. This approach avoids modifying the original DataFrame:
df.assign(Col_B=(df.index >= idx))
Using eq with cummax:
df.Col_A.eq(0).cummax()
Out[5]:
0 False
1 False
2 False
3 False
4 True
5 True
6 True
7 True
Name: Col_A, dtype: bool
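For completeness, here is a self-contained sketch of the cummax approach applied to the question's data and written back into Col_B. The cumulative maximum of a boolean column stays False until the first True and remains True afterwards, so a column without any 0 simply stays all False.

```python
import pandas as pd

df = pd.DataFrame({"Col_A": [1234, 6267, 6364, 573, 0, 838, 92, 3221]})

# eq(0) marks the zeros; cummax carries the first True forward to the end.
df["Col_B"] = df["Col_A"].eq(0).cummax()

# No special case needed when there is no 0: the result is all False.
no_zero = pd.Series([5, 7, 9]).eq(0).cummax()
```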
You can use the accumulate method of NumPy's logical_or ufunc (with NumPy imported as np):
df.assign(Col_B=np.logical_or.accumulate(df.Col_A.values == 0))
Col_A Col_B
0 1234 False
1 6267 False
2 6364 False
3 573 False
4 0 True
5 838 True
6 92 True
7 3221 True
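Since the question is about speed on 100,000+ rows, it can help to check that the approaches agree on a frame of that size before timing them. The sketch below builds synthetic data (an assumption, not the asker's real data) with a single 0 placed at row 50,000 and compares the three results; timing is left to the reader, for example with timeit.

```python
import numpy as np
import pandas as pd

# Synthetic data: values drawn from 1..9999, so the only 0 is the one we place.
rng = np.random.default_rng(0)
values = rng.integers(1, 10_000, size=100_000)
values[50_000] = 0
df = pd.DataFrame({"Col_A": values})

# idxmax: compare index labels against the label of the first 0.
via_idxmax = df.index >= df["Col_A"].eq(0).idxmax()

# cummax: cumulative maximum of the boolean mask.
via_cummax = df["Col_A"].eq(0).cummax().to_numpy()

# NumPy: running logical OR over the raw array.
via_numpy = np.logical_or.accumulate(df["Col_A"].to_numpy() == 0)

all_agree = np.array_equal(via_idxmax, via_cummax) and np.array_equal(via_cummax, via_numpy)
```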