I have a dataframe like this, where each row holds the trips made by one individual:
IndividualID  Ifmarried  Sex  Trip1  Trip2  Trip3  Trip4  Trip5  Trip6  Trip7  Trip8  Trip9
200100001     1          2    23     1      2      4      4      4      1      5      5
200100002     1          1    21     2      12     3      1      55     7      7
200100003     2          1    12     3      1      6      3
200100004     1          2    1
200100005     1          2    6      5      1      9      3      5      6
200100006     1          2    23     4      4      1      4      3      6      5
What I would like to do is: for each row, if there is a trip whose value is 1, delete all the trips that follow it. The new dataframe would then look like this:
IndividualID  Ifmarried  Sex  Trip1  Trip2  Trip3  Trip4  Trip5  Trip6  Trip7  Trip8  Trip9
200100001     1          2    23     1
200100002     1          1    21     2      12     3      1
200100003     2          1    12     3      1
200100004     1          2    1
200100005     1          2    6      5      1
200100006     1          2    23     4      4      1
Please note that the columns 'Ifmarried' and 'Sex' can also contain a '1', so I have to make sure they are not mixed up with the trips.
Thanks in advance!
Use the double cumsum trick:
v = df.filter(like='Trip')
df.iloc[:, 3:] = v[v.eq(1).cumsum(1).cumsum(1).le(1)].fillna('')
df
   IndividualID  Ifmarried  Sex  Trip1  Trip2  Trip3  Trip4  Trip5  Trip6  Trip7  \
0     200100001          1    2     23      1
1     200100002          1    1     21      2     12      3      1
2     200100003          2    1     12      3      1
3     200100004          1    2      1
4     200100005          1    2      6      5      1
5     200100006          1    2     23      4      4      1
Trip8 Trip9
0
1
2
3
4
5
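To see why the double cumsum works, it may help to look at the intermediate masks one at a time. The following is only a sketch on a small made-up frame that mirrors the question's layout (missing trips as NaN); the names `is_one`, `first`, `second` and `keep` are illustrative and not part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the question's layout (NaN = no trip).
df = pd.DataFrame({
    "IndividualID": [200100001, 200100003, 200100004],
    "Ifmarried": [1, 2, 1],
    "Sex": [2, 1, 2],
    "Trip1": [23, 12, 1],
    "Trip2": [1, 3, np.nan],
    "Trip3": [2, 1, np.nan],
    "Trip4": [4, 6, np.nan],
})

# Only the Trip columns are inspected, so a 1 in Ifmarried/Sex is ignored.
trips = df.filter(like="Trip")

is_one = trips.eq(1)             # True where a trip equals 1
first = is_one.cumsum(axis=1)    # 0 before the first 1, >= 1 from it onward
second = first.cumsum(axis=1)    # exactly 1 at the first 1, > 1 after it
keep = second.le(1)              # True up to and including the first 1

# Cells outside the mask become NaN.
result = trips.where(keep)
print(result)
```

A single cumsum cannot tell the first 1 apart from the cells after it, since both have a running count of at least 1. The second cumsum is what separates them: it stays at 1 only on the first 1 and grows past 1 on every later column.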
You can use shift with cumsum:
df.iloc[:,3:]=df.iloc[:,3:][df.iloc[:,3:].eq(1).shift(axis=1).cumsum(1).fillna(0).eq(0)]
df
Out[85]:
IndividualID Ifmarried Sex Trip1 Trip2 Trip3 Trip4 Trip5 Trip6 \
0 200100001 1 2 23 1.0 NaN NaN NaN NaN
1 200100002 1 1 21 2.0 12.0 3.0 1.0 NaN
2 200100003 2 1 12 3.0 1.0 NaN NaN NaN
3 200100004 1 2 1 NaN NaN NaN NaN NaN
4 200100005 1 2 6 5.0 1.0 NaN NaN NaN
5 200100006 1 2 23 4.0 4.0 1.0 NaN NaN
Trip7 Trip8 Trip9
0 NaN NaN NaN
1 NaN NaN NaN
2 NaN NaN NaN
3 NaN NaN NaN
4 NaN NaN NaN
5 NaN NaN NaN
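The shifted cumulative sum above counts how many 1s occur strictly before each column, and a cell is kept when that count is 0. As a sketch of the same idea, the count can also be obtained without `shift` by subtracting the mask from its own cumulative sum. This is a substitute formulation, not the code from the answer, and the sample frame and the names `ones`, `ones_before` and `trimmed` are hypothetical. The last step assumes you would rather have pandas' nullable `Int64` dtype than the float columns that NaN otherwise forces.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring the question's layout (NaN = no trip).
df = pd.DataFrame({
    "IndividualID": [200100001, 200100003, 200100004],
    "Ifmarried": [1, 2, 1],
    "Sex": [2, 1, 2],
    "Trip1": [23, 12, 1],
    "Trip2": [1, 3, np.nan],
    "Trip3": [2, 1, np.nan],
    "Trip4": [4, 6, np.nan],
})

# Assumption: the trip columns start at position 3, as in the answers above.
trip_cols = df.columns[3:]

ones = df[trip_cols].eq(1).astype(int)     # 1 where a trip equals 1, else 0
ones_before = ones.cumsum(axis=1) - ones   # number of 1s strictly to the left

# Keep a cell only if no 1 precedes it; everything else becomes NaN.
trimmed = df[trip_cols].where(ones_before.eq(0))

out = df.copy()
# Optional: nullable integers show 1 and <NA> instead of 1.0 and NaN.
out[trip_cols] = trimmed.astype("Int64")
print(out)
```

Working on a copy keeps the original `df` intact, which makes it easy to compare before and after.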