How to delete certain cells in each row, using python pandas?

Tags:

python

pandas

I have a dataframe like this, where each row represents the trips that one individual made:

IndividualID    Ifmarried Sex Trip1 Trip2 Trip3 Trip4 Trip5 Trip6 Trip7 Trip8 Trip9
200100001       1         2    23    1     2     4     4      4    1     5     5
200100002       1         1    21    2     12    3     1      55   7     7
200100003       2         1    12    3     1     6     3     
200100004       1         2    1   
200100005       1         2    6     5     1     9     3      5    6  
200100006       1         2    23    4     4     1     4      3    6     5 
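
For anyone who wants to reproduce the table above, here is a minimal sketch of how it could be built. It assumes the blank cells are missing values (NaN) and that the trip values are plain numbers; the original post does not say how the data was loaded.

```python
import numpy as np
import pandas as pd

nan = np.nan  # assumption: blank cells in the table are missing values

rows = [
    [200100001, 1, 2, 23, 1, 2, 4, 4, 4, 1, 5, 5],
    [200100002, 1, 1, 21, 2, 12, 3, 1, 55, 7, 7, nan],
    [200100003, 2, 1, 12, 3, 1, 6, 3, nan, nan, nan, nan],
    [200100004, 1, 2, 1, nan, nan, nan, nan, nan, nan, nan, nan],
    [200100005, 1, 2, 6, 5, 1, 9, 3, 5, 6, nan, nan],
    [200100006, 1, 2, 23, 4, 4, 1, 4, 3, 6, 5, nan],
]
columns = ["IndividualID", "Ifmarried", "Sex"] + [f"Trip{i}" for i in range(1, 10)]

df = pd.DataFrame(rows, columns=columns)

# Only the Trip columns should ever be masked, so select them by name
trip_cols = df.filter(like="Trip").columns
print(df)
```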

What I would like to do is: for each row, if there is a trip whose value is 1, delete all the trips that follow it. The new dataframe would hopefully look like this:

IndividualID    Ifmarried Sex Trip1 Trip2 Trip3 Trip4 Trip5 Trip6 Trip7 Trip8 Trip9
200100001       1         2    23    1  
200100002       1         1    21    2     12    3     1   
200100003       2         1    12    3     1          
200100004       1         2    1   
200100005       1         2    6     5     1      
200100006       1         2    23    4     4     1      

Please note that the columns 'Ifmarried' and 'Sex' can also contain a 1, so I have to make sure they are not mixed up with the trips.

Thanks in advance!

Steward asked Apr 11 '26 12:04


2 Answers

Use the double cumsum trick:

v = df.filter(like='Trip')
df.iloc[:, 3:] = v[v.eq(1).cumsum(1).cumsum(1).le(1)].fillna('')

df
   IndividualID  Ifmarried  Sex  Trip1 Trip2 Trip3 Trip4 Trip5 Trip6 Trip7  \
0     200100001          1    2     23     1                                 
1     200100002          1    1     21     2    12     3     1               
2     200100003          2    1     12     3     1                           
3     200100004          1    2      1                                       
4     200100005          1    2      6     5     1                           
5     200100006          1    2     23     4     4     1                     

  Trip8 Trip9  
0              
1              
2              
3              
4              
5              
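
To see why two cumulative sums are needed, it may help to walk through the intermediate steps on a small frame. The sketch below uses a made-up five-column `trips` frame and illustrative variable names, and it keeps NaN for removed cells instead of filling them with empty strings. The first cumsum counts how many 1s have been seen so far in each row; the second cumsum only stays at or below 1 up to and including the first 1, which is exactly the part to keep.

```python
import numpy as np
import pandas as pd

# Hypothetical small example, not the asker's full data
trips = pd.DataFrame(
    [[23, 1, 2, 4, 1],
     [6, 5, 1, 9, 3],
     [1, np.nan, np.nan, np.nan, np.nan]],
    columns=[f"Trip{i}" for i in range(1, 6)],
)

# 1 where the trip value equals 1, else 0 (NaN compares as not equal)
is_one = trips.eq(1).astype(int)

# Number of 1s seen so far in each row, including the current cell
ones_so_far = is_one.cumsum(axis=1)

# Running total of that count: it is 0 before the first 1, exactly 1 at
# the first 1, and at least 2 for every cell after it
running_total = ones_so_far.cumsum(axis=1)

# Keep everything up to and including the first 1
keep = running_total.le(1)
result = trips.where(keep)
print(result)
```

A single cumsum would not be enough: testing it with `le(1)` would keep cells after the first 1 until a second 1 appears, and testing it with `eq(0)` would drop the first 1 itself.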
cs95 answered Apr 14 '26 00:04


You can use shift with cumsum:

df.iloc[:,3:]=df.iloc[:,3:][df.iloc[:,3:].eq(1).shift(axis=1).cumsum(1).fillna(0).eq(0)]
df
Out[85]: 
   IndividualID  Ifmarried  Sex  Trip1  Trip2  Trip3  Trip4  Trip5  Trip6  \
0     200100001          1    2     23    1.0    NaN    NaN    NaN    NaN   
1     200100002          1    1     21    2.0   12.0    3.0    1.0    NaN   
2     200100003          2    1     12    3.0    1.0    NaN    NaN    NaN   
3     200100004          1    2      1    NaN    NaN    NaN    NaN    NaN   
4     200100005          1    2      6    5.0    1.0    NaN    NaN    NaN   
5     200100006          1    2     23    4.0    4.0    1.0    NaN    NaN   
   Trip7  Trip8  Trip9  
0    NaN    NaN    NaN  
1    NaN    NaN    NaN  
2    NaN    NaN    NaN  
3    NaN    NaN    NaN  
4    NaN    NaN    NaN  
5    NaN    NaN    NaN  
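
The same idea can be broken into steps, as a sketch on a made-up five-column frame. Shifting the "is 1" mask one column to the right means each cell sees only the 1s strictly before it, so a single cumsum is enough. This version differs slightly from the one-liner above: it casts the mask to integers and passes `fill_value=0` to `shift`, which avoids the NaN column that the shift would otherwise introduce. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical small example, not the asker's full data
trips = pd.DataFrame(
    [[23, 1, 2, 4, 1],
     [6, 5, 1, 9, 3],
     [1, np.nan, np.nan, np.nan, np.nan]],
    columns=[f"Trip{i}" for i in range(1, 6)],
)

# 1 where the trip value equals 1, else 0
is_one = trips.eq(1).astype(int)

# Shift right by one column so each cell looks at its left neighbour,
# then count how many 1s occurred strictly before the current cell
ones_before = is_one.shift(1, axis=1, fill_value=0).cumsum(axis=1)

# Keep a cell only if no 1 has occurred before it
keep = ones_before.eq(0)
result = trips.where(keep)
print(result)
```

Because removed cells become NaN, integer trip columns are shown as floats, as in the output above. Filling with `''` as in the other answer gives blank cells but turns the columns into object dtype.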
BENY answered Apr 14 '26 01:04


