
r: conditionally replace values in a subset of columns

I have a dataframe like so:

sport   contract_start contract_end visits spends purchases
basket   2013-10-01     2014-10-01   12      14      23
basket   2014-02-12     2015-03-03   23      11      7
football 2015-02-12     2016-03-03   23      11      7
basket   2016-07-17     2013-09-09   12       7      13

I would like to conditionally replace the values in columns 4:6 with NA, based on the variables "sport" and "contract_start". For instance:

i1 <- which(df$sport == "basket" & df$contract_start >= "2014-01-01")

indexes all the rows where my conditions are met. Is there a simple piece of code I can add to the above that will replace df[4:6] with NA for those rows? I would like to end up with something like this:

sport   contract_start contract_end visits spends purchases
basket   2013-10-01     2014-10-01   12      14      23
basket   2014-02-12     2015-03-03   NA      NA      NA
football 2015-02-12     2016-03-03   23      11      7
basket   2016-07-17     2013-09-09   NA      NA      NA

Thanks! A.

La Machine Infernale asked Dec 19 '22 14:12


2 Answers

You can simply select the rows and columns that you would like to replace, and assign NA to them:

df[df$sport == "basket" & df$contract_start >= "2014-01-01", 4:6] <- NA

df
#      sport contract_start contract_end visits spends purchases
# 1   basket     2013-10-01   2014-10-01     12     14        23
# 2   basket     2014-02-12   2015-03-03     NA     NA        NA
# 3 football     2015-02-12   2016-03-03     23     11         7
# 4   basket     2016-07-17   2013-09-09     NA     NA        NA
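
For a reproducible version, here is a minimal sketch that rebuilds the example data and reuses the i1 index from the question. It assumes the date columns are parsed with as.Date() (the question does not say how they are stored) and selects the columns by name rather than by position. The variable cols is a name introduced here for illustration only.

```r
# Rebuild the example data from the question.
# Assumption: the date columns are stored as Date objects.
df <- data.frame(
  sport          = c("basket", "basket", "football", "basket"),
  contract_start = as.Date(c("2013-10-01", "2014-02-12", "2015-02-12", "2016-07-17")),
  contract_end   = as.Date(c("2014-10-01", "2015-03-03", "2016-03-03", "2013-09-09")),
  visits         = c(12, 23, 23, 12),
  spends         = c(14, 11, 11, 7),
  purchases      = c(23, 7, 7, 13)
)

# Row numbers meeting both conditions; which() drops rows where the test is NA.
i1 <- which(df$sport == "basket" & df$contract_start >= as.Date("2014-01-01"))

# Columns to blank out, selected by name (hypothetical helper variable).
cols <- c("visits", "spends", "purchases")

# Assign NA to the selected rows and columns in one step.
df[i1, cols] <- NA
```

Selecting by name keeps the code working if the column order changes, whereas 4:6 depends on the columns staying in those positions.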

Psidom answered Feb 13 '23 05:02


library("data.table")
setDT(df)
df[i = sport == "basket" & contract_start >= "2014-01-01", 
   j = c("visits", "spends", "purchases") := NA]

> df
      sport contract_start contract_end visits spends purchases
1:   basket     2013-10-01   2014-10-01     12     14        23
2:   basket     2014-02-12   2015-03-03     NA     NA        NA
3: football     2015-02-12   2016-03-03     23     11         7
4:   basket     2016-07-17   2013-09-09     NA     NA        NA

A variant of the above code that stores the column names in a variable, my_cols:

my_cols <- names(df)[4:6]
df[i = sport == "basket" & contract_start >= "2014-01-01", 
   j = (my_cols) := .(NA)]
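
The same approach as a self-contained sketch, in case it helps to run it end to end. It assumes the data.table package is installed and that the date column is stored as a Date (the question does not say). The name dt is introduced here for illustration. Two points the snippets above rely on: := modifies the table by reference, so no reassignment is needed, and the parentheses around my_cols tell data.table to use the contents of the variable rather than create a column literally named my_cols.

```r
library(data.table)

# Example data from the question.
# Assumption: contract_start is stored as a Date.
dt <- data.table(
  sport          = c("basket", "basket", "football", "basket"),
  contract_start = as.Date(c("2013-10-01", "2014-02-12", "2015-02-12", "2016-07-17")),
  contract_end   = as.Date(c("2014-10-01", "2015-03-03", "2016-03-03", "2013-09-09")),
  visits         = c(12, 23, 23, 12),
  spends         = c(14, 11, 11, 7),
  purchases      = c(23, 7, 7, 13)
)

# Column names to overwrite, looked up by position as in the answer.
my_cols <- names(dt)[4:6]

# i filters the rows; j assigns NA to every column named in my_cols.
# The update happens in place, so dt itself changes.
dt[sport == "basket" & contract_start >= as.Date("2014-01-01"),
   (my_cols) := NA]
```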

Sathish answered Feb 13 '23 06:02