I am trying to achieve something similar to this question, but with multiple values that must be replaced by NA, and in a large dataset.
df <- data.frame(name = rep(letters[1:3], each = 3), foo = rep(1:9), var1 = rep(1:9), var2 = rep(3:5, each = 3))
which generates this dataframe:
df
  name foo var1 var2
1    a   1    1    3
2    a   2    2    3
3    a   3    3    3
4    b   4    4    4
5    b   5    5    4
6    b   6    6    4
7    c   7    7    5
8    c   8    8    5
9    c   9    9    5
I would like to replace all occurrences of, say, 3 and 4 by NA, but only in the columns that start with "var".
I know that I can use a combination of [] operators to achieve the result I want:
df[,grep("^var[[:alnum:]]?",colnames(df))][
  df[,grep("^var[[:alnum:]]?",colnames(df))] == 3 |
  df[,grep("^var[[:alnum:]]?",colnames(df))] == 4
] <- NA
df
  name foo var1 var2
1    a   1    1   NA
2    a   2    2   NA
3    a   3   NA   NA
4    b   4   NA   NA
5    b   5    5   NA
6    b   6    6   NA
7    c   7    7    5
8    c   8    8    5
9    c   9    9    5
Now my questions are the following:
| operator?

You can also do this using replace:
# anchor the pattern so only columns that start with "var" are selected
sel <- grepl("^var", names(df))
df[sel] <- lapply(df[sel], function(x) replace(x, x %in% 3:4, NA))
df
# name foo var1 var2
#1 a 1 1 NA
#2 a 2 2 NA
#3 a 3 NA NA
#4 b 4 NA NA
#5 b 5 5 NA
#6 b 6 6 NA
#7 c 7 7 5
#8 c 8 8 5
#9 c 9 9 5
Some quick benchmarking on a million-row sample of data suggests this is quicker than the other answers.
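For anyone who wants to check the timing on their own machine, here is a rough sketch of such a comparison using only base R. The data frame big, its size and the random values are assumptions made up for illustration, not the data used for the benchmark mentioned above, and timings will vary between runs and machines.

```r
# Hypothetical benchmark data: one million rows, same column layout as df
set.seed(1)
n <- 1e6
big <- data.frame(
  name = sample(letters[1:3], n, replace = TRUE),
  foo  = sample(1:9, n, replace = TRUE),
  var1 = sample(1:9, n, replace = TRUE),
  var2 = sample(3:5, n, replace = TRUE)
)
sel <- grepl("^var", names(big))

# replace() applied column by column
t_replace <- system.time({
  a <- big
  a[sel] <- lapply(a[sel], function(x) replace(x, x %in% 3:4, NA))
})

# matrix-based alternative
t_matrix <- system.time({
  b <- big
  m <- as.matrix(b[sel])
  m[m %in% c(3, 4)] <- NA
  b[sel] <- m
})

# elapsed seconds for each approach
c(replace = t_replace[["elapsed"]], matrix = t_matrix[["elapsed"]])

# both approaches should blank out the same cells
stopifnot(identical(is.na(a$var1), is.na(b$var1)),
          identical(is.na(a$var2), is.na(b$var2)))
```

A single system.time() call is noisy, so repeating each measurement a few times gives a fairer picture.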
You could also do:
col_idx <- grep("^var", names(df))
values <- c(3, 4)
m1 <- as.matrix(df[,col_idx])
m1[m1 %in% values] <- NA
df[col_idx] <- m1
df
# name foo var1 var2
#1 a 1 1 NA
#2 a 2 2 NA
#3 a 3 NA NA
#4 b 4 NA NA
#5 b 5 5 NA
#6 b 6 6 NA
#7 c 7 7 5
#8 c 8 8 5
#9 c 9 9 5
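One thing to keep in mind with the matrix route: as.matrix() needs a single type for the whole matrix, so it is safest when all selected columns share a type, as they do in the example. The sketch below uses a made-up data frame df2 (an assumption for illustration only) to show what happens with mixed column types, and how the column-wise replace() form keeps each column's own type.

```r
# Hypothetical data: one numeric and one character "var" column
df2 <- data.frame(var1 = c(1, 3, 4), var2 = c("x", "3", "y"),
                  stringsAsFactors = FALSE)

# as.matrix() coerces everything to a common type, here character
m <- as.matrix(df2)
is.character(m)

# the column-wise alternative leaves each column's type alone
df2[] <- lapply(df2, function(x) replace(x, x %in% c(3, 4), NA))
sapply(df2, class)
```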
Here's an approach:
# the values that should be replaced by NA
values <- c(3, 4)
# index of columns
col_idx <- grep("^var", names(df))
# [1] 3 4
# index of values (within these columns)
val_idx <- sapply(df[col_idx], "%in%", table = values)
# var1 var2
# [1,] FALSE TRUE
# [2,] FALSE TRUE
# [3,] TRUE TRUE
# [4,] TRUE TRUE
# [5,] FALSE TRUE
# [6,] FALSE TRUE
# [7,] FALSE FALSE
# [8,] FALSE FALSE
# [9,] FALSE FALSE
# replace with NA
is.na(df[col_idx]) <- val_idx
df
# name foo var1 var2
# 1 a 1 1 NA
# 2 a 2 2 NA
# 3 a 3 NA NA
# 4 b 4 NA NA
# 5 b 5 5 NA
# 6 b 6 6 NA
# 7 c 7 7 5
# 8 c 8 8 5
# 9 c 9 9 5
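If this needs to be done in several places, the steps can be wrapped in a small function. This is only a sketch: the name replace_with_na and its arguments are made up here and do not come from any package or from the answers above.

```r
# Hypothetical helper: set `values` to NA in every column of `data`
# whose name matches the regular expression `pattern`.
replace_with_na <- function(data, pattern, values) {
  sel <- grepl(pattern, names(data))
  if (!any(sel)) return(data)  # nothing to do if no column matches
  data[sel] <- lapply(data[sel], function(x) replace(x, x %in% values, NA))
  data
}

df <- data.frame(name = rep(letters[1:3], each = 3), foo = 1:9,
                 var1 = 1:9, var2 = rep(3:5, each = 3))
out <- replace_with_na(df, "^var", c(3, 4))
out
```

Because the function returns a modified copy, the original df is left unchanged unless the result is assigned back to it.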