I've come across this post on how to replace occurrences of a number in all columns of a data frame (e.g. replace all 4 by 10 in all columns): DF[DF == 4] <- 10
. With data tables the same results can be achieved in exactly the same way: DT[DT == 4] <- 10
.
However, how should I procede if I want to apply this modification but only to specific columns from the data table, whether these columns are specified by position (e.g. 2:4
) or by name (e.g. c("V2", "V3", "V4")
)?
I will favor an "elegant" solution rather than iterations over every column.
Use str_replace_all() method of stringr package to replace multiple string values with another list of strings on a single column in R and update part of a string with another string.
We can use set
which would be more efficient
for(j in 2:4) {
set(DT, i = which(DT[[j]]==4), j=j, value = 10)
}
DT
# V1 V2 V3 V4
#1: A 2 2 10
#2: B 1 10 10
#3: C 3 10 3
#4: D 3 2 10
#5: E 3 3 3
#6: F 10 3 3
The above can be done with column names as well
for(j in names(DT)[2:4]){
set(DT, i = which(DT[[j]]==4), j=j, value = 10)
}
Or another option is to specify the .SDcols
with the columns of interest (either the numeric index or the column names), loop through the Subset of Data.table (.SD
), replace
the values that are 4 to 10 and assign (:=
) the output back to columns of interest
DT[, (2:4) := lapply(.SD, function(x) replace(x, x==4, 10)), .SDcols = 2:4]
Or with column names
DT[, (names(DT)[2:4]) := lapply(.SD, function(x) replace(x, x==4, 10)),
.SDcols = names(DT)[2:4]]
set.seed(24)
DT <- data.table(V1 = LETTERS[1:6], V2 = sample(1:4, 6, replace = TRUE),
V3 = sample(2:4, 6, replace = TRUE), V4 = sample(3:4, 6, replace= TRUE))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With