I want to change the default value (which is 255) to NA in selected columns of a data.table.
library(data.table)
dt <- data.table(x = c(1,5,255,0,NA), y = c(1,7,255,0,0), z = c(4,2,7,8,255))
coords <- c('x', 'y')
Which gives the following table:
     x   y   z
1:   1   1   4
2:   5   7   2
3: 255 255   7
4:   0   0   8
5:  NA   0 255
The furthest I have got is this:
dt[.SD == 255, (.SD) := NA, .SDcols = coords]
Please note that column z must stay the same: only the columns specified in coords should be modified, not all columns.
But my attempt doesn't give me the desired result:
    x  y   z
1:  1  1   4
2:  5  7   2
3: NA NA   7
4:  0  0   8
5: NA  0 255
I am looking for a solution that scales, because the original dataset has a couple of million rows.
EDIT:
I have found a solution, but it is quite ugly and definitely too slow: it takes almost 10 seconds to get through a data frame of 22009 x 86. Does anyone have a better solution?
The code:
dt[, replace(.SD, .SD == 255, NA), .SDcols = coords, by = c(colnames(dt)[!colnames(dt) %in% coords])]
Here is how you can keep the columns outside .SDcols:
library(data.table)
dt[, (coords) := replace(.SD, .SD == 255, NA), .SDcols = coords]
which gives,
    x  y   z
1:  1  1   4
2:  5  7   2
3: NA NA   7
4:  0  0   8
5: NA  0 255
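To check that only the columns in coords were touched, something along these lines could be run on a fresh copy of the example table. The names dt2 and expected are made up for illustration, and expected is simply typed in by hand from the desired result in the question:

```r
library(data.table)

# Rebuild the example table from the question
dt2 <- data.table(x = c(1, 5, 255, 0, NA), y = c(1, 7, 255, 0, 0), z = c(4, 2, 7, 8, 255))
coords <- c('x', 'y')

# Replace 255 with NA by reference, only in the columns listed in coords
dt2[, (coords) := replace(.SD, .SD == 255, NA), .SDcols = coords]

# Hand-typed expected result (illustrative name, not from the original post)
expected <- data.table(x = c(1, 5, NA, 0, NA), y = c(1, 7, NA, 0, 0), z = c(4, 2, 7, 8, 255))

# x and y had their 255 replaced; z still contains its 255
stopifnot(isTRUE(all.equal(dt2, expected)))
```

Because := assigns by reference, there is no need for by = or for reassembling the remaining columns afterwards.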
You could also do:
require(data.table)
dt[ ,
(coords) := lapply(.SD, function(x) fifelse(x == 255, NA_real_, x)),
.SDcols = coords ]
Having compared it to Sotos' answer, it also seems a little bit faster.
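If you want to compare the two approaches yourself, a rough sketch could look like the one below. The synthetic table, its size, and the names big, a, b, t_replace and t_fifelse are assumptions for illustration only; timings will vary with the machine and with the data.table version, so no figures are given here.

```r
library(data.table)

# Synthetic data; size and values are made up for illustration
set.seed(1)
n <- 1e5
big <- data.table(x = sample(c(0, 1, 255), n, replace = TRUE),
                  y = sample(c(0, 1, 255), n, replace = TRUE),
                  z = sample(c(0, 1, 255), n, replace = TRUE))
coords <- c('x', 'y')

# copy() so that each approach starts from the same untouched data
a <- copy(big)
b <- copy(big)

# Approach 1: replace() on the whole .SD
t_replace <- system.time(
  a[, (coords) := replace(.SD, .SD == 255, NA), .SDcols = coords]
)

# Approach 2: fifelse() column by column
t_fifelse <- system.time(
  b[, (coords) := lapply(.SD, function(x) fifelse(x == 255, NA_real_, x)), .SDcols = coords]
)

print(t_replace)
print(t_fifelse)

# Both approaches should produce the same table
stopifnot(isTRUE(all.equal(a, b)))
```

Note that NA_real_ matches the double columns used here; for integer columns fifelse() would need NA_integer_ instead, since it requires both branches to have the same type.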