I want to replace 0
by NA
in each column of a data.table
.
library(data.table)
dt1 <- data.table(V1=0:2, V2=2:0)
dt1
V1 V2
1: 0 2
2: 1 1
3: 2 0
dt1==0
V1 V2
[1,] TRUE FALSE
[2,] FALSE FALSE
[3,] FALSE TRUE
I tried this one
dt1[dt1==0]
Error in `[.data.table`(dt1, dt1 == 0) :
i is invalid type (matrix). Perhaps in future a 2 column matrix could return a list of elements of DT (in the spirit of A[B] in FAQ 2.14). Please let datatable-help know if you'd like this, or add your comments to FR #1611.
And also tried this one
dt1[dt1==0, .SD :=NA]
Edited
Partial sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: i686-pc-linux-gnu (32-bit)
Running under: Ubuntu 14.04.2 LTS
data.table_1.9.4
Using R replace() function to update 0 with NA R has a built-in function called replace() that replaces values in a vector with another value, for example, zeros with NAs.
To replace a column value in R use square bracket notation df[] , By using this you can update values on a single column or on all columns. To refer to a single column use df$column_name .
data.table is an R package that provides an enhanced version of data.frame s, which are the standard data structure for storing data in base R. In the Data section above, we already created a data.table using fread() . We can also create one using the data.table() function.
You can try set
for multiple columns. It will be faster as the overhead of .[data.table
is avoided
for(j in seq_along(dt1)){
set(dt1, i=which(dt1[[j]]==0), j=j, value=NA)
}
dt1
# V1 V2
#1: NA 2
#2: 1 1
#3: 2 NA
Or another option would be looping with lapply
and then change the 0
values to NA with replace
dt1[, lapply(.SD, function(x) replace(x, which(x==0), NA))]
Or we can make use of some arthithmetic operations to convert the 0 value to NA.
dt1[, lapply(.SD, function(x) (NA^!x) *x)]
The way (NA^!x)*x
this works is by converting the !x
i.e. a logical TRUE/FALSE vector for each column (where TRUE corresponds to 0 value) to NA
and 1 by doing NA^!x
. We multiply with the x value to replace the 1 with the x value corresponding to it while the NA will remain as such.
Or a syntax similar to base R
would be
is.na(dt1) <- dt1==0
But this method may not be that efficient for large data.table as dt1==0
would be a logical matrix and also as @Roland mentioned in the comments that the dataset would be copied. I would either use the lapply
based or the more efficient set
for larger datasets.
dt1[dt1==0] <- NA
worked for me.
dt1[dt1==0] <- NA
dt1
## V1 V2
##1: NA 2
##2: 1 1
##3: 2 NA
As noted by Roland, this does make a copy of the data.table
object, and will be slower.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With