I have a matrix (2601 by 58) of particulate matter concentration estimates from an air quality model. Because real-life air quality monitors cannot measure below 0.1 ug/L, I need to replace all values in my matrix that are <0.1 with a zero/NA/null value.
Someone suggested ifelse(test, true, false) with a logical statement, but when I try this it deletes everything.
X[X < .1] <- 0
(or NA, although 0 sounds more appropriate in this case.)
Matrices are just vectors with dimensions, so you can index and assign to them as you would a vector. Here, X < .1 creates a logical matrix over X that marks the small values, and the assignment writes the right-hand side into each element where that matrix is TRUE.
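As a small illustration of the logical-index assignment described above, here is a sketch on simulated data. The contents of X are made up (uniform random numbers), the dimensions and the 0.1 cutoff are taken from the question, and the names small and n_small are only for illustration.

```r
# Simulated stand-in for the model output; the values are made up
set.seed(1)
X <- matrix(runif(2601 * 58), nrow = 2601, ncol = 58)

small <- X < 0.1        # logical matrix with the same shape as X
n_small <- sum(small)   # number of values below the cutoff

X[small] <- 0           # overwrite only the flagged elements

dim(X)                  # the matrix keeps its 2601 by 58 shape
```

Using NA on the right-hand side instead of 0 works the same way.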
ifelse should work:
mat <- matrix(runif(100),ncol=5)
mat <- ifelse(mat<0.1,NA,mat)
But I would choose Harlan's answer over mine.
mat[mat < 0.1] <- NA
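To see that the two approaches agree, here is a minimal sketch on random data. The names a and b are only for illustration, and the seed is arbitrary.

```r
set.seed(42)
mat <- matrix(runif(100), ncol = 5)

# ifelse builds a new matrix from the test, the replacement, and the original
a <- ifelse(mat < 0.1, NA, mat)

# logical indexing modifies a copy directly
b <- mat
b[mat < 0.1] <- NA

all.equal(a, b)   # compares the two results
sum(is.na(b))     # how many entries were replaced
```

Both keep the matrix shape; the indexing form is shorter and avoids building intermediate vectors.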
Note that ifelse is vectorized, but it does more work than a direct logical-index assignment: it evaluates the test, usually evaluates both the yes and no arguments in full, and then assembles a new result. On large inputs it is therefore typically noticeably slower than X[X < 0.1] <- 0. R favors vectorized operations; apply, mapply, and sapply are convenient, but they are loops under the hood rather than true vectorization.
For small datasets none of this matters, but with an array of length 100k or more, an explicit element-by-element loop is the method most likely to keep you waiting.
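A rough way to check the speed difference on your own machine is to time the three approaches with system.time. This is only a sketch: the vector length of 1e6 is an arbitrary choice, and the timings will vary with hardware and R version, so no expected numbers are shown.

```r
set.seed(1)
x <- runif(1e6)

# direct logical-index assignment
t_index <- system.time({
  a <- x
  a[a < 0.1] <- 0
})

# ifelse
t_ifelse <- system.time({
  b <- ifelse(x < 0.1, 0, x)
})

# explicit element-by-element loop
t_loop <- system.time({
  cc <- x
  for (i in seq_along(cc)) {
    if (cc[i] < 0.1) cc[i] <- 0
  }
})

# elapsed seconds for each approach
c(index  = t_index[["elapsed"]],
  ifelse = t_ifelse[["elapsed"]],
  loop   = t_loop[["elapsed"]])
```

All three produce the same values, so the choice comes down to speed and readability.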
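The code below should work.
For a vector:
minvalue <- 0
X[X < minvalue] <- minvalue   # raise every value below minvalue up to minvalue
For a data frame or matrix, restricted to selected columns:
minvalue <- 0
n <- 10           # number of leading columns to process; change as needed
columns <- 1:n
# select the block of columns first, then assign into the elements below minvalue
X[, columns][X[, columns] < minvalue] <- minvalue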
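One way to restrict the replacement to a subset of columns is to index the block of columns first and then assign into the flagged elements of that block. The sketch below uses made-up data, and the names before and df are only for illustration. The same line works for a matrix and for a data frame.

```r
set.seed(7)
X <- matrix(rnorm(50), nrow = 5, ncol = 10)
before <- X                      # keep a copy for comparison

minvalue <- 0
columns <- 1:3                   # only touch the first three columns

# matrix: select the block, then overwrite its elements below minvalue
X[, columns][X[, columns] < minvalue] <- minvalue

# data frame: the same expression applies
df <- as.data.frame(before)
df[, columns][df[, columns] < minvalue] <- minvalue

range(X[, columns])              # nothing below minvalue remains here
```

Columns outside the selection are left unchanged.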
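Another fast method uses the pmax and pmin functions. The function below caps entries between 0 and 1. Passing v as the first argument lets a matrix keep its dimensions, because pmax and pmin copy attributes such as dim from their first argument; for a data frame, converting with as.matrix() first is the safer route.
ulbound <- function(v, MAX = 1, MIN = 0) pmin(pmax(v, MIN), MAX)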
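A short usage sketch of the clamping idea follows. This variant passes v first to pmax and pmin, an adjustment made here so that a matrix input keeps its dim attribute. The matrix m and the name capped are made up for illustration.

```r
# clamp every entry of v into the interval [MIN, MAX]
ulbound <- function(v, MAX = 1, MIN = 0) pmin(pmax(v, MIN), MAX)

m <- matrix(c(-0.5, 0.2, 0.7, 1.4, 0.05, 2), nrow = 2)

capped <- ulbound(m)   # values below 0 become 0, values above 1 become 1
dim(capped)            # still 2 by 3
```

Note that clamping raises small values to MIN instead of replacing them with 0 or NA, so for the detection-limit case in the question the logical-index assignment is the closer match.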