I have the following code running and it's taking me a long time to run. How do I know if it's still doing its job or it got stuck somewhere.
noise4<-NULL;
for(i in 1:length(noise3))
{
if(is.na(noise3[i])==TRUE)
{
next;
}
else
{
noise4<-c(noise4,noise3[i]);
}
}
noise3 is a vector with 2418233 data points.
You just want to remove the NA values. Do it like this:
noise4 <- noise3[!is.na(noise3)]
This will be pretty much instant.
Or as Joshua suggests, a more readable alternative:
noise4 <- na.omit(noise3)
Your code was slow because:
The memory reallocation is probably the biggest handicap to your code.
I wanted to illustrate the benefits of pre-allocation, so I tried to run your code... but I killed it after ~5 minutes. I recommend you use noise4 <- na.omit(noise3)
as I said in my comments. This code is solely for illustrative purposes.
# Create some random data
set.seed(21)
noise3 <- rnorm(2418233)
noise3[sample(2418233, 100)] <- NA
noise <- function(noise3) {
# Pre-allocate
noise4 <- vector("numeric", sum(!is.na(noise3)))
for(i in seq_along(noise3)) {
if(is.na(noise3[i])) {
next
} else {
noise4[i] <- noise3[i]
}
}
}
system.time(noise(noise3)) # MUCH less than 5+ minutes
# user system elapsed
# 9.50 0.44 9.94
# Let's see what we gain from compiling
library(compiler)
cnoise <- cmpfun(noise)
system.time(cnoise(noise3)) # a decent reduction
# user system elapsed
# 3.46 0.49 3.96
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With