We want to set all negative values in an array to zero.
I tried a lot of things but have not found a working solution yet. I thought about a for loop with a condition, but that does not seem to work.
#pred_precipitation is our array
pred_precipitation <-rnorm(25,2,4)
for (i in nrow(pred_precipitation))
{
if (pred_precipitation[i]<0) {pred_precipitation[i] = 0}
else{pred_precipitation[i] = pred_precipitation[i]}
}
To convert negative values in a matrix to 0, we can use the pmax function. For example, if we have a matrix called M that contains negative, zero and positive values, the negative values in M can be converted to 0 with the command pmax(M, 0).
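As a minimal sketch of the pmax approach on a matrix: the matrix M below and its values are made up for illustration. pmax() copies attributes such as dim from its first argument, so the result should still be a matrix.

```r
# Hypothetical 3 x 3 matrix with negative, zero and positive entries
M <- matrix(c(-2,  0,  3,
               5, -1,  0,
              -4,  7, -6), nrow = 3, byrow = TRUE)

# pmax() compares each element with 0 and keeps the larger of the two;
# the dim attribute of the first argument is carried over to the result
M_clamped <- pmax(M, 0)
print(M_clamped)
```

Putting M first matters here: the attributes are taken from the first argument, so pmax(0, M) is not guaranteed to keep the matrix shape.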
Thanks for the reproducible example. This is pretty basic R stuff. You can assign to selected elements of a vector (note an array has dimensions, and what you've given is a vector not an array):
> pred_precipitation[pred_precipitation<0] <- 0
> pred_precipitation
[1] 1.2091281 0.0000000 7.7665555 0.0000000 0.0000000 0.0000000 0.5151504 0.0000000 1.8281251
[10] 0.5098688 2.8370263 0.4895606 1.5152191 4.1740177 7.1527742 2.8992215 4.5322934 6.7180530
[19] 0.0000000 1.1914052 3.6152333 0.0000000 0.3778717 0.0000000 1.4940469
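For what it's worth, the loop in the question probably fails because nrow() returns NULL for a plain vector, so the loop body never runs (and even on a matrix, for (i in nrow(x)) would visit only the single value nrow(x)). A small sketch, assuming an arbitrary seed, of a loop that does iterate, compared with the logical-index assignment:

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
pred_precipitation <- rnorm(25, 2, 4)

# nrow() is NULL for a vector without dimensions, so the original loop is a no-op
no_rows <- is.null(nrow(pred_precipitation))

# Loop version: seq_along() yields 1, 2, ..., length(x)
looped <- pred_precipitation
for (i in seq_along(looped)) {
  if (looped[i] < 0) looped[i] <- 0
}

# Vectorised version: assign 0 to every element selected by the logical index
indexed <- pred_precipitation
indexed[indexed < 0] <- 0

same_result <- identical(looped, indexed)
```

The loop is shown only to explain the failure; the one-line indexed assignment is the idiomatic form.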
Benchmark wars!
@James has found an even faster method and left it in a comment. I upvoted him, if only because I know his victory will be short-lived.
First, I try compiling, but that doesn't seem to help anyone:
p <- rnorm(10000)
gsk3 <- function(x) { x[x<0] <- 0; x }
jmsigner <- function(x) ifelse(x<0, 0, x)
joshua <- function(x) pmax(x,0)
james <- function(x) (abs(x)+x)/2
library(compiler)
gsk3.c <- cmpfun(gsk3)
jmsigner.c <- cmpfun(jmsigner)
joshua.c <- cmpfun(joshua)
james.c <- cmpfun(james)
library(microbenchmark)
microbenchmark(joshua(p),joshua.c(p),gsk3(p),gsk3.c(p),jmsigner(p),james(p),jmsigner.c(p),james.c(p))
expr min lq median uq max
1 gsk3.c(p) 251.782 255.0515 266.8685 269.5205 457.998
2 gsk3(p) 256.262 261.6105 270.7340 281.3560 2940.486
3 james.c(p) 38.418 41.3770 43.3020 45.6160 132.342
4 james(p) 38.934 42.1965 43.5700 47.2085 4524.303
5 jmsigner.c(p) 2047.739 2145.9915 2198.6170 2291.8475 4879.418
6 jmsigner(p) 2047.502 2169.9555 2258.6225 2405.0730 5064.334
7 joshua.c(p) 237.008 244.3570 251.7375 265.2545 376.684
8 joshua(p) 237.545 244.8635 255.1690 271.9910 430.566
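Timings only mean something if the candidates compute the same thing, so here is a quick sanity check. This is a sketch with hypothetical function names, not part of the benchmark above. The abs trick works because for negative x, abs(x) + x is 0, while for non-negative x it is 2x, which halves back to x.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
x <- rnorm(1000)

# Hypothetical names for the four approaches being compared
clamp_subset <- function(x) { x[x < 0] <- 0; x }
clamp_ifelse <- function(x) ifelse(x < 0, 0, x)
clamp_pmax   <- function(x) pmax(x, 0)
clamp_abs    <- function(x) (abs(x) + x) / 2  # x < 0: (-x + x)/2 = 0; else (x + x)/2 = x

# Use the subset-replacement result as the reference and compare the others to it
reference <- clamp_subset(x)
results_agree <- c(
  ifelse = isTRUE(all.equal(clamp_ifelse(x), reference)),
  pmax   = isTRUE(all.equal(clamp_pmax(x), reference)),
  abs    = isTRUE(all.equal(clamp_abs(x), reference))
)
results_agree
```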
But wait! Dirk wrote this Rcpp thing. Can a complete C++ incompetent read his JSS paper, adapt his example, and write the fastest function of them all? Stay tuned, dear listeners.
library(inline)
cpp_if_src <- '
Rcpp::NumericVector xa(a);
int n_xa = xa.size();
for(int i=0; i < n_xa; i++) {
if(xa[i]<0) xa[i] = 0;
}
return xa;
'
cpp_if <- cxxfunction(signature(a="numeric"), cpp_if_src, plugin="Rcpp")
microbenchmark(joshua(p),joshua.c(p),gsk3(p),gsk3.c(p),jmsigner(p),james(p),jmsigner.c(p),james.c(p), cpp_if(p))
expr min lq median uq max
1 cpp_if(p) 8.233 10.4865 11.6000 12.4090 69.512
2 gsk3(p) 170.572 172.7975 175.0515 182.4035 2515.870
3 james(p) 37.074 39.6955 40.5720 42.1965 2396.758
4 jmsigner(p) 1110.313 1118.9445 1133.4725 1164.2305 65942.680
5 joshua(p) 237.135 240.1655 243.3990 250.3660 2597.429
That's affirmative, captain.
This modifies the input p even if you don't assign to it. If you want to avoid that behavior, you have to clone:
cpp_ifclone_src <- '
Rcpp::NumericVector xa(Rcpp::clone(a));
int n_xa = xa.size();
for(int i=0; i < n_xa; i++) {
if(xa[i]<0) xa[i] = 0;
}
return xa;
'
cpp_ifclone <- cxxfunction(signature(a="numeric"), cpp_ifclone_src, plugin="Rcpp")
Which unfortunately kills the speed advantage.
I would use pmax because ifelse can be a bit slow at times and subset-replacement creates an additional vector (which can be an issue with large data sets).
set.seed(21)
pred_precipitation <- rnorm(25,2,4)
p <- pmax(pred_precipitation,0)
Subset-replacement is by far the fastest, though:
library(rbenchmark)
gsk3 <- function(x) { x[x<0] <- 0; x }
jmsigner <- function(x) ifelse(x<0, 0, x)
joshua <- function(x) pmax(x,0)
benchmark(joshua(p), gsk3(p), jmsigner(p), replications=10000, order="relative")
test replications elapsed relative user.self sys.self
2 gsk3(p) 10000 0.215 1.000000 0.216 0.000
1 joshua(p) 10000 0.444 2.065116 0.416 0.016
3 jmsigner(p) 10000 0.656 3.051163 0.652 0.000
If your main object is a tibble or data frame, you can also use the tidyverse. Compared with the replacement proposed by Ari B. Friedman, this replacement can be written "on the fly" and combined with other mutations.
An example using dplyr and the %>% pipe would look like this:
df %>% mutate(varA = if_else(varA < 0, 0, varA))
You can add further mutations (i.e., new variables) within the mutate() statement. An advantage that I see in this type of coding is that you do not run the risk of skipping or re-executing an individual transformation step, since they are all grouped in one statement.
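A minimal sketch of grouping several transformations in one mutate() call. The data frame and the column names varB and total are hypothetical, chosen only for illustration:

```r
library(dplyr)

# Hypothetical data; varA, varB and total are illustrative column names
df <- tibble(varA = c(-1.5, 0, 2.3), varB = c(4, -2, 1))

result <- df %>%
  mutate(
    varA  = if_else(varA < 0, 0, varA),  # clamp negatives to zero
    varB  = pmax(varB, 0),               # same effect via pmax()
    total = varA + varB                  # later expressions see the clamped columns
  )

result
```

Because mutate() evaluates its arguments in order, total is computed from the already-clamped varA and varB.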
For example, by adding %>% View() in RStudio you can preview the result right away. However, the result is not yet stored anywhere ("on the fly"). This way you keep your namespace / environment clean while changing the code.
Alternatively, you can also use ifelse:
ifelse(pred_precipitation < 0, 0, pred_precipitation)