This question can be considered related to this one, that helped me to improve the R performances in computing the mean on a big array. Unfortunately, in this case I'm trying to apply something more complex (like a quantile calculation).
I have a 4-D array with more than 40 millions of elements and I want to calculate the 66th percentile on a specific dimension. Here there is the MATLAB code:
> n = randn(100, 50, 100, 20);
> tic; q = quantile(n, 0.66, 4); toc
Elapsed time is 0.440824 seconds.
Let's do something similar in R.
> n = array(rnorm(100*50*100*20), dim = c(100,50,100,20))
> start = Sys.time(); q = apply(n, 1:3, quantile, .66); print(Sys.time() - start)
Time difference of 1.600693 mins
I was aware of the better performances of MATLAB wrt R but in this case I don't know what to do. Probably I just need to wait 2 minutes instead of one second... I hope someone can suggest me any way to improve running times, anyway, thank you in advance...
UPDATE I've applied some of the suggestions into the comments and I've reduced the running time:
> start = Sys.time(); q = apply(n, 1:3, quantile, .66, names = FALSE); print(Sys.time() - start)
Time difference of 33.42773 secs
We're still far from the MATLAB performances but at least I've learnt something.
UPDATE I put here some advancements related to `quantile' function discussed here. The running time of same code I've shown above has passed from 33 to 5 seconds...
The RcppOctave package calls the GNU Octave API functions; if you don't already know about GNU Octave, it is very similar to Matlab and aims for complete compatiility.
This is nearly as fast as Matlab direct...
library(RcppOctave)
set.seed(1)
n = array(rnorm(100*50*100*20), dim = c(100,50,100,20))
system.time( s <- octave_style_quantile(n, .66, 4) )
## user system elapsed
## 0.526 0.048 0.574
# *R* `quantile` argument `type=5` is the method that matches matlab.
system.time( r <- apply(n, 1:3, quantile, .66, names = FALSE, type=5) )
## user system elapsed
## 23.308 0.029 23.327
dim(r)
## [1] 100 50 100
identical(r,s)
## [1] TRUE
A fairly straight forward translation of Octave's quantile.m to R.
octave_style_quantile <- function (x, p=NULL, dim=NULL) {
if ( is.null(p) ) p <- c(0.00, 0.25, 0.50, 0.75, 1.00)
if ( is.null(dim) ) {
## Find the first non-singleton dimension.
dim <- which(dim(x) > 1)[1];
}
stopifnot( is.numeric(x)||is.logical(x),
is.numeric(p),
dim <= length(dim(x)) )
## Set the permutation vector.
perm <- seq_along(dim(x))
perm[1] <- dim
perm[dim] <- 1
## Permute dim to the 1st index.
x <- aperm(x, perm);
## Save the size of the permuted x N-d array.
sx = dim(x);
## Reshape to a 2-d array.
dim(x) <- c( sx[1], prod(sx[-1]) );
## Calculate the quantiles.
q = .CallOctave("quantile",x,p)
## Return the shape to the original N-d array.
dim(q) <- c( length(p), sx[-1] )
## Permute the 1st index back to dim.
q = aperm(q, perm);
if( any(dim(q)==1) ) dim(q) <- dim(q)[-which(dim(q)==1)]
q
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With