I have an array whose first dimension I need to subset/ index/ reorder. For example:
arr <- array(1:24, dim=c(4,3,2))
arr[4:1,,]
Simple, works like a charm.
However, is there a way to do this when I'm unsure how many dimensions the array has? To be clear, I will always know the size of that first dimension (i.e., I know dim(arr)[1]
), I just don't know length(dim(arr))
.
Here's a weird alternative. This idea is based on an implementation quirk I noticed at one point, that R seems to represent "missing" function arguments as symbols with zero-length names. One of the reasons that this is so bizarre is that R normally doesn't allow you to create symbols with zero-length names:
as.symbol('');
## Error in as.symbol("") : attempt to use zero-length variable name
But through some messing around, I discovered that you can slip past R's defenses by accessing the parse tree of an expression that involves a "missing" argument, and indexing out the element of the parse tree that contains the "missing" argument. Here's a demo of some of the weird behavior you get from this thing:
substitute(x[]); ## parse tree involving missing argument
## x[]
as.list(substitute(x[])); ## show list representation; third component is the guy
## [[1]]
## `[`
##
## [[2]]
## x
##
## [[3]]
##
##
substitute(x[])[[3]]; ## prints nothing!
##
(function(x) c(typeof(x),mode(x),class(x)))(substitute(x[])[[3]]); ## it's a symbol alright
## [1] "symbol" "name" "name"
as.character(substitute(x[])[[3]]); ## gets the name of the symbol: the empty string!
## [1] ""
i.dont.exist <- substitute(x[])[[3]]; ## store in variable
i.dont.exist; ## wha??
## Error: argument "i.dont.exist" is missing, with no default
Anyway, here's the solution we can derive for the OP's problem:
arr <- array(1:24,4:2);
do.call(`[`,c(list(arr,4:1),rep(list(substitute(x[])[[3]]),length(dim(arr))-1)));
## , , 1
##
## [,1] [,2] [,3]
## [1,] 4 8 12
## [2,] 3 7 11
## [3,] 2 6 10
## [4,] 1 5 9
##
## , , 2
##
## [,1] [,2] [,3]
## [1,] 16 20 24
## [2,] 15 19 23
## [3,] 14 18 22
## [4,] 13 17 21
##
I was hoping it would outperform all the other solutions, but @thelatemail, you win this round: Aha! I realized that we can precompute a list of an empty symbol (storing an empty symbol in a variable by itself, i.e. not in a list, is not usable, as I showed above) and rep()
that list in the solution, rather than incurring all the overhead of substitute()
to parse out a dummy expression in every invocation of the solution. And behold the performance:
straight <- function() arr[4:1,,];
jb <- function() do.call(`[`,c(list(arr,4:1),lapply(dim(arr)[-1],seq_len)));
tlm <- function() do.call(`[`,c(list(arr,4:1),rep(TRUE,length(dim(arr))-1)));
orderD1 <- function(x,ord) { dims <- dim(x); ndim <- length(dims); stopifnot(ndim>0); if (ndim==1) return(x[ord]); wl_i <- which(letters=="i"); dimLetters <- letters[wl_i:(wl_i+ndim-1)]; dimList <- structure(vector("list",ndim),.Names=dimLetters); dimList[[1]] <- ord; for (i in 2:ndim) dimList[[i]] <- 1:dims[i]; do.call("[",c(list(x=x),dimList)); };
rbatt <- function() orderD1(arr,4:1);
bgoldst <- function() do.call(`[`,c(list(arr,4:1),rep(list(substitute(x[])[[3]]),length(dim(arr))-1)));
ls0 <- list(substitute(x[])[[3]]);
ls0;
## [[1]]
##
##
bgoldst2 <- function() do.call(`[`,c(list(arr,4:1),rep(ls0,length(dim(arr))-1)));
microbenchmark(straight(),jb(),tlm(),rbatt(),bgoldst(),bgoldst2(),times=1e5);
## Unit: nanoseconds
## expr min lq mean median uq max neval
## straight() 428 856 1161.038 856 1284 998142 1e+05
## jb() 4277 5988 7136.534 6843 7271 1629357 1e+05
## tlm() 2566 3850 4622.668 4277 4705 1704196 1e+05
## rbatt() 24804 28226 31975.583 29509 31219 34970873 1e+05
## bgoldst() 3421 4705 5601.300 5132 5560 1918878 1e+05
## bgoldst2() 2566 3850 4533.383 4277 4705 1034065 1e+05
Just discovered that there's an easier way to get ahold of the empty symbol, which seems to have been available all along:
substitute();
##
My substitute(x[])[[3]]
trick now looks kind of stupid.
Out of curiosity I benchmarked using substitute()
directly against the other solutions, and it incurs a slight performance cost compared to bgoldst2()
, making it slightly worse than tlm()
:
bgoldst3 <- function() do.call(`[`,c(list(arr,4:1),rep(list(substitute()),length(dim(arr))-1)));
microbenchmark(straight(),jb(),tlm(),rbatt(),bgoldst(),bgoldst2(),bgoldst3(),times=1e5);
## Unit: nanoseconds
## expr min lq mean median uq max neval
## straight() 428 856 1069.340 856 1284 850603 1e+05
## jb() 4277 5988 6916.899 6416 7270 2978180 1e+05
## tlm() 2566 3849 4307.979 4277 4704 3138122 1e+05
## rbatt() 24377 28226 30882.666 29508 30364 36768360 1e+05
## bgoldst() 2994 4704 5165.019 5132 5560 2050171 1e+05
## bgoldst2() 2566 3849 4232.816 4277 4278 1085813 1e+05
## bgoldst3() 2566 3850 4545.508 4277 4705 1004131 1e+05
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With