Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

fastest way to get Min from every column in a matrix?

What is the fastest way to extract the min from each column in a matrix?


EDIT:

Moved all the benchmarks to the answer below.

Using a Tall, Short or Wide Matrix:

  ##  TEST DATA
  set.seed(1)
  matrix.inputs <- list(
        "Square Matrix"     = matrix(sample(seq(1e6), 4^2*1e4, T), ncol=400),   #  400 x  400
        "Tall Matrix"       = matrix(sample(seq(1e6), 4^2*1e4, T), nrow=4000),  # 4000 x   40
        "Wide-short Matrix" = matrix(sample(seq(1e6), 4^2*1e4, T), ncol=4000),  #   40 x 4000
        "Wide-tall Matrix"  = matrix(sample(seq(1e6), 4^2*1e5, T), ncol=4000),   #  400 x 4000
        "Tiny Sq Matrix"    = matrix(sample(seq(1e6), 4^2*1e2, T), ncol=40)     #   40 x   40
  )
like image 737
Ricardo Saporta Avatar asked Dec 03 '12 03:12

Ricardo Saporta


4 Answers

mat[(1:ncol(mat)-1)*nrow(mat)+max.col(t(-mat))] seems pretty fast, and it's base R.

like image 155
Museful Avatar answered Oct 26 '22 14:10

Museful


The sos package is great for answering these sorts of questions.

library("sos")
findFn("colMins")
library("matrixStats")
?colMins

http://finzi.psych.upenn.edu/R/library/matrixStats/html/rowRanges.html

Oddly enough, for the one example I tried colMins was slower. Perhaps someone can point out what's funny about my example?

set.seed(101); z <- matrix(runif(1e6),nrow=1000)
library(rbenchmark)
benchmark(colMins(z),apply(z,2,min))
##               test replications elapsed relative user.self sys.self
## 2 apply(z, 2, min)          100  14.290     1.00     7.216    7.057
## 1       colMins(z)          100  25.585     1.79    15.509    9.852
like image 11
Ben Bolker Avatar answered Nov 08 '22 23:11

Ben Bolker


Here is one that is faster on square and wide matrices. It uses pmin on the rows of the matrix. (If you know a faster way of splitting the matrix into its rows, please feel free to edit)

do.call(pmin, lapply(1:nrow(mat), function(i)mat[i,]))

Using the same benchmark as @RicardoSaporta:

$`Square Matrix`
          test elapsed relative
3 pmin.on.rows   1.370    1.000
1          apl   1.455    1.062
2         cmin   2.075    1.515

$`Wide Matrix`
      test elapsed relative
3 pmin.on.rows   0.926    1.000
2         cmin   2.302    2.486
1          apl   5.058    5.462

$`Tall Matrix`
          test elapsed relative
1          apl   1.175    1.000
2         cmin   2.126    1.809
3 pmin.on.rows   5.813    4.947
like image 9
flodel Avatar answered Nov 09 '22 00:11

flodel


Update 2014-12-17:

colMins() et al. were made significantly faster in a recent version of matrixStats. Here's an updated benchmark summary using matrixStats 0.12.2 showing that it ("cmin") is ~5-20 times faster than the second fastest approach:

$`Square Matrix`
     test elapsed relative
2    cmin   0.216    1.000
1     apl   4.200   19.444
5 pmn.int   4.604   21.315
4     pmn   5.136   23.778
3    lapl  12.546   58.083

$`Tall Matrix`
     test elapsed relative
2    cmin   0.262    1.000
1     apl   3.006   11.473
5 pmn.int  18.605   71.011
3    lapl  22.798   87.015
4     pmn  27.583  105.279

$`Wide-short Matrix`
     test elapsed relative
2    cmin   0.346    1.000
5 pmn.int   3.766   10.884
4     pmn   3.955   11.431
3    lapl  13.393   38.708
1     apl  19.187   55.454

$`Wide-tall Matrix`
     test elapsed relative
2    cmin   5.591    1.000
5 pmn.int  39.466    7.059
4     pmn  40.265    7.202
1     apl  67.151   12.011
3    lapl 158.035   28.266

$`Tiny Sq Matrix`
     test elapsed relative
2    cmin   0.011    1.000
5 pmn.int   0.135   12.273
4     pmn   0.178   16.182
1     apl   0.202   18.364
3    lapl   0.269   24.455

Previous comment 2013-10-09:
FYI, since matrixStats v0.8.7 (2013-07-28), colMins() is roughly twice as fast as before. The reason is that the function previously utilized colRanges(), which explains the "surprisingly slow" results reported here. Same speed is seen for colMaxs(), rowMins() and rowMaxs().

like image 6
HenrikB Avatar answered Nov 09 '22 00:11

HenrikB