Assuming we have a square matrix M, e.g.,
set.seed(1)
M <- matrix(rnorm(5*5), 5, 5)
> M
[,1] [,2] [,3] [,4] [,5]
[1,] -0.6264538 -0.8204684 1.5117812 -0.04493361 0.91897737
[2,] 0.1836433 0.4874291 0.3898432 -0.01619026 0.78213630
[3,] -0.8356286 0.7383247 -0.6212406 0.94383621 0.07456498
[4,] 1.5952808 0.5757814 -2.2146999 0.82122120 -1.98935170
[5,] 0.3295078 -0.3053884 1.1249309 0.59390132 0.61982575
I am wondering if there is an efficient way to find the sub-matrix whose determinant is the maximum among all sub-matrices. The sub-matrix must be larger than 1x1 but no larger than 5x5. Some sub-matrix examples are shown below:
> M[c(1,5),c(2,3)]
[,1] [,2]
[1,] -0.8204684 1.511781
[2,] -0.3053884 1.124931
> M[c(1,2,4),c(1,4,5)]
[,1] [,2] [,3]
[1,] -0.6264538 -0.04493361 0.9189774
[2,] 0.1836433 -0.01619026 0.7821363
[3,] 1.5952808 0.82122120 -1.9893517
> M[1:4,2:5]
[,1] [,2] [,3] [,4]
[1,] -0.8204684 1.5117812 -0.04493361 0.91897737
[2,] 0.4874291 0.3898432 -0.01619026 0.78213630
[3,] 0.7383247 -0.6212406 0.94383621 0.07456498
[4,] 0.5757814 -2.2146999 0.82122120 -1.98935170
I can do it in a brute-force manner, i.e., by iterating through all possible sub-matrices, but I believe there must be some optimization approach that makes this easier.
I would prefer to see solutions with CVXR, but I am not sure whether this optimization problem can be formulated in a convex manner. Can anyone help? Otherwise, other optimization packages are also welcome!
I took Allan Cameron's solution and compared it with a heuristic, Threshold Accepting (TA; a variant of Simulated Annealing). Essentially, it starts with a random submatrix and then incrementally changes this submatrix, e.g. by exchanging row indices, or by adding or removing a row and a column.
A solution would be coded as a list, giving the row and column indices. So for a matrix of size 5x5, one candidate solution might be
x
## [[1]]
## [1] TRUE FALSE FALSE TRUE FALSE
##
## [[2]]
## [1] TRUE FALSE TRUE FALSE FALSE
Such a solution is changed through a neighbourhood function, nb. For instance:
nb(x)
## [[1]]
## [1] TRUE FALSE FALSE TRUE TRUE
##
## [[2]]
## [1] TRUE FALSE TRUE TRUE FALSE
## ^^^^^
Given such a solution, we'll need an objective function.
OF <- function(x, M)
-det(M[x[[1]], x[[2]], drop = FALSE])
Since the implementation of TA I'll use minimises, I've put a minus in front of the determinant.
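To make the sign convention concrete, here is a small check of OF on the question's matrix, using the candidate solution shown above (a sketch; the index vectors c(1, 4) and c(1, 3) merely restate the logical vectors):

```r
set.seed(1)
M <- matrix(rnorm(5 * 5), 5, 5)

OF <- function(x, M)
    -det(M[x[[1]], x[[2]], drop = FALSE])

## the candidate solution from above: rows 1 and 4, columns 1 and 3
x <- list(c(TRUE, FALSE, FALSE, TRUE, FALSE),
          c(TRUE, FALSE, TRUE, FALSE, FALSE))

OF(x, M)  # equals -det(M[c(1, 4), c(1, 3)])
```

A larger determinant of the submatrix thus means a smaller (more negative) objective value, which is what a minimiser rewards.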
A neighbourhood function nb could be this (though it could certainly be improved):
nb <- function(x, ...) {
if (sum(x[[1L]]) > 0L &&
sum(x[[1L]]) < length(x[[1L]]) &&
runif(1) > 0.5) {
rc <- if (runif(1) > 0.5)
1 else 2
select1 <- which( x[[rc]])
select2 <- which(!x[[rc]])
size <- min(length(select1), length(select2))
size <- sample.int(size, 1)
i <- select1[sample.int(length(select1), size)]
j <- select2[sample.int(length(select2), size)]
x[[rc]][i] <- !x[[rc]][i]
x[[rc]][j] <- !x[[rc]][j]
} else {
i <- sample.int(length(x[[1L]]), 1)
if (x[[1L]][i]) {
select <- which( x[[2L]])
} else {
select <- which(!x[[2L]])
}
j <- select[sample.int(length(select), 1)]
x[[1L]][i] <- !x[[1L]][i]
x[[2L]][j] <- !x[[2L]][j]
}
x
}
Essentially, nb flips a coin and then either rearranges row or column indices (i.e. leaves the size of the submatrix unchanged), or adds or removes a row and a column.
Finally, I create a helper function to create random initial solutions.
x0 <- function() {
k <- sample(n, 1)
x1 <- logical(n)
x1[sample(n, k)] <- TRUE
x2 <- sample(x1)
list(x1, x2)
}
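Because x2 is a random permutation of x1, both vectors always contain the same number of TRUE entries, so every random start selects a square submatrix. A quick sanity check (a sketch, with n fixed at 5):

```r
n <- 5
x0 <- function() {
    k <- sample(n, 1)
    x1 <- logical(n)
    x1[sample(n, k)] <- TRUE
    x2 <- sample(x1)  # permutation of x1: same number of TRUEs
    list(x1, x2)
}

## every random start selects as many rows as columns
all(replicate(1000, {
    x <- x0()
    sum(x[[1]]) == sum(x[[2]])
}))
## [1] TRUE
```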
We can run Threshold Accepting. I use an implementation called TAopt, provided in the NMOF package (which I maintain). For good style, I do 10 restarts and keep the best result.
n <- 5
M <- matrix(rnorm(n*n), n, n)
max_det(M)$indices
## $rows
## [1] 1 2 4
##
## $columns
## [1] 2 3 5
library("NMOF")
restartOpt(TAopt, 10, OF,
list(x0 = x0,
neighbour = nb,
printBar = FALSE,
printDetail = FALSE,
q = 0.9,
nI = 1000, drop0 = TRUE),
M = M, best.only = TRUE)$xbest
## [[1]]
## [1] TRUE TRUE FALSE TRUE FALSE
##
## [[2]]
## [1] FALSE TRUE TRUE FALSE TRUE
So we get the same rows/columns. I ran the following small experiment, for increasing sizes of M, from 2 to 20. Each time I compare the solution of TA with the optimal one, and I also record the times (in seconds) that TA and complete enumeration require.
set.seed(134345)
message(format(c("Size",
"Optimum",
"TA",
"Time optimum",
"Time TA"), width = 13, justify = "right"))
for (i in 2:20) {
n <- i
M <- matrix(rnorm(n*n), n, n)
t.opt <- system.time(opt <- max_det(M)$max_determinant)
t.ta <- system.time(ta <- -restartOpt(TAopt, 10, OF,
list(x0 = x0,
neighbour = nb,
printBar = FALSE,
printDetail = FALSE,
q = 0.9,
nI = 1000, drop0 = TRUE),
M = M, best.only = TRUE)$OFvalue)
message(format(i, width = 13),
format(round(opt, 2), width = 13),
format(round(ta, 2), width = 13),
format(round(t.opt[[3]],1), width = 13),
format(round(t.ta[[3]],1), width = 13))
}
The results:
Size Optimum TA Time optimum Time TA
2 NA 1.22 0 0.7
3 1.46 1.46 0 0.6
4 2.33 2.33 0 0.7
5 11.75 11.75 0 0.7
6 9.33 9.33 0 0.7
7 9.7 9.7 0 0.7
8 126.38 126.38 0.1 0.7
9 87.5 87.5 0.3 0.7
10 198.63 198.63 1.3 0.7
11 1019.23 1019.23 5.1 0.7
12 34753.64 34753.64 20 0.7
13 16122.22 16122.22 80.2 0.7
14 168943.9 168943.9 325.3 0.7
15 274669.6 274669.6 1320.8 0.7
16 5210298 5210298 5215.4 0.7
So, at least until size 16x16, both methods return the same result. But TA needs a constant time of less than one second (iterations are fixed at 1000).
Since it has been four days without an answer I thought I would get the ball rolling with a working generalizable solution. Unfortunately, it falls into the brute force category, although for a 5 x 5 matrix it is fairly fast, completing in about 5ms:
max_det <- function(M) {
  if (diff(dim(M)) != 0) stop("max_det requires a square matrix")
  n <- dim(M)[1]
  ## all row (or column) index combinations of size 2 to n
  s <- lapply(seq(n)[-1], function(x) combn(seq(n), x))
  ## determinants of all submatrices: entry [j, i] pairs the j-th row
  ## combination with the i-th column combination
  all_dets <- lapply(s, function(m) {
    d <- apply(m, 2, function(i) apply(m, 2, function(j) det(M[j, i])))
    as.matrix(d)  # keep a matrix even when only one combination remains
  })
  i <- which.max(sapply(all_dets, max))
  subs <- which(all_dets[[i]] == max(all_dets[[i]]), arr.ind = TRUE)
  subs <- subs[1, ]  # in case of ties, keep the first maximum
  sub_M <- M[s[[i]][, subs[1]], s[[i]][, subs[2]]]
  list(max_determinant = det(sub_M),
       indices = list(rows = s[[i]][, subs[1]], columns = s[[i]][, subs[2]]),
       submatrix = sub_M)
}
The format of the output is:
max_det(M)
#> $max_determinant
#> [1] 4.674127
#>
#> $indices
#> $indices$rows
#> [1] 3 4 5
#>
#> $indices$columns
#> [1] 1 3 4
#>
#>
#> $submatrix
#> [,1] [,2] [,3]
#> [1,] -0.8356286 -0.6212406 0.9438362
#> [2,] 1.5952808 -2.2146999 0.8212212
#> [3,] 0.3295078 1.1249309 0.5939013
The problem of course is that this doesn't scale well to larger matrices. Although it still works:
set.seed(1)
M <- matrix(rnorm(10 * 10), 10, 10)
max_det(M)
#> $max_determinant
#> [1] 284.5647
#>
#> $indices
#> $indices$rows
#> [1] 1 3 4 5 6 8 9 10
#>
#> $indices$columns
#> [1] 2 3 4 6 7 8 9 10
#>
#>
#> $submatrix
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 1.51178117 0.91897737 1.35867955 0.3981059 2.40161776 0.475509529
#> [2,] -0.62124058 0.07456498 0.38767161 0.3411197 0.68973936 0.610726353
#> [3,] -2.21469989 -1.98935170 -0.05380504 -1.1293631 0.02800216 -0.934097632
#> [4,] 1.12493092 0.61982575 -1.37705956 1.4330237 -0.74327321 -1.253633400
#> [5,] -0.04493361 -0.05612874 -0.41499456 1.9803999 0.18879230 0.291446236
#> [6,] 0.94383621 -1.47075238 -0.05931340 -1.0441346 1.46555486 0.001105352
#> [7,] 0.82122120 -0.47815006 1.10002537 0.5697196 0.15325334 0.074341324
#> [8,] 0.59390132 0.41794156 0.76317575 -0.1350546 2.17261167 -0.589520946
#> [,7] [,8]
#> [1,] -0.5686687 -0.5425200
#> [2,] 1.1780870 1.1604026
#> [3,] -1.5235668 0.7002136
#> [4,] 0.5939462 1.5868335
#> [5,] 0.3329504 0.5584864
#> [6,] -0.3041839 -0.5732654
#> [7,] 0.3700188 -1.2246126
#> [8,] 0.2670988 -0.4734006
I'm getting over a second to find this solution for a 10 x 10 matrix.
I think this solution has exponential complexity (the number of square sub-matrices grows roughly like 4^n), so you can forget about it for anything even a little larger than a 10 x 10 matrix. I have a feeling there should be an O(n³) solution, but my math isn't good enough to figure it out.
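For what it's worth, the number of candidate sub-matrices can be counted directly: choosing k of n rows and k of n columns gives choose(n, k)^2 sub-matrices of size k, and summing over k >= 2 gives choose(2n, n) - 1 - n^2 by the Vandermonde identity. A small sketch (the helper name n_submatrices is mine):

```r
## number of square submatrices of size >= 2x2 in an n x n matrix:
## sum_{k=2}^{n} choose(n, k)^2 = choose(2n, n) - 1 - n^2
n_submatrices <- function(n) choose(2 * n, n) - 1 - n^2

n_submatrices(5)   ## 226
n_submatrices(10)  ## 184655
n_submatrices(16)  ## 601080133
```

So the search space roughly quadruples with every extra row and column, which matches the enumeration timings above.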
I guess that at least gives a benchmark for others to beat with more sophisticated methods...