Assuming we have a square matrix M, e.g.,
set.seed(1)
M <- matrix(rnorm(5*5), 5, 5)
> M
[,1] [,2] [,3] [,4] [,5]
[1,] -0.6264538 -0.8204684 1.5117812 -0.04493361 0.91897737
[2,] 0.1836433 0.4874291 0.3898432 -0.01619026 0.78213630
[3,] -0.8356286 0.7383247 -0.6212406 0.94383621 0.07456498
[4,] 1.5952808 0.5757814 -2.2146999 0.82122120 -1.98935170
[5,] 0.3295078 -0.3053884 1.1249309 0.59390132 0.61982575
I am wondering if there is an efficient way to find the sub-matrix whose determinant is the maximum among all sub-matrices. The sub-matrix must be larger than 1x1 but no larger than 5x5. Some sub-matrix examples are shown below:
> M[c(1,5),c(2,3)]
[,1] [,2]
[1,] -0.8204684 1.511781
[2,] -0.3053884 1.124931
> M[c(1,2,4),c(1,4,5)]
[,1] [,2] [,3]
[1,] -0.6264538 -0.04493361 0.9189774
[2,] 0.1836433 -0.01619026 0.7821363
[3,] 1.5952808 0.82122120 -1.9893517
> M[1:4,2:5]
[,1] [,2] [,3] [,4]
[1,] -0.8204684 1.5117812 -0.04493361 0.91897737
[2,] 0.4874291 0.3898432 -0.01619026 0.78213630
[3,] 0.7383247 -0.6212406 0.94383621 0.07456498
[4,] 0.5757814 -2.2146999 0.82122120 -1.98935170
I can do it in a brute-force manner, i.e., by iterating through all possible sub-matrices, but I believe there must be some optimization approach that makes this easier.
I would prefer to see solutions with CVXR, but I am not sure whether this optimization problem can be formulated in a convex manner. Can anyone help? Otherwise, other optimization packages are also welcome!
I took Allan Cameron's solution and compared it with a heuristic, Threshold Accepting (TA; a variant of Simulated Annealing). Essentially, it starts with a random submatrix and then incrementally changes this submatrix, e.g. by exchanging row indices, or by adding or removing a row and a column.
A solution would be coded as a list, giving the row and column indices. So for a matrix of size 5x5, one candidate solution might be
x
## [[1]]
## [1] TRUE FALSE FALSE TRUE FALSE
##
## [[2]]
## [1] TRUE FALSE TRUE FALSE FALSE
Such a solution is changed through a neighbourhood function, nb. For instance:
nb(x)
## [[1]]
## [1] TRUE FALSE FALSE TRUE TRUE
##
## [[2]]
## [1] TRUE FALSE TRUE TRUE FALSE
## ^^^^^
Given such a solution, we'll need an objective function.
OF <- function(x, M)
-det(M[x[[1]], x[[2]], drop = FALSE])
Since the implementation of TA I'll use minimises, I've put a minus in front of the determinant.
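To make the sign convention concrete, here is a small check of OF on the question's matrix, using the candidate solution shown above (a sketch; the index vectors c(1, 4) and c(1, 3) merely restate the logical vectors):

```r
set.seed(1)
M <- matrix(rnorm(5 * 5), 5, 5)

OF <- function(x, M)
    -det(M[x[[1]], x[[2]], drop = FALSE])

## the candidate solution from above: rows 1 and 4, columns 1 and 3
x <- list(c(TRUE, FALSE, FALSE, TRUE, FALSE),
          c(TRUE, FALSE, TRUE, FALSE, FALSE))

OF(x, M)  # equals -det(M[c(1, 4), c(1, 3)])
```

A larger determinant of the submatrix thus means a smaller (more negative) objective value, which is what a minimiser rewards.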
A neighbourhood function nb could be this (though it could certainly be improved):
nb <- function(x, ...) {
if (sum(x[[1L]]) > 0L &&
sum(x[[1L]]) < length(x[[1L]]) &&
runif(1) > 0.5) {
rc <- if (runif(1) > 0.5)
1 else 2
select1 <- which( x[[rc]])
select2 <- which(!x[[rc]])
size <- min(length(select1), length(select2))
size <- sample.int(size, 1)
i <- select1[sample.int(length(select1), size)]
j <- select2[sample.int(length(select2), size)]
x[[rc]][i] <- !x[[rc]][i]
x[[rc]][j] <- !x[[rc]][j]
} else {
i <- sample.int(length(x[[1L]]), 1)
if (x[[1L]][i]) {
select <- which( x[[2L]])
} else {
select <- which(!x[[2L]])
}
j <- select[sample.int(length(select), 1)]
x[[1L]][i] <- !x[[1L]][i]
x[[2L]][j] <- !x[[2L]][j]
}
x
}
Essentially, nb flips a coin and then either rearranges row or column indices (i.e. leaves the size of the submatrix unchanged), or adds or removes a row and a column.
Finally, I create a helper function to create random initial solutions.
x0 <- function() {
k <- sample(n, 1)
x1 <- logical(n)
x1[sample(n, k)] <- TRUE
x2 <- sample(x1)
list(x1, x2)
}
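Because x2 is a random permutation of x1, both vectors always contain the same number of TRUE entries, so every random start selects a square submatrix. A quick sanity check (a sketch, with n fixed at 5):

```r
n <- 5
x0 <- function() {
    k <- sample(n, 1)
    x1 <- logical(n)
    x1[sample(n, k)] <- TRUE
    x2 <- sample(x1)  # permutation of x1: same number of TRUEs
    list(x1, x2)
}

## every random start selects as many rows as columns
all(replicate(1000, {
    x <- x0()
    sum(x[[1]]) == sum(x[[2]])
}))
## [1] TRUE
```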
We can run Threshold Accepting. I use an implementation called TAopt, provided in the NMOF package (which I maintain). For good style, I do 10 restarts and keep the best result.
n <- 5
M <- matrix(rnorm(n*n), n, n)
max_det(M)$indices
## $rows
## [1] 1 2 4
##
## $columns
## [1] 2 3 5
library("NMOF")
restartOpt(TAopt, 10, OF,
list(x0 = x0,
neighbour = nb,
printBar = FALSE,
printDetail = FALSE,
q = 0.9,
nI = 1000, drop0 = TRUE),
M = M, best.only = TRUE)$xbest
## [[1]]
## [1] TRUE TRUE FALSE TRUE FALSE
##
## [[2]]
## [1] FALSE TRUE TRUE FALSE TRUE
So we get the same rows/columns. I ran the following small experiment, for increasing sizes of M, from 2 to 20. Each time I compare the solution of TA with the optimal one, and I also record the times (in seconds) that TA and complete enumeration require.
set.seed(134345)
message(format(c("Size",
"Optimum",
"TA",
"Time optimum",
"Time TA"), width = 13, justify = "right"))
for (i in 2:20) {
n <- i
M <- matrix(rnorm(n*n), n, n)
t.opt <- system.time(opt <- max_det(M)$max_determinant)
t.ta <- system.time(ta <- -restartOpt(TAopt, 10, OF,
list(x0 = x0,
neighbour = nb,
printBar = FALSE,
printDetail = FALSE,
q = 0.9,
nI = 1000, drop0 = TRUE),
M = M, best.only = TRUE)$OFvalue)
message(format(i, width = 13),
format(round(opt, 2), width = 13),
format(round(ta, 2), width = 13),
format(round(t.opt[[3]],1), width = 13),
format(round(t.ta[[3]],1), width = 13))
}
The results:
Size Optimum TA Time optimum Time TA
2 NA 1.22 0 0.7
3 1.46 1.46 0 0.6
4 2.33 2.33 0 0.7
5 11.75 11.75 0 0.7
6 9.33 9.33 0 0.7
7 9.7 9.7 0 0.7
8 126.38 126.38 0.1 0.7
9 87.5 87.5 0.3 0.7
10 198.63 198.63 1.3 0.7
11 1019.23 1019.23 5.1 0.7
12 34753.64 34753.64 20 0.7
13 16122.22 16122.22 80.2 0.7
14 168943.9 168943.9 325.3 0.7
15 274669.6 274669.6 1320.8 0.7
16 5210298 5210298 5215.4 0.7
So, at least until size 16x16, both methods return the same result. But TA needs a constant time of less than one second (iterations are fixed at 1000).
Since it has been four days without an answer I thought I would get the ball rolling with a working generalizable solution. Unfortunately, it falls into the brute force category, although for a 5 x 5 matrix it is fairly fast, completing in about 5ms:
max_det <- function(M) {
  if (diff(dim(M)) != 0) stop("max_det requires a square matrix")
  n <- dim(M)[1]
  ## all row (or column) index combinations of size 2 to n
  s <- lapply(seq(n)[-1], function(x) combn(seq(n), x))
  ## determinants of all submatrices: entry [j, i] pairs the j-th row
  ## combination with the i-th column combination
  all_dets <- lapply(s, function(m) {
    d <- apply(m, 2, function(i) apply(m, 2, function(j) det(M[j, i])))
    as.matrix(d)  # keep a matrix even when only one combination remains
  })
  i <- which.max(sapply(all_dets, max))
  subs <- which(all_dets[[i]] == max(all_dets[[i]]), arr.ind = TRUE)
  subs <- subs[1, ]  # in case of ties, keep the first maximum
  sub_M <- M[s[[i]][, subs[1]], s[[i]][, subs[2]]]
  list(max_determinant = det(sub_M),
       indices = list(rows = s[[i]][, subs[1]], columns = s[[i]][, subs[2]]),
       submatrix = sub_M)
}
The format of the output is:
max_det(M)
#> $max_determinant
#> [1] 4.674127
#>
#> $indices
#> $indices$rows
#> [1] 3 4 5
#>
#> $indices$columns
#> [1] 1 3 4
#>
#>
#> $submatrix
#> [,1] [,2] [,3]
#> [1,] -0.8356286 -0.6212406 0.9438362
#> [2,] 1.5952808 -2.2146999 0.8212212
#> [3,] 0.3295078 1.1249309 0.5939013
The problem of course is that this doesn't scale well to larger matrices. Although it still works:
set.seed(1)
M <- matrix(rnorm(10 * 10), 10, 10)
max_det(M)
#> $max_determinant
#> [1] 284.5647
#>
#> $indices
#> $indices$rows
#> [1] 1 3 4 5 6 8 9 10
#>
#> $indices$columns
#> [1] 2 3 4 6 7 8 9 10
#>
#>
#> $submatrix
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 1.51178117 0.91897737 1.35867955 0.3981059 2.40161776 0.475509529
#> [2,] -0.62124058 0.07456498 0.38767161 0.3411197 0.68973936 0.610726353
#> [3,] -2.21469989 -1.98935170 -0.05380504 -1.1293631 0.02800216 -0.934097632
#> [4,] 1.12493092 0.61982575 -1.37705956 1.4330237 -0.74327321 -1.253633400
#> [5,] -0.04493361 -0.05612874 -0.41499456 1.9803999 0.18879230 0.291446236
#> [6,] 0.94383621 -1.47075238 -0.05931340 -1.0441346 1.46555486 0.001105352
#> [7,] 0.82122120 -0.47815006 1.10002537 0.5697196 0.15325334 0.074341324
#> [8,] 0.59390132 0.41794156 0.76317575 -0.1350546 2.17261167 -0.589520946
#> [,7] [,8]
#> [1,] -0.5686687 -0.5425200
#> [2,] 1.1780870 1.1604026
#> [3,] -1.5235668 0.7002136
#> [4,] 0.5939462 1.5868335
#> [5,] 0.3329504 0.5584864
#> [6,] -0.3041839 -0.5732654
#> [7,] 0.3700188 -1.2246126
#> [8,] 0.2670988 -0.4734006
I'm getting over a second to find this solution for a 10 x 10 matrix.
I think this solution has exponential complexity (the number of square sub-matrices grows roughly like 4^n), so you can forget about it for anything even a little larger than a 10 x 10 matrix. I have a feeling there should be an O(n³) solution, but my math isn't good enough to figure it out.
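For what it's worth, the number of candidate sub-matrices can be counted directly: choosing k of n rows and k of n columns gives choose(n, k)^2 sub-matrices of size k, and summing over k >= 2 gives choose(2n, n) - 1 - n^2 by the Vandermonde identity. A small sketch (the helper name n_submatrices is mine):

```r
## number of square submatrices of size >= 2x2 in an n x n matrix:
## sum_{k=2}^{n} choose(n, k)^2 = choose(2n, n) - 1 - n^2
n_submatrices <- function(n) choose(2 * n, n) - 1 - n^2

n_submatrices(5)   ## 226
n_submatrices(10)  ## 184655
n_submatrices(16)  ## 601080133
```

So the search space roughly quadruples with every extra row and column, which matches the enumeration timings above.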
I guess that at least gives a benchmark for others to beat with more sophisticated methods...