Efficiently enumerate all possible matrices with given constraints

Question

Background

Assuming we have a family of matrices Ms of size n-by-n, which should meet the following requirements:

Entries of the matrix are either 0 or 1, i.e., boolean, but diagonal entries are always 0s
The matrix is symmetric, i.e., M == t(M)
There is a constant row (or equivalently, column) sum constraint p, such that all(rowSums(M)==p) == TRUE

Questions

Is there any potential features from the particular matrix structure, e.g., symmetry, boolean, or somethings else, such that we can take benefits from that to improve the efficiency of searching?
It seems that the problem can be interpreted as a graph theory way. For example, the matrix is an adjacent matrix of a graph that consists of n vertices with both indegrees and outdegrees being equal to p. This can be done by sample_degseq, but we may have to find all of its isomorphic mappings. How can we do that if we use igraph approaches?

My code so far looks like below, but it is slow when we have larger n or p (and also I am not sure if some of matrices are missed during the enumeration).

f <- function(n, p) {
    # helper function to check if requirement holds
    checker <- function(M, p, i = nrow(M) - 1) {
        rs <- rowSums(M)
        if ((i == nrow(M) - 1)) {
            return(all(rs == p))
        } else {
            return(all(rs[1:i] == p) && all(rs[-(1:i)] <= p))
        }
    }

    # start from all-0's matrix
    lst <- list(matrix(0, n, n))
    for (i in 1:(n - 1)) {
        js <- (i + 1):n
        r <- list()
        for (mat in lst) {
            k <- p - sum(mat[i, ])
            if (k == 0) {
                if (checker(mat, p, i)) {
                    r <- c(r, list(mat))
                }
            }
            if (k > 0 && length(js) >= k) {
                idx <- combn(length(js), k, \(v) js[v], simplify = FALSE)
                for (u in idx) {
                    mm <- mat
                    mm[i, u] <- 1
                    mm[u, i] <- 1
                    if (checker(mm, p, i)) {
                        r <- c(r, list(mm))
                    }
                }
            }
        }
        lst <- r
    }
    lst
}

Examples

For n <- 4 and p <- 2, we can find 3 matrices

[[1]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    1    0
[2,]    1    0    0    1
[3,]    1    0    0    1
[4,]    0    1    1    0

[[2]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    0    1
[2,]    1    0    1    0
[3,]    0    1    0    1
[4,]    1    0    1    0

[[3]]
     [,1] [,2] [,3] [,4]
[1,]    0    0    1    1
[2,]    0    0    1    1
[3,]    1    1    0    0
[4,]    1    1    0    0

For n <- 3 and p <- 2, we can find only 1 matrix

[[1]]
     [,1] [,2] [,3]
[1,]    0    1    1
[2,]    1    0    1
[3,]    1    1    0

Szabolcs · Accepted Answer

The question boils down to generating all realizations of a degree sequence as a simple undirected graphs. igraph presently contains the following related tools:

realize_degseq() creates a single realization (we want all).
sample_degseq() does a random sampling of realizations, and some of the methods it implements ensure uniform sampling.
is_graphical() tests if any realization exists at all. If it does, we call the degree sequence graphical. This will be the basic building block for our algorithm below.

Suppose our degree sequence is {2, 3, 1, 2}. We can draw this as follows:

enter image description here

To generate all realizations, we need to connect up the stubs in all possible ways. Some of these ways will lead to multi-edges or self-loops, i.e. a non-simple graphs. We need to discard these.

The trick to efficient generation of all simple realizations is to detect early if we'd be forced to create a non-simple graph and stop. We can do this as follows. Let us take the first vertex, which has d_1 = 2 stubs. There are n-1 = 3 other vertices we can connect these two stubs to. There are (n-1) \choose d_1 = 3 \choose 2 = 3 ways to do so, namely the following:

enter image description here

Can we reach a simple graph from all three of these configurations? Notice that after connecting all stubs of vertex 1, what remains is a degree sequence of vertices 2, 3, 4, which may or may not be graphical. We call this the reduced degree sequence. For these three cases, we have:

2 0 2
2 1 1
3 0 1

(Discarding the first vertex whose degree now became 0, and no longer needs to be considered.)

Only the second of these is graphical, so we only need to continue the construction along this path.

> is_graphical(c(2,0,2))
[1] FALSE
> is_graphical(c(2,1,1))
[1] TRUE
> is_graphical(c(3,0,1))
[1] FALSE

Taking then the second remaining degree sequence, we can recursively continue the construction in the same manner, obtaining the single possible realization for our starting degree sequence:

enter image description here

Expressing this recursive constructions in pseudocode:

# degrees is a vector of degrees
# vertices is a vector of corresponding vertex names
# generate(degrees, vertices) returns all realizations 
# of the given degrees as simple graphs
generate(degrees, vertices):
    RESULT = {} # empty list
    d1 = degrees[1]  # first degree
    v1 = vertices[1] # first vertex
    drest = degrees[2:]  # rest of degrees
    vrest = vertices[2:] # rest of vertices
    for each size d1 subset S of vrest:
        dreduced = drest
        dreduced[S] -= 1 # decrement by one elements in S to obtain the reduced degree sequence
        if is_graphical(dreduced):
            for each graph in generate(dreduced, vrest),
            add vertex v1 to it and connect it to each of S,
            then append the graph to RESULT
    return RESULT

I have an implementation of this in Mathematica, but I am not fluent enough in R to be able to take the time to implement it in that language.

I should note that there are more efficient ways to do the enumeration, but they are a bit more complicated. Here we take the first vertex, and connect all its stubs in all possible ways to the rest of vertices. Then out of these possibilities, we only consider the feasible ones, i.e. those that led to a graphical reduced degree sequence. It is possible to take just a single stub of the first vertex, connect it, and already decide if it is feasible to continue. This can be done using the star-constrained graphicality theorem from this paper.

G. Grothendieck · Answer

1) This will generate N matrices, not necessarily distinct, that satisfy the row and column totals constraints and then filter that down to the ones that satisfy the other constraints and then reduce that down to the unique ones. It will not necessarily give all possible such matrices but the code is short and maybe good enough.

set.seed(123)
n <- 4      # = no of rows = no of cols
p <- 2      # = row sums = col sums
N <- 1000   # = initial no of matrices to generate 

L <- r2dtable(N, rep(p,n), rep(p,n))
f <- function(x) max(x) == 1 && all(diag(x) == 0) && all(x == t(x))
L2 <- unique(Filter(f, L))
L2

giving:

[[1]]
     [,1] [,2] [,3] [,4]
[1,]    0    0    1    1
[2,]    0    0    1    1
[3,]    1    1    0    0
[4,]    1    1    0    0

[[2]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    1    0
[2,]    1    0    0    1
[3,]    1    0    0    1
[4,]    0    1    1    0

[[3]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    0    1
[2,]    1    0    1    0
[3,]    0    1    0    1
[4,]    1    0    1    0

2) If A is one feasible solution, i.e. an n by n matrix satisfying all constraints then A[P, P] is too for any permutation vector P so if the hypothesis that this generates all solutions is true then find one A using (1) or other method and apply all possible permutations and take the unique elements. If this were done a smaller N in (1) could be used since we only need one solution from it. This could be optimized further since not all permutations are needed but this is simple and has short code so maybe it is sufficient.

library(RcppAlgos)

L3 <- unique(permuteGeneral(4, FUN = \(x) L2[[1]][x, x]))
L3  # same as L2 except for order

giving:

[[1]]
     [,1] [,2] [,3] [,4]
[1,]    0    0    1    1
[2,]    0    0    1    1
[3,]    1    1    0    0
[4,]    1    1    0    0

[[2]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    0    1
[2,]    1    0    1    0
[3,]    0    1    0    1
[4,]    1    0    1    0

[[3]]
     [,1] [,2] [,3] [,4]
[1,]    0    1    1    0
[2,]    1    0    0    1
[3,]    1    0    0    1
[4,]    0    1    1    0

3) This uses integer linear programming to get the solutions. We must provide n, p (as per question) and nsolns (the number of solutions desired). It gives the same list as (1) and (2) here except possibly order.

library (lpSolve)

# inputs
n <- 4
p <- 2
nsolns <- 5 # no of solutions desired

d <- 0*diag(n)

# constraint matrix on rows (transposed)
cRows <- sapply(1:n, function(i) +(row(d) == i))

# constraints on columns (transposed)
cCols <- sapply(1:n, function(j) +(col(d) == j))

# constraint matrix to ensure symmetry (transposed)
cSym <- combn(n,2, FUN = function(ij) {
 i <- ij[1]
 j <- ij[2]
 d[i,j] <- 1
 d[j,i] <- -1
 c(d)
})

# constraint matrix on diagonal (transposed)
cDiag <- sapply(1:n, function(i) {
 d[i,i] <- 1
 c(d)
})

# combine constraints into single constraint matrix
L <- list(cRows, cCols, cDiag, cSym)
M <- t(do.call("cbind", L))

# define RHS of constraints
b <- rep(c(p,p,0,0), sapply(L, ncol))

# find solutions
res <- lp(objective.in = 0*M[1,], 
          const.mat = M, 
          const.dir = "=",
          const.rhs = b, 
          all.bin = TRUE,
          num.bin.solns = nsolns)

# rm junk at end of res$sokutions
solns.0 <- head(res$solution, nsolns * n * n)

# create 3d array of solutions
solns.a <- array(solns.0, c(n,n,nsolns))

# convert to list of solutions 
solns.L <- lapply(1:nsolns, function(i) solns.a[,,i])

# it will always return nsolns even if there are
# fewer so check for bad ones at end
solns <- Filter(function(x) !all(x == 0) &&
 all(x %in% 0:1), solns.L)

solns

Update

Have simplified (2) as per James Wood comments. Added (3).

Efficiently enumerate all possible matrices with given constraints

Tags:

performance

algorithm

r

matrix

igraph

Background

Questions

Examples

ThomasIsCoding

2 Answers

Szabolcs

Update

G. Grothendieck

Recent Activity

Donate For Us

Efficiently enumerate all possible matrices with given constraints

Tags:

performance

algorithm

r

matrix

igraph

Background

Questions

Examples

ThomasIsCoding

2 Answers

Szabolcs

Update

G. Grothendieck

Related questions

Recent Activity

Donate For Us