Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MATLAB: Fast creation of random symmetric Matrix with fixed degree (sum of rows)

I am searching for a method to create, in a fast way a random matrix A with the follwing properties:

  • A = transpose(A)
  • A(i,i) = 0 for all i
  • A(i,j) >= 0 for all i, j
  • sum(A) =~ degree; the sum of rows are randomly distributed by a distribution I want to specify (here =~ means approximate equality).

The distribution degree comes from a matrix orig, specifically degree=sum(orig), thus I know that matrices with this distribution exist.

For example: orig=[0 12 7 5; 12 0 1 9; 7 1 0 3; 5 9 3 0]

orig =
 0    12     7     5
12     0     1     9
 7     1     0     3
 5     9     3     0

 sum(orig)=[24 22 11 17];

Now one possible matrix A=[0 11 5 8, 11 0 4 7, 5 4 0 2, 8 7 2 0] is

A = 
 0    11     5     8
11     0     4     7
 5     4     0     2
 8     7     2     0

with sum(A)=[24 22 11 17].

I am trying this for quite some time, but unfortunatly my two ideas didn't work:


version 1:

I switch Nswitch times two random elements: A(k1,k3)--; A(k1,k4)++; A(k2,k3)++; A(k2,k4)--; (the transposed elements aswell).

Unfortunatly, Nswitch = log(E)*E (with E=sum(sum(nn))) in order that the Matrices are very uncorrelated. As my E > 5.000.000, this is not feasible (in particular, as I need at least 10 of such matrices).


version 2:

I create the matrix according to the distribution from scratch. The idea is, to fill every row i with degree(i) numbers, based on the distribution of degree:

nn=orig;
nnR=zeros(size(nn));
for i=1:length(nn)
    degree=sum(nn);
    howmany=degree(i);
    degree(i)=0;
    full=rld_cumsum(degree,1:length(degree));
    rr=randi(length(full),[1,howmany]);
    ff=full(rr);
    xx=i*ones([1,length(ff)]);
    nnR = nnR + accumarray([xx(:),ff(:)],1,size(nnR));
end
A=nnR;

However, while sum(A')=degree, sum(A) systematically deviates from degree, and I am not able to find the reason for that.

Small deviations from degree are fine of course, but there seem to be systmatical deviations in particulat of the matrices contain in some places large numbers.


I would be very happy if somebody could either show me a fast method for version1, or a reason for the systematic deviation of the distribution in version 2, or a method to create such matrices in a different way. Thank you!


Edit:

This is the problem in matsmath's proposed solution: Imagine you have the matrix:

orig = 
 0    12     3     1
12     0     1     9
 3     1     0     3
 1     9     3     0

with r(i)=[16 22 7 13].

  • Step 1: r(1)=16, my random integer partition is p(i)=[0 7 3 6].
  • Step 2: Check that all p(i)<=r(i), which is the case.
  • Step 3:

My random matrix starts looks like

A = 
 0    7     3     6
 7    0     .     .
 3    .     0     .
 6    .     .     0

with the new row sum vector rnew=[r(2)-p(2),...,r(n)-p(n)]=[15 4 7]

Second iteration (here the problem occures):

  • Step 1: rnew(1)=15, my random integer partition is p(i)=[0 A B]: rnew(1)=15=A+B.
  • Step 2: Check that all p(i)<=rnew(i), which gives A<=4, B<=7. So A+B<=11, but A+B has to be 15. contradiction :-/

Edit2:

This is the code representing (to the best of my knowledge) the solution posted by David Eisenstat:

orig=[0 12 3 1; 12 0 1 9; 3 1 0 3; 1 9 3 0];
w=[2.2406 4.6334 0.8174 1.6902];
xfull=zeros(4);

for ii=1:1000
    rndmat=[poissrnd(w(1),1,4); poissrnd(w(2),1,4); poissrnd(w(3),1,4); poissrnd(w(4),1,4)];
    kkk=rndmat.*(ones(4)-eye(4)); % remove diagonal
    hhh=sum(sum(orig))/sum(sum(kkk))*kkk; % normalisation
    xfull=xfull+hhh;
end

xf=xfull/ii;
disp(sum(orig)); % gives [16    22     7    13]
disp(sum(xf));   % gives [14.8337    9.6171   18.0627   15.4865] (obvious systematic problem)
disp(sum(xf'))   % gives [13.5230   28.8452    4.9635   10.6683] (which is also systematically different from [16, 22, 7, 13]
like image 698
Mario Krenn Avatar asked Apr 19 '16 18:04

Mario Krenn


1 Answers

Since it's enough to approximately preserve the degree sequence, let me propose a random distribution where each entry above the diagonal is chosen according to a Poisson distribution. My intuition is that we want to find weights w_i such that the i,j entry for i != j has mean w_i*w_j (all of the diagonal entries are zero). This gives us a nonlinear system of equations:

for all i, (sum_{j != i} w_i*w_j) = d_i,

where d_i is the degree of i. Equivalently,

for all i, w_i * (sum_j w_j) - w_i^2 = d_i.

The latter can be solved by applying Newton's method as described below from a starting solution of w_i = d_i / sqrt(sum_j d_j).

Once we have the w_is, we can sample repeatedly using poissrnd to generate samples of multiple Poisson distributions at once.


(If I have time, I'll try implementing this in numpy.)

The Jacobian matrix of the equation system for a 4 by 4 problem is

(w_2 + w_3 + w_4) w_1               w_1               w_1
w_2               (w_1 + w_3 + w_4) w_2               w_2
w_3               w_3               (w_1 + w_2 + w_4) w_3
w_4               w_4               w_4               (w_1 + w_2 + w_3).

In general, let A be a diagonal matrix where A_{i,i} = sum_j w_j - 2*w_i. Let u = [w_1, ..., w_n]' and v = [1, ..., 1]'. The Jacobian can be written J = A + u*v'. The inverse is given by the Sherman--Morrison formula

                              A^-1*u*v'*A^-1
J^-1 = (A + u*v')^-1 = A^-1 - -------------- .
                              1 + v'*A^-1*u

For the Newton step, we need to compute J^-1*y for some given y. This can be done straightforwardly in time O(n) using the above equation. I'll add more detail when I get the chance.

like image 56
David Eisenstat Avatar answered Oct 15 '22 07:10

David Eisenstat