I am searching for a method to create, in a fast way a random matrix A
with the follwing properties:
A = transpose(A)
A(i,i) = 0
for all iA(i,j) >= 0
for all i, jsum(A) =~ degree
; the sum of rows are randomly distributed by a distribution I want to specify (here =~
means approximate equality).The distribution degree
comes from a matrix orig
, specifically degree=sum(orig)
, thus I know that matrices with this distribution exist.
For example: orig=[0 12 7 5; 12 0 1 9; 7 1 0 3; 5 9 3 0]
orig =
0 12 7 5
12 0 1 9
7 1 0 3
5 9 3 0
sum(orig)=[24 22 11 17];
Now one possible matrix A=[0 11 5 8, 11 0 4 7, 5 4 0 2, 8 7 2 0]
is
A =
0 11 5 8
11 0 4 7
5 4 0 2
8 7 2 0
with sum(A)=[24 22 11 17]
.
I am trying this for quite some time, but unfortunatly my two ideas didn't work:
version 1:
I switch Nswitch
times two random elements: A(k1,k3)--; A(k1,k4)++; A(k2,k3)++; A(k2,k4)--;
(the transposed elements aswell).
Unfortunatly, Nswitch = log(E)*E
(with E=sum(sum(nn))
) in order that the Matrices are very uncorrelated. As my E > 5.000.000
, this is not feasible (in particular, as I need at least 10 of such matrices).
version 2:
I create the matrix according to the distribution from scratch. The idea is, to fill every row i
with degree(i)
numbers, based on the distribution of degree
:
nn=orig;
nnR=zeros(size(nn));
for i=1:length(nn)
degree=sum(nn);
howmany=degree(i);
degree(i)=0;
full=rld_cumsum(degree,1:length(degree));
rr=randi(length(full),[1,howmany]);
ff=full(rr);
xx=i*ones([1,length(ff)]);
nnR = nnR + accumarray([xx(:),ff(:)],1,size(nnR));
end
A=nnR;
However, while sum(A')=degree
, sum(A)
systematically deviates from degree
, and I am not able to find the reason for that.
Small deviations from degree
are fine of course, but there seem to be systmatical deviations in particulat of the matrices contain in some places large numbers.
I would be very happy if somebody could either show me a fast method for version1, or a reason for the systematic deviation of the distribution in version 2, or a method to create such matrices in a different way. Thank you!
Edit:
This is the problem in matsmath's proposed solution: Imagine you have the matrix:
orig =
0 12 3 1
12 0 1 9
3 1 0 3
1 9 3 0
with r(i)=[16 22 7 13]
.
My random matrix starts looks like
A =
0 7 3 6
7 0 . .
3 . 0 .
6 . . 0
with the new row sum vector rnew=[r(2)-p(2),...,r(n)-p(n)]=[15 4 7]
Second iteration (here the problem occures):
Edit2:
This is the code representing (to the best of my knowledge) the solution posted by David Eisenstat:
orig=[0 12 3 1; 12 0 1 9; 3 1 0 3; 1 9 3 0];
w=[2.2406 4.6334 0.8174 1.6902];
xfull=zeros(4);
for ii=1:1000
rndmat=[poissrnd(w(1),1,4); poissrnd(w(2),1,4); poissrnd(w(3),1,4); poissrnd(w(4),1,4)];
kkk=rndmat.*(ones(4)-eye(4)); % remove diagonal
hhh=sum(sum(orig))/sum(sum(kkk))*kkk; % normalisation
xfull=xfull+hhh;
end
xf=xfull/ii;
disp(sum(orig)); % gives [16 22 7 13]
disp(sum(xf)); % gives [14.8337 9.6171 18.0627 15.4865] (obvious systematic problem)
disp(sum(xf')) % gives [13.5230 28.8452 4.9635 10.6683] (which is also systematically different from [16, 22, 7, 13]
Since it's enough to approximately preserve the degree sequence, let me propose a random distribution where each entry above the diagonal is chosen according to a Poisson distribution. My intuition is that we want to find weights w_i
such that the i,j
entry for i != j
has mean w_i*w_j
(all of the diagonal entries are zero). This gives us a nonlinear system of equations:
for all i, (sum_{j != i} w_i*w_j) = d_i,
where d_i
is the degree of i
. Equivalently,
for all i, w_i * (sum_j w_j) - w_i^2 = d_i.
The latter can be solved by applying Newton's method as described below from a starting solution of w_i = d_i / sqrt(sum_j d_j)
.
Once we have the w_i
s, we can sample repeatedly using poissrnd
to generate samples of multiple Poisson distributions at once.
(If I have time, I'll try implementing this in numpy.)
The Jacobian matrix of the equation system for a 4 by 4 problem is
(w_2 + w_3 + w_4) w_1 w_1 w_1
w_2 (w_1 + w_3 + w_4) w_2 w_2
w_3 w_3 (w_1 + w_2 + w_4) w_3
w_4 w_4 w_4 (w_1 + w_2 + w_3).
In general, let A
be a diagonal matrix where A_{i,i} = sum_j w_j - 2*w_i
. Let u = [w_1, ..., w_n]'
and v = [1, ..., 1]'
. The Jacobian can be written J = A + u*v'
. The inverse is given by the Sherman--Morrison formula
A^-1*u*v'*A^-1
J^-1 = (A + u*v')^-1 = A^-1 - -------------- .
1 + v'*A^-1*u
For the Newton step, we need to compute J^-1*y
for some given y
. This can be done straightforwardly in time O(n)
using the above equation. I'll add more detail when I get the chance.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With