There has been two threads related to this issue on Stack Overflow:
The above is straightforward, but very expensive. If we have a transition matrix of order n
, then at each iteration we compute a matrix-matrix multiplication at costs O(n ^ 3)
.
Is there a more efficient way to do this? One thing that occurs to me is to use Eigen decomposition. A Markov matrix is known to:
A = E * D * E^{-1}
;The stationary distribution is the Eigen vector associated with the Eigen value of 1, i.e., the first Eigen vector.
Well, the theory is nice, but I can't get it work. Taking the matrix P
in the first linked question:
P <- structure(c(0, 0.1, 0, 0, 0, 0, 0, 0.1, 0.2, 0, 0, 0, 0, 0, 0.2,
0.3, 0, 0, 0.5, 0.4, 0.3, 0.5, 0.4, 0, 0, 0, 0, 0, 0.6, 0.4,
0.5, 0.4, 0.3, 0.2, 0, 0.6), .Dim = c(6L, 6L))
If I do:
Re(eigen(P)$vectors[, 1])
# [1] 0.4082483 0.4082483 0.4082483 0.4082483 0.4082483 0.4082483
What's going on? According to previous questions, the stationary distribution is:
# [1] 0.002590673 0.025906737 0.116580322 0.310880848 0.272020713 0.272020708
Well, to use Eigen decomposition, we need to work with t(P)
.
The definition of a transition probability matrix differs between probability / statistics and linear algebra. In statistics all rows of P
sum to 1, while in linear algebra, all columns of P
sum to 1. So instead of eigen(P)
, we need eigen(t(P))
:
e <- Re(eigen(t(P))$vectors[, 1])
e / sum(e)
# [1] 0.002590673 0.025906737 0.116580322 0.310880848 0.272020713 0.272020708
As we can see, we've only used the first Eigen vector, i.e., the Eigen vector of the largest Eigen value. Therefore, there is no need to compute all Eigen values / vectors using eigen
. The power method can be used to find an Eigen vector of the largest Eigen value. Let's implement this in R:
stydis1 <- function (A) {
n <- dim(A)[1L]
## checking
if (any(.rowSums(A, n, n) != 1))
stop (" 'A' is not a Markov matrix")
## implement power method
e <- runif (n)
oldnorm <- sqrt(c(crossprod(e)))
repeat {
e <- crossprod(A, e)
newnorm <- sqrt(c(crossprod(e)))
if (abs(newnorm / oldnorm - 1) < 1e-8) break
e <- e / newnorm
oldnorm <- newnorm
}
## rescale `e` so that it sums up to 1
c(e / sum(e))
}
stydis1 (P)
# [1] 0.002590673 0.025906737 0.116580322 0.310880848 0.272020713 0.272020708
And the result is correct.
In fact, we don't have to exploit Eigen decomposition. We can adjust the method used in your second linked question. Over there, we took matrix power which is expensive as you commented; but why not re-cast it into a matrix-vector multiplication?
stydis2 <- function (A) {
n <- dim(A)[1L]
## checking
if (any(.rowSums(A, n, n) != 1))
stop (" 'A' is not a Markov matrix")
## direct computation
b <- A[1, ]
oldnorm <- sqrt(c(crossprod(b)))
repeat {
b <- crossprod(A, b)
newnorm <- sqrt(c(crossprod(b)))
if (abs(newnorm / oldnorm - 1) < 1e-8) break
oldnorm <- newnorm
}
## return stationary distribution
c(b)
}
stydis2 (P)
# [1] 0.002590673 0.025906737 0.116580322 0.310880848 0.272020713 0.272020708
We start from an arbitrary initial distribution, say A[1, ]
, and iteratively apply transition matrix until the distribution converges. Again, the result is correct.
Your vector y = Re(eigen(P)$vectors[, 1])
is not a distribution (since it doesn't add up to one) and solves P'y = y
, not x'P = x
. The solution from your linked Q&A does approximately solve the latter:
x = c(0.00259067357512953, 0.0259067357512953, 0.116580310880829,
0.310880829015544, 0.272020725388601, 0.272020725388601)
all(abs(x %*% P - x) < 1e-10) # TRUE
By transposing P, you can use your eigenvalue approach:
x2 = Re(eigen(t(P))$vectors[, 1])
x2 <- x2/sum(x2)
(function(x) all(abs(x %*% P - x) < 1e-10))(
x2
) # TRUE
It's finding a different stationary vector in this instance, though.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With