Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why do my numbers not match, multivariate t distribution in R mvtnorm

I was trying to program the algorithm for the cdf for the multivariate t-distribution following Genz and Bretz, The reference package in R is mvtnorm.

When I was testing my function, I found that my numbers don't match up. In the following example, adjusted from the mvtnorm help, the multivariate t random variable has independent components. So the integral should be just the product of 3 independent probabilities

> lower <- -1
> upper <- 3
> df <- 4
> corr <- diag(3)
> delta <- rep(0, 3)
> pmvt(lower=lower, upper=upper, delta=delta, df=df, corr=corr)
[1] 0.5300413
attr(,"error")
[1] 4.321136e-05
attr(,"msg")
[1] "Normal Completion"

The reported error is 4e-5, the error compared to the product of independent probabilities

> (pt(upper, df) - pt(lower, df))**3
[1] 0.4988254

is

0.5300413 - 0.4988254 = 0.0312159

I'm getting discrepancies in my own code compared to R mvtnorm for various examples in about the same range.

I'm mostly a beginner in R. So, what am I doing wrong or what is wrong?

(I'm not signed up on a R-help mailing list, so I try here.)

UPDATE: As pchalasani explained, my statistics was wrong, the bug in my own code was in some helper function not in the t distribution code. A good way of seeing that being uncorrelated does not imply independence, is looking at the conditional distribution. Here are the column frequencies %*100 for independent a bivariate random variable (10000 samples) for quartiles (distribution conditional on column variable).

bivariate uncorrelated normal variates

([[26, 25, 24, 23],
  [24, 23, 24, 25],
  [24, 27, 24, 24],
  [24, 23, 26, 25]])

bivariate uncorrelated t variates

([[29, 20, 22, 29],
  [20, 31, 28, 21],
  [20, 29, 29, 20],
  [29, 18, 18, 29]])

The distribution in the first and last column is very different from the middle columns. (Sorry, no R code, since I don't know how to do this quickly with R.)

like image 872
Josef Avatar asked Jan 01 '11 23:01

Josef


1 Answers

Zero Correlation does not imply independence, for jointly non-Gaussian distributed random variables!

Let me elaborate: there is no bug here. The flaw lies in your assumption that when the multivariate Student-t random variables are uncorrelated, they are also independent, which is definitely not the case: the only class of multiavariate distributions where no correlation implies independence, is the MV Gaussian distribution.

To see that two uncorrelated random variables that jointly follow a MV Student-T distribution are not independent, consider the case of n=2:

require(mvtnorm)
x <- rmvt(100000, sigma = diag(2), df=4, delta = rep(0,2) )

Now each column of x represents realizations of the two random variables. We first check that their correlation is fairly small:

> cor(x[,1], x[,2])
[1] -0.003378811

However the correlation of the squares of x[,1] and x[,2] is as high as 30.4%, i.e., definitely not zero, proving that x[,1] and x[,2] are not statistically independent:

> cor(x[,1]^2, x[,2]^2)
[1] 0.3042684
like image 96
Prasad Chalasani Avatar answered Nov 10 '22 04:11

Prasad Chalasani