I am trying to simulate a t-copula using Python, but my code yields strange results (is not well-behaving):
I followed the approach suggested by Demarta & McNeil (2004) in "The t Copula and Related Copulas", which states:
By intuition, I know that the higher the degrees of freedom parameter, the more the t copula should resemble the Gaussian one (and hence the lower the tail dependency). However, given that I sample from scipy.stats.invgamma.rvs
or alternatively from scipy.stats.chi2.rvs
, yields higher values for my parameter s
the higher my parameter df
. This does not made any sense, as I found multiple papers stating that for df
--> inf
, t-copula --> Gaussian copula.
Here is my code, what am I doing wrong? (I'm a beginner in Python fyi).
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import invgamma, chi2, t
#Define number of sampling points
n_samples = 1000
df = 10
calib_correl_matrix = np.array([[1,0.8,],[0.8,1]]) #I just took a bivariate correlation matrix here
mu = np.zeros(len(calib_correl_matrix))
s = chi2.rvs(df)
#s = invgamma.pdf(df/2,df/2)
Z = np.random.multivariate_normal(mu, calib_correl_matrix,n_samples)
X = np.sqrt(df/s)*Z #chi-square method
#X = np.sqrt(s)*Z #inverse gamma method
U = t.cdf(X,df)
My outcomes behave exactly oppisite to what I am (should be) expecting:
Higher df
create much higher tail-dependency, here also visually:
U_pd = pd.DataFrame(U)
fig = plt.gcf()
fig.set_size_inches(14.5, 10.5)
pd.plotting.scatter_matrix(U_pd, figsize=(14,10), diagonal = 'kde')
plt.show()
df=4
:
df=100
:
It gets especially worse when using the invgamma.rvs
directly, even though they should yield the same. For dfs>=30 I often receive a ValueError ("ValueError: array must not contain infs or NaNs")
Thank you very much for your help, much appreciated!
The simplest copula is the uniform density for independent draws, i.e., c(u,v) = 1, C(u,v) = uv. Two other simple copulas are M(u,v) = min(u,v) and W(u,v) = (u+v–1)+, where the “+” means “zero if negative.” A standard result, given for instance by Wang[8], is that for any copula 3 Page 4 C, W(u,v) ≤ C(u,v) ≤ M(u,v).
The t copula is the copula that underlies the multivariate Student's t distribution. Copula name. t copula. Common notation. Parameters.
In probability theory and statistics, a copula is a multivariate cumulative distribution function for which the marginal probability distribution of each variable is uniform on the interval [0, 1]. Copulas are used to describe/model the dependence (inter-correlation) between random variables.
The Gaussian (or normal) copula is the copula of the multivariate normal distribution which is defined by the following: (8) where is a joint distribution of a multi-dimensional standard normal distribution, with linear correlation coefficient , being the standard normal distribution function.
There is one obvious problem in your code. Namely, this:
s = chi2.rvs(df)
Has to be changed to something like that:
s = chi2.rvs(df, size=n_samples)[:, np.newaxis]
Otherwise the variable s
is just a single constant and your X
ends up being a sample from the multivariate normal (scaled by np.sqrt(df/s)
), rather than the t-distrubution that you need.
You most probably obtained your "tail-heavy" charts simply because you were unlucky and your sampled value of s
ended up being too small. This has nothing to do with df
, though, yet it seems that it is easier to hit the "unlucky" values when df
is smaller.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With