Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Simulation of t copula in Python

I am trying to simulate a t-copula using Python, but my code yields strange results (is not well-behaving):

I followed the approach suggested by Demarta & McNeil (2004) in "The t Copula and Related Copulas", which states:

t copula simulation

By intuition, I know that the higher the degrees of freedom parameter, the more the t copula should resemble the Gaussian one (and hence the lower the tail dependency). However, given that I sample from scipy.stats.invgamma.rvs or alternatively from scipy.stats.chi2.rvs, yields higher values for my parameter s the higher my parameter df. This does not made any sense, as I found multiple papers stating that for df--> inf, t-copula --> Gaussian copula.

Here is my code, what am I doing wrong? (I'm a beginner in Python fyi).

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import invgamma, chi2, t

#Define number of sampling points
n_samples = 1000
df = 10

calib_correl_matrix = np.array([[1,0.8,],[0.8,1]]) #I just took a bivariate correlation matrix here
mu = np.zeros(len(calib_correl_matrix))
s = chi2.rvs(df)
#s = invgamma.pdf(df/2,df/2) 
Z = np.random.multivariate_normal(mu, calib_correl_matrix,n_samples)
X = np.sqrt(df/s)*Z #chi-square method
#X = np.sqrt(s)*Z #inverse gamma method

U = t.cdf(X,df)

My outcomes behave exactly oppisite to what I am (should be) expecting: Higher df create much higher tail-dependency, here also visually:

 U_pd = pd.DataFrame(U)
 fig = plt.gcf()
 fig.set_size_inches(14.5, 10.5)
 pd.plotting.scatter_matrix(U_pd, figsize=(14,10), diagonal = 'kde')
 plt.show()

df=4: scatter_plot

df=100: enter image description here

It gets especially worse when using the invgamma.rvs directly, even though they should yield the same. For dfs>=30 I often receive a ValueError ("ValueError: array must not contain infs or NaNs")

Thank you very much for your help, much appreciated!

like image 889
rhonsprudel Avatar asked Jul 26 '18 10:07

rhonsprudel


People also ask

How do you calculate copulas?

The simplest copula is the uniform density for independent draws, i.e., c(u,v) = 1, C(u,v) = uv. Two other simple copulas are M(u,v) = min(u,v) and W(u,v) = (u+v–1)+, where the “+” means “zero if negative.” A standard result, given for instance by Wang[8], is that for any copula 3 Page 4 C, W(u,v) ≤ C(u,v) ≤ M(u,v).

What is Student t copula?

The t copula is the copula that underlies the multivariate Student's t distribution. Copula name. t copula. Common notation. Parameters.

What is a copula distribution?

In probability theory and statistics, a copula is a multivariate cumulative distribution function for which the marginal probability distribution of each variable is uniform on the interval [0, 1]. Copulas are used to describe/model the dependence (inter-correlation) between random variables.

What is a normal copula?

The Gaussian (or normal) copula is the copula of the multivariate normal distribution which is defined by the following: (8) where is a joint distribution of a multi-dimensional standard normal distribution, with linear correlation coefficient , being the standard normal distribution function.


1 Answers

There is one obvious problem in your code. Namely, this:

s = chi2.rvs(df)

Has to be changed to something like that:

s = chi2.rvs(df, size=n_samples)[:, np.newaxis]

Otherwise the variable s is just a single constant and your X ends up being a sample from the multivariate normal (scaled by np.sqrt(df/s)), rather than the t-distrubution that you need.

You most probably obtained your "tail-heavy" charts simply because you were unlucky and your sampled value of s ended up being too small. This has nothing to do with df, though, yet it seems that it is easier to hit the "unlucky" values when df is smaller.

like image 179
KT. Avatar answered Sep 27 '22 21:09

KT.