
Multivariate Normal CDF in Python using scipy

In order to calculate the CDF of a multivariate normal, I followed this example (for the univariate case) but cannot interpret the output produced by scipy:

from scipy.stats import norm
import numpy as np
mean = np.array([1,5])
covariance = np.matrix([[1, 0.3 ],[0.3, 1]])
distribution = norm(loc=mean,scale = covariance)
print(distribution.cdf(np.array([2,4])))

The output produced is:

[[  8.41344746e-01   4.29060333e-04]
 [  9.99570940e-01   1.58655254e-01]]
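One possible reading of this array, offered as a sketch rather than a statement about SciPy internals: `norm` is a univariate distribution, so `scale` is treated as a set of per-element standard deviations and broadcast against `loc` and the evaluation point. The names `scale`, `z` and `elementwise` below are illustrative only.

```python
import numpy as np
from scipy.stats import norm

mean = np.array([1, 5])
# Interpreted by norm as standard deviations, not as a covariance matrix
scale = np.array([[1, 0.3], [0.3, 1]])
x = np.array([2, 4])

# Broadcasting: entry (i, j) is the univariate standard normal CDF
# evaluated at (x[j] - mean[j]) / scale[i, j]
z = (x - mean) / scale
elementwise = norm.cdf(z)
print(elementwise)
```

Each entry is therefore a separate one-dimensional probability, which is why the result is a 2x2 array instead of a single joint probability.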

If the joint CDF is defined as:

P(X1 ≤ x1, ..., Xn ≤ xn)

then I would expect the output to be a single real number between 0 and 1, not a 2x2 array.
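To make that definition concrete, here is a minimal Monte Carlo sketch using only NumPy: draw samples from the bivariate normal with the mean and covariance above and count the fraction for which every component is at or below the evaluation point. The seed, the sample size and the name `p_hat` are arbitrary choices for illustration, and the result is only an estimate.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
mean = np.array([1.0, 5.0])
cov = np.array([[1.0, 0.3], [0.3, 1.0]])
x = np.array([2.0, 4.0])

# Draw samples from the joint distribution
samples = rng.multivariate_normal(mean, cov, size=200_000)

# Fraction of draws with X1 <= x1 AND X2 <= x2: an estimate of the joint CDF
p_hat = np.mean(np.all(samples <= x, axis=1))
print(p_hat)
```

The estimate is a single number between 0 and 1, which is the kind of output the joint CDF should give.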

statBeginner asked May 31 '15 17:05




1 Answer

After a lot of searching, I think this blog entry by Noah H. Silbert describes the only ready-made code from a standard library that can be used to compute the CDF of a multivariate normal in Python. SciPy has a way to do it but, as mentioned in the blog, it is difficult to find. The approach is based on a paper by Alan Genz.

From the blog, this is how it works:

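from scipy.stats import mvn
import numpy as np
# Integration bounds: a very low lower bound stands in for minus infinity
low = np.array([-10, -10])
upp = np.array([.1, -.2])
# Mean vector and covariance matrix of the bivariate normal
mu = np.array([-.3, .17])
S = np.array([[1.2,.35],[.35,2.1]])
# mvnun returns the probability over the box [low, upp] and a status flag
p,i = mvn.mvnun(low,upp,mu,S)
print(p)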

0.2881578675080012
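As a hedged alternative, assuming a SciPy release that provides a `cdf` method on `scipy.stats.multivariate_normal`, the same kind of probability can be requested without the private-looking `mvn` module. The sketch below reuses the numbers from this answer and from the question; `p_answer` and `p_question` are illustrative names. Because `cdf` integrates from minus infinity instead of from -10, the first value should only agree approximately with the one printed above.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Parameters from the answer's example
mu = np.array([-0.3, 0.17])
S = np.array([[1.2, 0.35], [0.35, 2.1]])
upp = np.array([0.1, -0.2])

# Joint CDF P(X1 <= 0.1, X2 <= -0.2); the lower bound is implicitly -infinity
p_answer = multivariate_normal(mean=mu, cov=S).cdf(upp)

# The same call applied to the mean, covariance and point from the question
p_question = multivariate_normal(mean=[1, 5], cov=[[1, 0.3], [0.3, 1]]).cdf([2, 4])

print(p_answer, p_question)
```

Both results are scalars, and the numerical integration behind them is approximate, so small differences between runs or versions are to be expected.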
statBeginner answered Oct 07 '22 21:10