Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Confidence regions of 1sigma for a 2D plot

I have two variables that I have plotted using matplotlib scatter function. enter image description here

I would like to show the 68% confidence region by highlighting it in the plot. I know to show it in a histogram, but I don't know how to do it for a 2D plot like this (x vs y). In my case, the x is Mass and y is Ngal Mstar+2.

An example image of what I am looking for looks like this:

Here they have showed the 68% confidence region using dark blue and 95% confidence region using light blue.

Can it be achieved using one of thescipy.stats modules?

enter image description here

like image 446
Srivatsan Avatar asked Sep 23 '14 11:09

Srivatsan


1 Answers

To plot a region between two curves, you could use pyplot.fill_between().

As for your confidence region, I was not sure what you wanted to achieve, so I exemplified with simultaneous confidence bands, by modifying the code from:

https://en.wikipedia.org/wiki/Confidence_and_prediction_bands#cite_note-2

import numpy as np
import matplotlib.pyplot as plt
import scipy.special as sp

## Sample size.
n = 50

## Predictor values.
XV = np.random.uniform(low=-4, high=4, size=n)
XV.sort()

## Design matrix.
X = np.ones((n,2))
X[:,1] = XV

## True coefficients.
beta = np.array([0, 1.], dtype=np.float64)

## True response values.
EY = np.dot(X, beta)

## Observed response values.
Y = EY + np.random.normal(size=n)*np.sqrt(20)

## Get the coefficient estimates.
u,s,vt = np.linalg.svd(X,0)
v = np.transpose(vt)
bhat = np.dot(v, np.dot(np.transpose(u), Y)/s)

## The fitted values.
Yhat = np.dot(X, bhat)

## The MSE and RMSE.
MSE = ((Y-EY)**2).sum()/(n-X.shape[1])
s = np.sqrt(MSE)

## These multipliers are used in constructing the intervals.
XtX = np.dot(np.transpose(X), X)
V = [np.dot(X[i,:], np.linalg.solve(XtX, X[i,:])) for i in range(n)]
V = np.array(V)

## The F quantile used in constructing the Scheffe interval.
QF = sp.fdtri(X.shape[1], n-X.shape[1], 0.95)
QF_2 = sp.fdtri(X.shape[1], n-X.shape[1], 0.68)

## The lower and upper bounds of the Scheffe band.
D = s*np.sqrt(X.shape[1]*QF*V)
LB,UB = Yhat-D,Yhat+D
D_2 = s*np.sqrt(X.shape[1]*QF_2*V)
LB_2,UB_2 = Yhat-D_2,Yhat+D_2


## Make the plot.
plt.clf()
plt.plot(XV, Y, 'o', ms=3, color='grey')
plt.hold(True)
a = plt.plot(XV, EY, '-', color='black', zorder = 4)

plt.fill_between(XV, LB_2, UB_2, where = UB_2 >= LB_2, facecolor='blue', alpha= 0.3, zorder = 0)
b = plt.plot(XV, LB_2, '-', color='blue', zorder=1)
plt.plot(XV, UB_2, '-', color='blue', zorder=1)

plt.fill_between(XV, LB, UB, where = UB >= LB, facecolor='blue', alpha= 0.3, zorder = 2)
b = plt.plot(XV, LB, '-', color='blue', zorder=3)
plt.plot(XV, UB, '-', color='blue', zorder=3)

d = plt.plot(XV, Yhat, '-', color='red',zorder=4)

plt.ylim([-8,8])
plt.xlim([-4,4])

plt.xlabel("X")
plt.ylabel("Y")

plt.show()

The output looks like this: enter image description here

like image 88
snake_charmer Avatar answered Nov 02 '22 12:11

snake_charmer