I have numerous tuples (par1,par2), i.e. points in a 2 dimensional parameter space obtained from repeating an experiment multiple times.
I'm looking for a possibility to calculate and visualize confidence ellipses (not sure if thats the correct term for this). Here an example plot that I found in the web to show what I mean:
source: blogspot.ch/2011/07/classification-and-discrimination-with.html
So in principle one has to fit a multivariate normal distribution to a 2D histogram of data points I guess. Can somebody help me with this?
The short answer is "Yes".
Confidence limits are the numbers at the upper and lower end of a confidence interval; for example, if your mean is 7.4 with confidence limits of 5.4 and 9.4, your confidence interval is 5.4 to 9.4. Most people use 95% confidence limits, although you could use other values.
Key Takeaways. A confidence interval displays the probability that a parameter will fall between a pair of values around the mean. Confidence intervals measure the degree of uncertainty or certainty in a sampling method. They are also used in hypothesis testing and regression analysis.
There are three factors that determine the size of the confidence interval for a given confidence level. These are: sample size, percentage and population size.
It sounds like you just want the 2-sigma ellipse of the scatter of points?
If so, consider something like this (From some code for a paper here: https://github.com/joferkington/oost_paper_code/blob/master/error_ellipse.py):
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse
def plot_point_cov(points, nstd=2, ax=None, **kwargs):
"""
Plots an `nstd` sigma ellipse based on the mean and covariance of a point
"cloud" (points, an Nx2 array).
Parameters
----------
points : An Nx2 array of the data points.
nstd : The radius of the ellipse in numbers of standard deviations.
Defaults to 2 standard deviations.
ax : The axis that the ellipse will be plotted on. Defaults to the
current axis.
Additional keyword arguments are pass on to the ellipse patch.
Returns
-------
A matplotlib ellipse artist
"""
pos = points.mean(axis=0)
cov = np.cov(points, rowvar=False)
return plot_cov_ellipse(cov, pos, nstd, ax, **kwargs)
def plot_cov_ellipse(cov, pos, nstd=2, ax=None, **kwargs):
"""
Plots an `nstd` sigma error ellipse based on the specified covariance
matrix (`cov`). Additional keyword arguments are passed on to the
ellipse patch artist.
Parameters
----------
cov : The 2x2 covariance matrix to base the ellipse on
pos : The location of the center of the ellipse. Expects a 2-element
sequence of [x0, y0].
nstd : The radius of the ellipse in numbers of standard deviations.
Defaults to 2 standard deviations.
ax : The axis that the ellipse will be plotted on. Defaults to the
current axis.
Additional keyword arguments are pass on to the ellipse patch.
Returns
-------
A matplotlib ellipse artist
"""
def eigsorted(cov):
vals, vecs = np.linalg.eigh(cov)
order = vals.argsort()[::-1]
return vals[order], vecs[:,order]
if ax is None:
ax = plt.gca()
vals, vecs = eigsorted(cov)
theta = np.degrees(np.arctan2(*vecs[:,0][::-1]))
# Width and height are "full" widths, not radius
width, height = 2 * nstd * np.sqrt(vals)
ellip = Ellipse(xy=pos, width=width, height=height, angle=theta, **kwargs)
ax.add_artist(ellip)
return ellip
if __name__ == '__main__':
#-- Example usage -----------------------
# Generate some random, correlated data
points = np.random.multivariate_normal(
mean=(1,1), cov=[[0.4, 9],[9, 10]], size=1000
)
# Plot the raw points...
x, y = points.T
plt.plot(x, y, 'ro')
# Plot a transparent 3 standard deviation covariance ellipse
plot_point_cov(points, nstd=3, alpha=0.5, color='green')
plt.show()
Refer the post How to draw a covariance error ellipse.
Here's the python realization:
import numpy as np
from scipy.stats import norm, chi2
def cov_ellipse(cov, q=None, nsig=None, **kwargs):
"""
Parameters
----------
cov : (2, 2) array
Covariance matrix.
q : float, optional
Confidence level, should be in (0, 1)
nsig : int, optional
Confidence level in unit of standard deviations.
E.g. 1 stands for 68.3% and 2 stands for 95.4%.
Returns
-------
width, height, rotation :
The lengths of two axises and the rotation angle in degree
for the ellipse.
"""
if q is not None:
q = np.asarray(q)
elif nsig is not None:
q = 2 * norm.cdf(nsig) - 1
else:
raise ValueError('One of `q` and `nsig` should be specified.')
r2 = chi2.ppf(q, 2)
val, vec = np.linalg.eigh(cov)
width, height = 2 * sqrt(val[:, None] * r2)
rotation = np.degrees(arctan2(*vec[::-1, 0]))
return width, height, rotation
The meaning of standard deviation is wrong in the answer of Joe Kington. Usually we use 1, 2 sigma for 68%, 95% confidence levels, but the 2 sigma ellipse in his answer does not contain 95% probability of the total distribution. The correct way is using a chi square distribution to esimate the ellipse size as shown in the post.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With