Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

multidimensional confidence intervals [closed]

Tags:

I have numerous tuples (par1,par2), i.e. points in a 2 dimensional parameter space obtained from repeating an experiment multiple times.

I'm looking for a possibility to calculate and visualize confidence ellipses (not sure if thats the correct term for this). Here an example plot that I found in the web to show what I mean:

enter image description here

source: blogspot.ch/2011/07/classification-and-discrimination-with.html

So in principle one has to fit a multivariate normal distribution to a 2D histogram of data points I guess. Can somebody help me with this?

like image 712
Raphael Roth Avatar asked Sep 06 '12 13:09

Raphael Roth


People also ask

Are Confidence Intervals closed?

The short answer is "Yes".

What are the limitations of confidence intervals?

Confidence limits are the numbers at the upper and lower end of a confidence interval; for example, if your mean is 7.4 with confidence limits of 5.4 and 9.4, your confidence interval is 5.4 to 9.4. Most people use 95% confidence limits, although you could use other values.

What is the purpose of a confidence region?

Key Takeaways. A confidence interval displays the probability that a parameter will fall between a pair of values around the mean. Confidence intervals measure the degree of uncertainty or certainty in a sampling method. They are also used in hypothesis testing and regression analysis.

What are the 3 confidence intervals?

There are three factors that determine the size of the confidence interval for a given confidence level. These are: sample size, percentage and population size.


2 Answers

It sounds like you just want the 2-sigma ellipse of the scatter of points?

If so, consider something like this (From some code for a paper here: https://github.com/joferkington/oost_paper_code/blob/master/error_ellipse.py):

import numpy as np

import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse

def plot_point_cov(points, nstd=2, ax=None, **kwargs):
    """
    Plots an `nstd` sigma ellipse based on the mean and covariance of a point
    "cloud" (points, an Nx2 array).

    Parameters
    ----------
        points : An Nx2 array of the data points.
        nstd : The radius of the ellipse in numbers of standard deviations.
            Defaults to 2 standard deviations.
        ax : The axis that the ellipse will be plotted on. Defaults to the 
            current axis.
        Additional keyword arguments are pass on to the ellipse patch.

    Returns
    -------
        A matplotlib ellipse artist
    """
    pos = points.mean(axis=0)
    cov = np.cov(points, rowvar=False)
    return plot_cov_ellipse(cov, pos, nstd, ax, **kwargs)

def plot_cov_ellipse(cov, pos, nstd=2, ax=None, **kwargs):
    """
    Plots an `nstd` sigma error ellipse based on the specified covariance
    matrix (`cov`). Additional keyword arguments are passed on to the 
    ellipse patch artist.

    Parameters
    ----------
        cov : The 2x2 covariance matrix to base the ellipse on
        pos : The location of the center of the ellipse. Expects a 2-element
            sequence of [x0, y0].
        nstd : The radius of the ellipse in numbers of standard deviations.
            Defaults to 2 standard deviations.
        ax : The axis that the ellipse will be plotted on. Defaults to the 
            current axis.
        Additional keyword arguments are pass on to the ellipse patch.

    Returns
    -------
        A matplotlib ellipse artist
    """
    def eigsorted(cov):
        vals, vecs = np.linalg.eigh(cov)
        order = vals.argsort()[::-1]
        return vals[order], vecs[:,order]

    if ax is None:
        ax = plt.gca()

    vals, vecs = eigsorted(cov)
    theta = np.degrees(np.arctan2(*vecs[:,0][::-1]))

    # Width and height are "full" widths, not radius
    width, height = 2 * nstd * np.sqrt(vals)
    ellip = Ellipse(xy=pos, width=width, height=height, angle=theta, **kwargs)

    ax.add_artist(ellip)
    return ellip

if __name__ == '__main__':
    #-- Example usage -----------------------
    # Generate some random, correlated data
    points = np.random.multivariate_normal(
            mean=(1,1), cov=[[0.4, 9],[9, 10]], size=1000
            )
    # Plot the raw points...
    x, y = points.T
    plt.plot(x, y, 'ro')

    # Plot a transparent 3 standard deviation covariance ellipse
    plot_point_cov(points, nstd=3, alpha=0.5, color='green')

    plt.show()

enter image description here

like image 92
Joe Kington Avatar answered Sep 24 '22 12:09

Joe Kington


Refer the post How to draw a covariance error ellipse.

Here's the python realization:

import numpy as np
from scipy.stats import norm, chi2

def cov_ellipse(cov, q=None, nsig=None, **kwargs):
    """
    Parameters
    ----------
    cov : (2, 2) array
        Covariance matrix.
    q : float, optional
        Confidence level, should be in (0, 1)
    nsig : int, optional
        Confidence level in unit of standard deviations. 
        E.g. 1 stands for 68.3% and 2 stands for 95.4%.

    Returns
    -------
    width, height, rotation :
         The lengths of two axises and the rotation angle in degree
    for the ellipse.
    """

    if q is not None:
        q = np.asarray(q)
    elif nsig is not None:
        q = 2 * norm.cdf(nsig) - 1
    else:
        raise ValueError('One of `q` and `nsig` should be specified.')
    r2 = chi2.ppf(q, 2)

    val, vec = np.linalg.eigh(cov)
    width, height = 2 * sqrt(val[:, None] * r2)
    rotation = np.degrees(arctan2(*vec[::-1, 0]))

    return width, height, rotation

The meaning of standard deviation is wrong in the answer of Joe Kington. Usually we use 1, 2 sigma for 68%, 95% confidence levels, but the 2 sigma ellipse in his answer does not contain 95% probability of the total distribution. The correct way is using a chi square distribution to esimate the ellipse size as shown in the post.

like image 20
Syrtis Major Avatar answered Sep 25 '22 12:09

Syrtis Major