 

How to calculate 95% confidence intervals using Bootstrap method

I'm trying to calculate the confidence interval for the mean using the bootstrap method in Python. Let's say I have a vector a with 100 entries, and my aim is to calculate the mean of these 100 values and its 95% confidence interval using the bootstrap. So far I have managed to resample 1000 times from my vector using the np.random.choice function. Then, for each bootstrap vector with 100 entries, I calculated the mean. So now I have 1000 bootstrap mean values and a single sample mean from my initial vector, but I'm not sure how to proceed from here. How can I use these mean values to find the confidence interval for the mean of my initial vector? I'm relatively new to Python and this is the first time I have come across the bootstrap method, so any help would be much appreciated.

Andriana asked Nov 08 '16 15:11


People also ask

What percentile would you use for a bootstrap 95% confidence interval?

The percentile bootstrap interval is just the interval between the 100×(α/2) and 100×(1−α/2) percentiles of the distribution of θ estimates obtained from resampling, where θ represents a parameter of interest and α is the level of significance (e.g., α = 0.05 for 95% CIs) (Efron, 1982).

How do I calculate 95% confidence interval?

A C% confidence interval with the Normal approximation is x̄ ± z·s/√n, where x̄ is the sample mean, s is the sample standard deviation, n is the sample size, and the value of z is appropriate for the confidence level. For a 95% confidence interval, we use z = 1.96, while for a 90% confidence interval, for example, we use z = 1.645.

How do you interpret bootstrap confidence interval?

Let's say you calculated a 95% confidence interval from bootstrapped resamples. The interpretation is: if the whole procedure (draw a sample, bootstrap it, build the interval) were repeated many times, about 95% of the resulting intervals would contain the true population parameter.
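This "repeated procedure" reading can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the data are made-up normal draws with a known true mean of 5.0, the interval is the percentile bootstrap interval, and the trial counts (300 samples, 500 resamples each) are arbitrary choices to keep it fast.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed so the run is reproducible

true_mean = 5.0                 # known only because the data are simulated
n_trials, n, n_boot = 300, 100, 500

hits = 0
for _ in range(n_trials):
    # One fresh "experiment": a new sample of size n from the population.
    sample = rng.normal(loc=true_mean, scale=1.0, size=n)

    # Bootstrap that sample: each row of idx picks n entries with replacement.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = sample[idx].mean(axis=1)

    # 95% percentile interval for this one sample.
    lower, upper = np.percentile(boot_means, [2.5, 97.5])
    if lower <= true_mean <= upper:
        hits += 1

coverage = hits / n_trials
print(f"fraction of intervals containing the true mean: {coverage:.3f}")
```

The printed fraction should land near 0.95, though it will not match it exactly with a finite number of trials.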


2 Answers

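You could sort the array of 1000 means and use the 50th and 950th elements as the 90% bootstrap confidence interval. For the 95% interval you asked about, use the 25th and 975th elements instead, i.e. the 2.5th and 97.5th percentiles.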

Your set of 1000 means is basically a sample of the distribution of the mean estimator (the sampling distribution of the mean). So, any operation you could do on a sample from a distribution you can do here.
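A minimal sketch of this percentile approach for a 95% interval follows. The vector `a` here is made-up data standing in for the asker's 100 values, and `rng.choice` from NumPy's `Generator` is used in place of `np.random.choice`; both resample with replacement.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Stand-in for the asker's vector `a` (100 entries); the values are made up.
a = rng.normal(loc=10.0, scale=2.0, size=100)

# Draw 1000 bootstrap resamples (with replacement) and take the mean of each.
n_boot = 1000
boot_means = np.array([
    rng.choice(a, size=a.size, replace=True).mean()
    for _ in range(n_boot)
])

# Percentile interval: the 2.5th and 97.5th percentiles of the bootstrap means.
# np.percentile sorts internally, so no explicit sort is needed.
ci_lower, ci_upper = np.percentile(boot_means, [2.5, 97.5])

print(f"sample mean = {a.mean():.3f}")
print(f"95% percentile bootstrap CI = ({ci_lower:.3f}, {ci_upper:.3f})")
```

Changing `[2.5, 97.5]` to `[5, 95]` gives the 90% interval instead.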

Horia Coman answered Sep 30 '22 01:09


I have a simple statistical solution: confidence intervals are based on the standard error, which in your case is the standard deviation of your 1000 bootstrap means. Assuming the sampling distribution of your parameter (the mean) is normal, which the Central Limit Theorem should justify here, multiply the z-score for the desired confidence level by that standard deviation, then subtract the result from, and add it to, the mean of your bootstrap means. Therefore:

lower boundary = mean of your bootstrap means - 1.96 * std. dev. of your bootstrap means

upper boundary = mean of your bootstrap means + 1.96 * std. dev. of your bootstrap means

This works because 95% of cases in a normal distribution lie within 1.96 standard deviations of the mean.
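The two boundary formulas above might look like this in NumPy. This is a sketch under assumptions: the vector `a` is made-up data, and the resampling is done with an index array (an alternative to calling np.random.choice in a loop) purely for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the sketch is reproducible

# Stand-in for the asker's vector `a` (100 entries); the values are made up.
a = rng.normal(loc=10.0, scale=2.0, size=100)

# Each row of idx selects 100 entries of `a` with replacement,
# so a[idx] holds 1000 bootstrap resamples, one per row.
idx = rng.integers(0, a.size, size=(1000, a.size))
boot_means = a[idx].mean(axis=1)

# Standard error = standard deviation of the bootstrap means.
se = boot_means.std(ddof=1)
center = boot_means.mean()

z = 1.96  # z-score for a 95% interval under the normal assumption
lower = center - z * se
upper = center + z * se

print(f"bootstrap standard error = {se:.3f}")
print(f"95% normal-approximation CI = ({lower:.3f}, {upper:.3f})")
```

With a roughly symmetric set of bootstrap means, this interval and the percentile interval should be close; they can differ noticeably when the bootstrap distribution is skewed.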

hope this helps

Bogdan Lalu answered Sep 30 '22 01:09