
Boot package in R simple assistance

If I want to use the boot() function from R's boot package to assess the significance of the Pearson correlation coefficient between two vectors, should I do it like this:

boot(re1, cor, R = 1000)

where re1 is a two-column matrix holding the two observation vectors? I can't seem to get this right: cor of these vectors is 0.8, but the call above returns -0.2 as t0.

Fedja Blagojevic asked Oct 20 '11 10:10




1 Answer

Just to emphasize the general idea of bootstrapping in R, although @caracal already answered your question in his comment: when using boot(), you need a data structure (usually a matrix or data frame) that can be resampled by row. The statistic is computed by a function that receives this data and a vector of resampled row indices, and returns the statistic of interest for the resampled rows. You then call boot(), which applies this function to R bootstrap replicates and collects the results in a structured object. Those results can in turn be assessed with boot.ci().
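Applied to the call in the question, this explains the unexpected t0. boot() invokes the statistic as statistic(data, indices), so boot(re1, cor, R = 1000) ends up evaluating cor(re1, indices), which correlates each column with the row indices rather than the two columns with each other. A minimal sketch of a fix follows; re1 here is simulated stand-in data (an assumption, since the original vectors are not shown), and cor.stat is a hypothetical name chosen for illustration.

```r
library(boot)

set.seed(42)                        # arbitrary seed, only for reproducibility
x <- rnorm(100)
y <- 0.8 * x + rnorm(100, sd = 0.6)
re1 <- cbind(x, y)                  # simulated stand-in for the asker's matrix

# The statistic must accept the data and a vector of resampled row indices
cor.stat <- function(data, k) cor(data[k, 1], data[k, 2])

res <- boot(data = re1, statistic = cor.stat, R = 1000)
res$t0                              # statistic on the original rows, i.e. cor(x, y)
boot.ci(res, type = "perc")         # percentile confidence interval
```

With a wrapper like this, t0 should agree with the value reported by cor() on the original two columns.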

Here are two working examples using the low birth weight study (the birthwt data set) from the MASS package.

library(boot)   # provides boot() and boot.ci()
library(MASS)   # provides the birthwt data set
data(birthwt)

# compute CIs for the correlation between mother's weight and birth weight
# k holds the row indices of one bootstrap resample
cor.boot <- function(data, k) cor(data[k, ])[1, 2]
cor.res <- boot(data = with(birthwt, cbind(lwt, bwt)),
                statistic = cor.boot, R = 500)
cor.res
boot.ci(cor.res, type = "bca")

# compute a CI for a particular regression coefficient, e.g. bwt ~ smoke + ht
fm <- bwt ~ smoke + ht
# extra arguments to boot() (here formula) are passed on to the statistic
reg.boot <- function(formula, data, k) coef(lm(formula, data[k, ]))
reg.res <- boot(data = birthwt, statistic = reg.boot,
                R = 500, formula = fm)
boot.ci(reg.res, type = "bca", index = 2)  # index = 2 selects the smoke coefficient
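The object returned by boot() is a list, so the pieces that boot.ci() works from can also be inspected directly. The following is a minimal sketch rather than part of the examples above: it repeats the correlation bootstrap with a fixed seed (the seed value is an arbitrary assumption) and reads off the observed statistic t0, the replicates t, and a bootstrap standard error. The name xy is introduced here only for illustration.

```r
library(boot)   # boot(), boot.ci()
library(MASS)   # birthwt data set

set.seed(123)   # arbitrary seed, only for reproducibility
cor.boot <- function(data, k) cor(data[k, ])[1, 2]
xy <- with(birthwt, cbind(lwt, bwt))
cor.res <- boot(data = xy, statistic = cor.boot, R = 500)

cor.res$t0                        # statistic on the original data
dim(cor.res$t)                    # one row per replicate, one column per statistic
sd(cor.res$t[, 1])                # bootstrap standard error of the correlation
boot.ci(cor.res, type = "perc")   # percentile interval from the replicates
```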
chl answered Sep 30 '22 00:09