If I want to use the boot() function from R's boot package to calculate the significance of the Pearson correlation coefficient between two vectors, should I do it like this:
boot(re1, cor, R = 1000)
where re1 is a two-column matrix holding these two observation vectors? I can't seem to get this right, because cor of these vectors is 0.8, but the above function returns -0.2 as t0.
The R package boot allows a user to easily generate bootstrap samples of virtually any statistic that they can calculate in R. From these samples, you can generate estimates of bias, bootstrap confidence intervals, or plots of your bootstrap replicates.
Bootstrapping is a nonparametric method that lets us estimate standard errors, construct confidence intervals, and carry out hypothesis tests. It generally follows the same basic steps: resample a given data set a specified number of times, then calculate a specific statistic from each sample.
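The two steps described here can be sketched by hand in base R, without the boot package. This is only an illustration: the data are simulated, and the names dat, n_rep and boot_cor are made up for the example.

```r
# Manual bootstrap of a Pearson correlation (simulated, illustrative data)
set.seed(1)                          # for reproducibility
n <- 50
x <- rnorm(n)
y <- 0.8 * x + rnorm(n, sd = 0.6)    # y is built to correlate with x
dat <- cbind(x, y)                   # one observation per row

n_rep <- 1000                        # number of bootstrap resamples

boot_cor <- replicate(n_rep, {
  idx <- sample(n, replace = TRUE)   # step 1: resample rows with replacement
  cor(dat[idx, 1], dat[idx, 2])      # step 2: compute the statistic
})

se_hat <- sd(boot_cor)                         # bootstrap standard error
ci_pct <- quantile(boot_cor, c(0.025, 0.975))  # percentile 95% interval

se_hat
ci_pct
```

Rows are resampled as whole units so that each x stays paired with its y; resampling the two columns independently would destroy the correlation being estimated.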
Just to emphasize the general idea behind bootstrapping in R (although @caracal already answered your question in his comment): when using boot, you need a data structure (usually a matrix) that can be sampled by row. Your statistic is usually computed in a function that receives this data matrix along with a vector of resampled row indices, and returns the statistic of interest computed on the resampled rows. You then call boot(), which takes care of applying this function to R replicates and collecting the results in a structured format. Those results can in turn be assessed using boot.ci().
Here are two working examples with the low birth weight study (birthwt) in the MASS package.
library(boot)  # provides boot() and boot.ci()
library(MASS)  # provides the birthwt data set
data(birthwt)
# compute CIs for correlation between mother's weight and birth weight
cor.boot <- function(data, k) cor(data[k,])[1,2]
cor.res <- boot(data=with(birthwt, cbind(lwt, bwt)),
                statistic=cor.boot, R=500)
cor.res
boot.ci(cor.res, type="bca")
# compute CI for a particular regression coefficient, e.g. bwt ~ smoke + ht
fm <- bwt ~ smoke + ht
reg.boot <- function(formula, data, k) coef(lm(formula, data[k,]))
reg.res <- boot(data=birthwt, statistic=reg.boot,
                R=500, formula=fm)
boot.ci(reg.res, type="bca", index=2) # smoke
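Applied to the situation in the question, a minimal sketch might look like the following. The matrix re1 here is simulated as a stand-in for the asker's data, and cor_stat is a hypothetical helper name. Reading an interval that excludes 0 as evidence of a nonzero correlation is one informal interpretation of "significance", assumed here rather than taken from the thread.

```r
library(boot)

set.seed(42)
# Simulated stand-in for re1: two correlated columns (assumption, not real data)
n <- 100
v1 <- rnorm(n)
v2 <- 0.8 * v1 + rnorm(n, sd = 0.6)
re1 <- cbind(v1, v2)

# boot() calls statistic(data, indices), so cor() needs a small wrapper
cor_stat <- function(data, idx) cor(data[idx, 1], data[idx, 2])

res <- boot(data = re1, statistic = cor_stat, R = 1000)

res$t0                       # matches cor(re1[, 1], re1[, 2])
boot.ci(res, type = "perc")  # percentile interval for the correlation

# For contrast: passing cor itself makes boot() evaluate cor(re1, 1:n),
# i.e. each column correlated with the row index, not with the other column
cor(re1, seq_len(n))
```

This would explain why t0 in the question does not match the plain correlation: with statistic = cor, the second argument that boot() supplies is the index vector, which cor() treats as a second variable.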