pull out p-values and r-squared from a linear regression

Tags:

r

People also ask

What is the relationship between R-squared and p-value in a regression?

R-squared alone does not determine the p-value: the same R-squared can be significant or not depending on the sample size and the number of predictors. The R-squared value tells you how much of the variation in the response is explained by your model, so an R-squared of 0.1 means that your model explains 10% of the variation in the data.

How do you find p-value from R-squared?

For simple linear regression (one predictor) with n observations, $F = (n-2)R^2/(1-R^2)$. Its p-value is the p-value for the slope. So if you have ever used a p-value for the slope b in simple regression, you have used a p-value for $R^2$.
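A quick way to convince yourself of this identity is to rebuild F from R-squared and compare it with what summary() reports. This is only a sketch: the built-in cars data set and the variable names are my own choices for illustration, not part of the original answer.

```r
# Simple regression on the built-in `cars` data (illustrative choice)
fit <- lm(dist ~ speed, data = cars)
s   <- summary(fit)

n  <- nrow(cars)
r2 <- s$r.squared

# F statistic rebuilt from R-squared alone
f_from_r2 <- (n - 2) * r2 / (1 - r2)

# Upper-tail p-value of F on (1, n - 2) degrees of freedom
p_from_r2 <- pf(f_from_r2, df1 = 1, df2 = n - 2, lower.tail = FALSE)

# p-value that summary() reports for the slope
p_slope <- s$coefficients["speed", "Pr(>|t|)"]

# Both comparisons should agree up to floating-point tolerance
all.equal(f_from_r2, unname(s$fstatistic["value"]))
all.equal(p_from_r2, p_slope)
```

Note that the formula only holds for a single predictor; with more predictors the degrees of freedom change.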

How do you find R-squared from a regression line?

$$R^2 = 1 - \frac{\text{SSR}}{\text{SST}} = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$

Here SSR is the sum of the squared residuals (the squared distances between the data and the fitted values), and SST, the total sum of squares, is the sum of the squared distances between the data and their mean.
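The same calculation can be done by hand in R and checked against summary(). A minimal sketch, assuming a model with an intercept; the cars data set and the names ss_res and ss_tot are just illustrative.

```r
# Fit on the built-in `cars` data (illustrative choice)
fit <- lm(dist ~ speed, data = cars)

y     <- cars$dist
y_hat <- fitted(fit)

ss_res <- sum((y - y_hat)^2)     # sum of squared residuals
ss_tot <- sum((y - mean(y))^2)   # total sum of squares around the mean

r2_manual <- 1 - ss_res / ss_tot

# Should match the value stored in the summary object
all.equal(r2_manual, summary(fit)$r.squared)
```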

r-squared: You can return the r-squared value directly from the summary object summary(fit)$r.squared. See names(summary(fit)) for a list of all the items you can extract directly.
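For example, here is a short sketch of pulling a few of those items out by name. The fit on the built-in cars data is only there to make the snippet self-contained.

```r
# Illustrative fit; any lm object works the same way
fit <- lm(dist ~ speed, data = cars)
s   <- summary(fit)

names(s)          # lists every component stored in the summary object

s$r.squared       # R-squared
s$adj.r.squared   # adjusted R-squared
s$sigma           # residual standard error
```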

Model p-value: If you want to obtain the p-value of the overall regression model, this blog post outlines a function to return the p-value:

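lmp <- function(modelobject) {
    # inherits() also copes with objects that carry more than one class
    if (!inherits(modelobject, "lm")) stop("Not an object of class 'lm'")
    f <- summary(modelobject)$fstatistic  # c(value, numdf, dendf)
    # Upper-tail probability of the F statistic is the overall model p-value
    p <- pf(f[1], f[2], f[3], lower.tail = FALSE)
    attributes(p) <- NULL  # drop the name carried over from f
    return(p)
}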

> lmp(fit)
[1] 1.622665e-05

In the case of a simple regression with one predictor, the model p-value and the p-value for the coefficient will be the same.

Coefficient p-values: If you have more than one predictor, then the above will return the model p-value, and the p-value for coefficients can be extracted using:

summary(fit)$coefficients[,4]
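To make the difference concrete, here is a sketch with two predictors. The mtcars model is my own example, and selecting the column by name instead of by position is just a readability preference.

```r
# Two-predictor model on the built-in `mtcars` data (illustrative choice)
fit2 <- lm(mpg ~ wt + hp, data = mtcars)
s2   <- summary(fit2)

# Coefficient p-values, selected by column name rather than position 4
coef_p <- coef(s2)[, "Pr(>|t|)"]

# Overall model p-value from the F statistic
f       <- s2$fstatistic
model_p <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

coef_p    # one p-value per coefficient, including the intercept
model_p   # a single p-value for the model as a whole
```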

Alternatively, anova(fit) reports a p-value for each term (from sequential F tests), which you can extract in a similar fashion to the summary object above. With a single predictor this matches the coefficient p-value.
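A sketch of the anova() route, again on an mtcars model of my own choosing. As far as I understand, anova() on a single lm fit uses sequential F tests, so only the term entered last is guaranteed to agree with the t-test p-value from summary().

```r
# Two-predictor model on the built-in `mtcars` data (illustrative choice)
fit2 <- lm(mpg ~ wt + hp, data = mtcars)
a    <- anova(fit2)

# Per-term p-values; the Residuals row has no p-value, so drop the NA
term_p <- setNames(a[["Pr(>F)"]], rownames(a))
term_p <- term_p[!is.na(term_p)]

# t-test p-values from the summary object, for comparison
t_p <- coef(summary(fit2))[, "Pr(>|t|)"]

# The last term entered (hp) should agree between the two tables
all.equal(unname(term_p["hp"]), unname(t_p["hp"]))
```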

Notice that summary(fit) generates an object with all the information you need. The beta, se, t and p vectors are stored in it. Get the p-values by selecting the 4th column of the coefficients matrix (stored in the summary object):

summary(fit)$coefficients[,4] 
summary(fit)$r.squared

Try str(summary(fit)) to see all the info that this object contains.

Edit: I had misread Chase's answer which basically tells you how to get to what I give here.

You can see the structure of the object returned by summary() by calling str(summary(fit)). Each piece can be accessed using $. The p-value for the F statistic is more easily had from the object returned by anova.

Concisely, you can do this:

rSquared <- summary(fit)$r.squared
# First row of the anova table; this is the overall model p-value only with a single predictor
pVal <- anova(fit)$'Pr(>F)'[1]

I came across this question while exploring solutions to a similar problem. For future reference, it seems worthwhile to add an answer that uses the broom package.

Sample code

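# Two independent random walks (no seed is set, so results vary between runs)
x <- cumsum(c(0, runif(100, -1, +1)))
y <- cumsum(c(0, runif(100, -1, +1)))
fit <- lm(y ~ x)
library(broom)  # unlike require(), library() fails loudly if broom is missing
glance(fit)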

Results

>> glance(fit)
  r.squared adj.r.squared    sigma statistic    p.value df    logLik      AIC      BIC deviance df.residual
1 0.5442762     0.5396729 1.502943  118.2368 1.3719e-18  2 -183.4527 372.9055 380.7508 223.6251          99

Side notes

I find the glance function useful as it neatly summarises the key values. The results are stored as a data.frame (a tibble in recent versions of broom), which makes further manipulation easy:

>> class(glance(fit))
[1] "data.frame"
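Because the result is a one-row data frame, the individual values come out with $ like any other column. A sketch, assuming broom is installed (the snippet does nothing otherwise); the cars fit is only for illustration.

```r
# Run only if broom is available
if (requireNamespace("broom", quietly = TRUE)) {
  fit <- lm(dist ~ speed, data = cars)

  g <- broom::glance(fit)
  r_squared <- g$r.squared   # same value as summary(fit)$r.squared
  model_p   <- g$p.value     # p-value of the overall F test

  # tidy() gives the per-coefficient table, one row per term
  coef_table <- broom::tidy(fit)
  coef_table[, c("term", "p.value")]
}
```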

While both of the answers above are good, the procedure for extracting parts of objects is more general.

In many cases, functions return lists, and you can inspect the individual components using str(), which prints the components along with their names. You can then access them using the $ operator, e.g. myobject$componentname.

In the case of lm objects, there are a number of predefined methods one can use such as coef(), resid(), summary() etc, but you won't always be so lucky.
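The same approach works for objects that have few or no extractor functions. As a sketch, here it is applied to the result of t.test() on the built-in sleep data; the choice of example is mine.

```r
# t.test() returns a list-like object of class "htest"
tt <- t.test(extra ~ group, data = sleep)

class(tt)
names(tt)                # component names you can use with $
str(tt, max.level = 1)   # compact overview of each component

tt$p.value               # access by name with $
tt[["statistic"]]        # or with [[ ]], handy when the name is in a variable
```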
