Extracting standardized coefficients from lm in R

My apologies for the dumb question, but I can't seem to find a simple solution.

I want to extract the standardized coefficients from a fitted linear model in R. There must be a simple way or function that does that. Can you tell me what it is?

EDIT (following some of the comments below): I should probably have provided more context for my question. I was teaching an introductory R workshop for a group of psychologists. For them, a linear model without standardized coefficients is as if you hadn't run the model at all (OK, this is a bit of an exaggeration, but you get the point). When we ran some regressions, this was their first question, which (my bad) I didn't anticipate (I'm not a psychologist). Of course I can program this myself, and of course I can look for packages that do it for me. But at the same time, I think this is a basic and commonly required feature of linear models, so on the spot I thought there should be a basic function that does it without the need to install more and more packages (which beginners perceive as a difficulty). So I asked (and this was also an opportunity to show them how to get help when they need it).

My apologies to those who think I asked a stupid question, and many thanks to those who took the time to answer it.

asked Jun 19 '14 11:06

amit

2 Answers

There is a convenience function in the QuantPsyc package for that, called lm.beta. However, I think the easiest way is to just standardize your variables. The coefficients will then automatically be the standardized "beta"-coefficients (i.e. coefficients in terms of standard deviations).

For instance,

 lm(scale(your.y) ~ scale(your.x), data=your.Data)

will give you the standardized coefficient.
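As a cross-check on the scale() approach, here is a minimal sketch using the built-in women dataset. It assumes the usual textbook relation that a standardized slope equals the raw slope multiplied by sd(x)/sd(y). The object names (fit_raw, beta_manual, beta_scaled) are made up for illustration.

```r
# Raw (unstandardized) fit on the built-in women dataset
fit_raw <- lm(weight ~ height, data = women)
b_raw <- unname(coef(fit_raw)[2])

# Rescale the raw slope by the ratio of standard deviations
beta_manual <- b_raw * sd(women$height) / sd(women$weight)

# The same quantity obtained by standardizing both variables first
fit_scaled <- lm(scale(weight) ~ scale(height), data = women)
beta_scaled <- unname(coef(fit_scaled)[2])

# Both routes should agree up to numerical tolerance
all.equal(beta_manual, beta_scaled)
```

With a single predictor, this value should also match cor(women$height, women$weight), which is a quick sanity check.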

Are they really the same? The following illustrates that both are identical:

library("QuantPsyc")
mod <- lm(weight ~ height, data=women)
coef_lmbeta <- lm.beta(mod)
coef_lmbeta
#> height
#> 0.9955

mod2 <- lm(scale(weight) ~ scale(height), data=women)
coef_scale <- coef(mod2)[2]
coef_scale
#> scale(height)
#> 0.9955

all.equal(coef_lmbeta, coef_scale, check.attributes=F)
#> [1] TRUE

which shows that both are identical, as they should be.

How to avoid clumsy variable names? In case you don't want to deal with these clumsy variable names such as scale(height), one option is to standardize the variables outside the lm call in the dataset itself. For instance,

women2 <- lapply(women, scale) # standardizes all variables
mod3 <- lm(weight ~ height, data=women2)
coef_alt <- coef(mod3)[2]
coef_alt
#> height
#> 0.9955

all.equal(coef_lmbeta, coef_alt)
#> [1] TRUE

How do I standardize multiple variables conveniently? In the likely event that you don't want to standardize all variables in your dataset, you could pick out all that occur in your formula. For instance, referring to the mtcars-dataset now (since women only contains height and weight):

Say the following is the regression model I want to estimate:

 modelformula <- mpg ~ cyl + disp + hp + drat + qsec

We can use the fact that all.vars gives me a vector of the variable names.

all.vars(modelformula)
#> [1] "mpg"  "cyl"  "disp" "hp"   "drat" "qsec"

We can use this to subset the dataset accordingly. For instance,

mycars <- lapply(mtcars[, all.vars(modelformula)], scale)

will give me a list (which lm accepts as its data argument) in which all variables have been standardized. Linear regressions using mycars will now give standardized betas. Please make sure that standardizing all these variables makes sense, though!
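To see this with the full formula, here is a hedged sketch. It swaps in a slight variant of the line above, as.data.frame(scale(...)), so that the result is an ordinary data frame rather than a list. The names mycars_df, fit_std and beta_manual are illustrative only, and the cross-check assumes the standard relation that each standardized slope equals the raw slope times sd(x_j)/sd(y).

```r
modelformula <- mpg ~ cyl + disp + hp + drat + qsec
vars <- all.vars(modelformula)

# scale() on a numeric data frame returns a matrix; convert it back
mycars_df <- as.data.frame(scale(mtcars[, vars, drop = FALSE]))

# Fit on the standardized data; slopes are the standardized betas
fit_std <- lm(modelformula, data = mycars_df)
beta_std <- coef(fit_std)[-1]   # drop the intercept

# Cross-check: raw slopes rescaled by sd(x_j) / sd(y)
fit_raw <- lm(modelformula, data = mtcars)
predictors <- vars[-1]          # assumes the response comes first
beta_manual <- coef(fit_raw)[predictors] *
  sapply(mtcars[, predictors, drop = FALSE], sd) / sd(mtcars$mpg)

all.equal(unname(beta_std), unname(beta_manual))
```

Because every variable is centered, the intercept of fit_std should be zero up to floating-point error.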

Potential issue with only one variable: In case your model formula only contains one explanatory variable and you are working with the built-in data frames (and not with tibbles), the following adjustment is advisable (credits go to @JerryT in the comments):

mycars <- lapply(mtcars[, all.vars(modelformula), drop=F], scale)

This is because when you extract only one column from a standard data frame, R returns a vector instead of a data frame, and lapply would then loop over the individual values rather than over columns. drop=F will prevent this from happening. This also won't be a problem if e.g. tibbles are used. See e.g.

class(mtcars[, "mpg"])
#> [1] "numeric"
class(mtcars[, "mpg", drop=F])
#> [1] "data.frame"
library(tidyverse)
class(as_tibble(mtcars)[, "mpg"]) # as_tibble() replaces the deprecated as.tibble()
#> [1] "tbl_df"     "tbl"        "data.frame"

Another issue with missing values in the data frame (credits go again to @JerryT in the comments): By default, R's lm drops every row that has a missing value in any variable used in the model. scale, on the other hand, uses all non-missing values of each column, even if an observation has a missing value in a different column. If you want to mimic the action of lm, you may want to first drop all rows with missing values (ideally after subsetting to the variables in the model), like so:

all_complete <- complete.cases(df)
df[all_complete,]
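A minimal sketch of this on the built-in airquality dataset, which has missing values in Ozone and Solar.R. The formula and the object names (aq, aq_std, fit_naive, and so on) are chosen here purely for illustration, and the cross-check again assumes the relation beta_j = b_j * sd(x_j) / sd(y), with the standard deviations taken over the rows lm actually used.

```r
modelformula <- Ozone ~ Solar.R + Wind
vars <- all.vars(modelformula)

# Keep only the model variables, then drop incomplete rows
aq <- airquality[, vars, drop = FALSE]
aq_complete <- aq[complete.cases(aq), , drop = FALSE]

# Standardize on exactly the rows lm will use
aq_std <- as.data.frame(scale(aq_complete))
fit_std <- lm(modelformula, data = aq_std)
beta_std <- coef(fit_std)[-1]

# Cross-check against the raw fit, using sds from its model frame
fit_raw <- lm(modelformula, data = airquality)
mf <- model.frame(fit_raw)
beta_manual <- coef(fit_raw)[-1] *
  sapply(mf[, -1, drop = FALSE], sd) / sd(mf[[1]])
all.equal(unname(beta_std), unname(beta_manual))

# For contrast: scaling before dropping rows uses different means and sds,
# so these coefficients will generally differ from beta_std
fit_naive <- lm(modelformula, data = as.data.frame(scale(aq)))
coef(fit_naive)[-1]
```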

answered Nov 04 '22 06:11

coffeinjunky

Package lm.beta has several functions to work with standardised coefficients, including lm.beta() which requires an lm object:

library(lm.beta)
res <- lm(y~x)
lm.beta(res)

answered Nov 04 '22 07:11

luchonacho

Tags: r, beta, regression, lm, standardized
