Best way in R to pick which level is the base category for a factor in an lm regression

Tags: r, r-factor, lm

Suppose I want to run a regression using lm with a factor as a right-hand-side variable. What is the best way to choose which level of the factor is the base category (the one that is excluded to avoid multicollinearity)? Note that I am not interested in excluding the intercept, because I have many factors.

I would also like a formula-based solution, not one that acts on the data.frame directly, although if you think you have a really good solution for that, please post it as well.

My solution is:

base_cat <- function(x) c(x, setdiff(1:100, x))  # put level x first, then the remaining levels
a_reg <- lm(y ~ x1 + x2 + factor(x3, levels = base_cat(30)))  # suppose that x3 has draws from the integers 1 to 100

lm leaves out the first level of the factor, so this just reorders the levels so that the one specified in base_cat() comes first, followed by the rest.
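To check that the first level really is the one dropped, here is a minimal sketch on simulated data. The data frame d, the five-level x3 and the vector lv are all made up for illustration and are not part of my real problem.

```r
# Hypothetical simulated data; x3 takes integer values 1 to 5 for brevity.
set.seed(1)
d <- data.frame(y = rnorm(200), x3 = sample(1:5, 200, replace = TRUE))

# Put level 3 first so that lm treats it as the base category.
lv <- c(3, setdiff(1:5, 3))
fit <- lm(y ~ factor(x3, levels = lv), data = d)

# The base level (3) gets no coefficient of its own;
# the remaining levels are measured relative to it.
names(coef(fit))
```

The coefficient names should list every level except 3, which is absorbed into the intercept.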

Any other ideas?

386

asked Oct 19 '11 21:10

Xu Wang

1 Answer

The function relevel does precisely this. You pass it an unordered factor and the name of the reference level and it returns a factor with that level as the first one.
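A minimal sketch of how this can be used directly inside the formula, assuming x3 is stored as integers and therefore has to be wrapped in factor() first. The simulated data frame d and its columns are hypothetical, chosen only to mirror the question.

```r
# Hypothetical simulated data mirroring the question's variable names.
set.seed(1)
d <- data.frame(
  y  = rnorm(300),
  x1 = rnorm(300),
  x3 = sample(1:10, 300, replace = TRUE)
)

# relevel() expects a factor, so convert x3 inside the formula
# and name the level that should serve as the base category.
fit <- lm(y ~ x1 + relevel(factor(x3), ref = "3"), data = d)

# Level "3" is now the omitted reference level.
names(coef(fit))
```

Note that ref is given as the level's name (a character string), and that the data frame itself is left unchanged, which keeps the solution formula-based.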

190

answered Nov 04 '22 08:11

joran
